Mailing List Archive

Fixed-decimal types
While on the subject of RDBMS systems, a common need is to be able to
work with fixed-decimal data. I think a standard Python fixed-decimal
type would help to make Python database interfaces alot more robust.
I even wonder if the Python long type might be hijacked for this purpose
by adding a "scale" that indicates the number of digits to the right
of the decimal point. For example, an expression like:

1000000000.2500L

would create a fixed decimal number with a scale of 4.

People have built Python classes for fixed-decimal
types, but when working with RDBMS data, one often deals with
lots of data and efficiency matters. I also suspect that adding
scale to longs wouldn't be that hard and would be a fairly natural
extension.

In any case, a "standard" (being in the standard library would
be sufficient) fixed-decimal type would probably lead to better
database interfaces that (at least more) properly handled
fixed-decimal data.

Jim

--
Jim Fulton mailto:jim@digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.
Re: Fixed-decimal types [ In reply to ]
What would be scale of the product of two fixed-decimal numbers?
E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are
arguments for either. Same question for division (harder, I think).

I like the idea of using the dd.ddL notation for this.

I have no time to implement it but would not be unwilling to accept
patches. They would have to be accompanied with a wet signature, see
http://www.python.org/1.5/wetsign.html.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Fixed-decimal types [ In reply to ]
--- Guido van Rossum <guido@CNRI.Reston.VA.US> wrote:
> What would be scale of the product of two
> fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to
> 4.00L? There are
> arguments for either. Same question for division
> (harder, I think).
Most commonly one is trying to avoid rounding errors
when dealing with money - a few cents rounding error
tends to result in a few billable hours with the
accountants at the end of the year!

SQL dialects and type-safe languages would make you
specify the precision of the variable to be assigned,
so the issue does not arise for other languages.

For the work I do, simply taking the precision of the
most precise input (4.00L)would do the trick, but your
answer (4.0000L) is purer. We should provide a
rounding function, and in practice anyone using such a
function would round (or floor, or ceiling) to get to
the desired precision immediately.

I'm not sure on division either but I'm sure there are
precedents to look at.

On the subject of adding new types to the standard
library, what are the plans on dates and times? Would
a cut-down mxDateTime ever be considered? It is fully
Open Source (unlike mxODBC) and was designed for the
DBAPI.

Regards,

Andy

=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.

__________________________________________________
Do You Yahoo!?
Thousands of Stores. Millions of Products. All in one place.
Yahoo! Shopping: http://shopping.yahoo.com
Re: Date and timetypes (was: Fixed-decimal types) [ In reply to ]
> On the subject of adding new types to the standard
> library, what are the plans on dates and times? Would
> a cut-down mxDateTime ever be considered? It is fully
> Open Source (unlike mxODBC) and was designed for the
> DBAPI.

I don't know much about date/time types, or about mxDateTime.
My intuition is that there are too many ways to do it, and that being
compatible with commercial databases may not be the right way to do it
for core Python.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Fixed-decimal types [ In reply to ]
Guido van Rossum wrote:
>
> What would be scale of the product of two fixed-decimal numbers?
> E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are
> arguments for either. Same question for division (harder, I think).

I'd be inclined to start by doing some research to see if some standard
(SQL?) defines this somewhere. It would be nice if someone has already
done the requirements work for us. :)

> I like the idea of using the dd.ddL notation for this.
>
> I have no time to implement

Me neither.

> it but would not be unwilling to accept patches.

Cool. If no one else volunteers, then I'll try to find a way
to get this done (not necessarily by me). I think it is pretty
important.

> They would have to be accompanied with a wet signature, see
> http://www.python.org/1.5/wetsign.html.

Yup.

Jim

--
Jim Fulton mailto:jim@digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.
Re: Date and timetypes (was: Fixed-decimal types) [ In reply to ]
Sorry, should have replied to the list...

--- Andy Robinson <captainrobbo@yahoo.com> wrote:
> Date: Thu, 23 Dec 1999 08:37:18 -0800 (PST)
> From: Andy Robinson <captainrobbo@yahoo.com>
> Reply-to: andy@robanal.demon.co.uk
> Subject: Re: [Python-Dev] Date and timetypes (was:
> Fixed-decimal types)
> To: Guido van Rossum <guido@CNRI.Reston.VA.US>
>
> --- Guido van Rossum <guido@CNRI.Reston.VA.US>
> wrote:
> > I don't know much about date/time types, or about
> > mxDateTime.
> > My intuition is that there are too many ways to do
> > it, and that being
> > compatible with commercial databases may not be
> the
> > right way to do it
> > for core Python.
> >
>
> OK. Let me rephrase it. Say we form a consensus on
> 'the right way'. Are you amenable to some solution
> which goes back before 1970 and after 2038 going
> into
> the standard library?
>
> And does your answer change if it involves some
> compiled code as well?
>
> I mention mxDateTime because it was agreed by a
> Python
> SIG, is mature and stable, and I find it very
> useful.
> And the core type is pretty small - much of the
> helper
> stuff in the package now could be kept separate from
> the main Python distribution.
>
> - Andy
>
>
> =====
> Andy Robinson
> Robinson Analytics Ltd.
> ------------------
> My opinions are the official policy of Robinson
> Analytics Ltd.
> They just vary from day to day.
>
>
_________________________________________________________
> Do You Yahoo!?
> Get your free @yahoo.com address at
> http://mail.yahoo.com
>


=====
Andy Robinson
Robinson Analytics Ltd.
------------------
My opinions are the official policy of Robinson Analytics Ltd.
They just vary from day to day.

_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com
Re: Date and timetypes (was: Fixed-decimal types) [ In reply to ]
> > OK. Let me rephrase it. Say we form a consensus on 'the right
> > way'. Are you amenable to some solution which goes back before
> > 1970 and after 2038 going into the standard library?

No problem.

> > And does your answer change if it involves some
> > compiled code as well?

I'd rather not.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Fixed-decimal types [ In reply to ]
Jim Fulton wrote:
>
> Guido van Rossum wrote:
> >
> > What would be scale of the product of two fixed-decimal numbers?
> > E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are
> > arguments for either. Same question for division (harder, I think).
>
> I'd be inclined to start by doing some research to see if some standard
> (SQL?) defines this somewhere. It would be nice if someone has already
> done the requirements work for us. :)

Here is what the book "SQL-99 Complete, Really" says that the SQL
standard says:

- for addition and subtraction of two "exact" (fixed-decimal)
numbers, the result has the maximum of the scales.

- for multiplication of two "exact" (fixed-decimal)
numbers, the result has the sum of the scales.

- punts on division

- for addition, subtraction, multiplication or division
between "exact" (fixed point) and "approximate" (floating point)
yields an approximate result. This means that fixed-decimal
coerces to float.

I'm curious to see who else chips in with examples from other systems.

Jim

--
Jim Fulton mailto:jim@digicool.com Python Powered!
Technical Director (888) 344-4332 http://www.python.org
Digital Creations http://www.digicool.com http://www.zope.org

Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email
address may not be added to any commercial mail list with out my
permission. Violation of my privacy with advertising or SPAM will
result in a suit for a MINIMUM of $500 damages/incident, $1500 for
repeats.
Re: Fixed-decimal types [ In reply to ]
Jim Fulton wrote:

> - for addition and subtraction of two "exact" (fixed-decimal)
> numbers, the result has the maximum of the scales.

One could argue that this is incorrect: if "3.1" means that I know the
value to one decimal of precision, and "2.01" means that I know that
value to two decimals of precision, stating the result of their sum as
"5.11" suggests that I know the result to two decimals of precision,
which is of course false: because I only knew one decimal of precision
for one of the operands, I only know (at most!) one decimal of
precision for the result.

Not arguing for this interpretation, just indicating that doing fixed
precision arithmetic right is hard. I'm waiting for Tim Peters'
contribution, but he's on vacation so it may be a while.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Date and timetypes (was: Fixed-decimal types) [ In reply to ]
Guido van Rossum wrote:
>
> > > OK. Let me rephrase it. Say we form a consensus on 'the right
> > > way'. Are you amenable to some solution which goes back before
> > > 1970 and after 2038 going into the standard library?
>
> No problem.
>
> > > And does your answer change if it involves some
> > > compiled code as well?
>
> I'd rather not.

As far as mxDateTime goes, I'd rather not see it in the core
distribution. Including the mx stuff in a separate PythonPowerTools
distribution would be cool though. For a start in this direction
see e.g.:

http://startship.skyport.net/~lemburg/PPowerTools-0.2.zip

Note that I'll wrap all my mx extensions into a new mx package
which will come in several flavours next year. There will no
longer be separate packages due to the various naming
collisions and to enable intra-mx-package dependencies.

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 7 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
RE: Fixed-decimal types [ In reply to ]
[Guido]
> ...
> Not arguing for this interpretation, just indicating that doing
> fixed precision arithmetic right is hard.

It's not so much hard as it is arbitrary. The floating-point world is
standardized now, but the fixed-point world remains a mish-mash of
incompatible legacy schemes carried across generations of products for no
reason other than product-specific compatibility. So despite that
fixed-point has a specialty audience, whatever rules Python chooses will
leave it incompatible with much of that audience's (mixed!) expectations.

If fixed-point is needed, and my FixedPoint.py isn't good enough (all other
fixed point pkgs I've seen for Python were braindead), then it should be
implemented such that developers can control both rounding and precision
propagation. I'll attach suitable kernels; they haven't been tested but any
bugs discovered will be trivial to fix (there are no difficulties here, but
typos are likely); the kernels supply the bulk of what's required, whether
implemented in Python or C; various packages can wrap them to supply
whatever policies they like; see FixedPoint.py for exact string<->FixedPoint
and exact float->FixedPoint conversions; and that's the end of my
involvement in fixed-point <wink>.

Python should certainly *not* add a "scale factor" to its current long
implementation; fixed-point should be a distinct type, as scale-factor
fiddling is clumsy and pervasive (long arithmetic is challenging enough to
get correct and quick without this obfuscating distraction; and by leaving
scale factors out of it, it's much easier to plug in alternative bigint
implementations (like GMP)).

One other point: some people are going to want BCD (binary-coded decimal),
which suffers the same mish-mash of legacy policies, but with a different
data representation. The point is that many commercial applications spend
much more time doing I/O conversions than arithmetic, and BCD accepts slow
arithmetic (in the absence of special HW support) in return for fast scaling
& I/O conversion.

Forgetting the database-heads for a moment, decimal *floating*-point is what
calculators do, so that's what "real people" are most comfortable with. The
IEEE-854 std (IEEE-754's younger and friendlier brother) specifies that
completely. Add a means to boost "global" precision (a la REXX), and it's a
powerful tool even for experts (benefits approximating those of unbounded
rational arithmetic but with bounded & user-controllable expense).

can-never-have-too-many-numeric-types-but-always-have-
too-few-literal-notations-ly y'rs - tim


# Kernels for fixed-point decimal arithmetic.

# _add, _sub, _mul, _div all have arglist
# n1, p1, n2, p2, p, round=DEFAULT_ROUND
# n1 and n2 are longs; p1, p2 and p ints >= 0.
# The inputs are exactly n1/10**p1 and n2/10**p2.
#
# The return value is the integer n such that n/10**p is the best
# approximation to the infinite-precision result. In other words, p1
# and p2 are the input precisions and p is the desired output
# precision, where precision is the # of digits *after* the decimal
# point.
#
# What "best approximation" means is determined by the round function.
# In many cases rounding isn't required, but when it is
# round(top, bot)
# is returned. top and bot are longs, with bot > 0 guaranteed. The
# infinite-precision result is top/bot. round must return an integer
# (long) approximation to top/bot, using whichever rounding discipline
# you want. By default, IEEE round-to-nearest/even is used; see the
# _roundXXX functions for examples of suitable rounding functions.
#
# Note: The only code here that knows we're working in decimal is
# function _tento; simply change the "10L" in that to do fixed-point
# arithmetic in some other base.
#
# Example:
#
# >>> r7 = _div(1L, 0, 7L, 0, 20) # 1/7
# >>> r7
# 14285714285714285714L
# >>> r5 = _div(1L, 0, 5L, 0, 20) # 1/5
# >>> r5
# 20000000000000000000L
# >>> sum = _add(r7, 20, r5, 20, 20) # 1/7 + 1/5 = 12/35
# >>> sum
# 34285714285714285714L
# >>> _mul(sum, 20, 35L, 0, 20)
# 1199999999999999999990L
# >>> _mul(sum, 20, 35L, 0, 18)
# 12000000000000000000L
# >>> _mul(sum, 20, 35L, 0, 0)
# 12L
# >>>

###################################################################
# Sample rounding functions.
###################################################################

# Round to minus infinity.

def _roundminf(top, bot):
assert bot > 0
return top / bot

# Round to plus infinity.

def _roundpinf(top, bot):
assert bot > 0
q, r = divmod(top, bot)
# answer is exactly q + r/bot; and 0 <= r < bot
if r:
q = q + 1
return q

# IEEE nearest/even rounding (closest integer; in case of tie closest
# even integer).

def _roundne(top, bot):
assert bot > 0
q, r = divmod(top, bot)
# answer is exactly q + r/bot; and 0 <= r < bot
c = cmp(r << 1, bot)
# c < 0 <-> r < bot/2, etc
if c > 0 or (c == 0 and (q & 1) == 1):
q = q + 1
return q

# "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder
# >= half away from 0).

def _roundhalf(top, bot):
assert bot > 0
q, r = divmod(top, bot)
# answer is exactly q + r/bot; and 0 <= r < bot
c = cmp(r << 1, bot)
# c < 0 <-> r < bot/2, etc
if c > 0 or (c == 0 and q >= 0):
q = q + 1
return q

# Round toward 0 (throw away remainder).

def _roundchop(top, bot):
assert bot > 0
q, r = divmod(top, bot)
# answer is exactly q + r/bot; and 0 <= r < bot
if r and q < 0:
q = q + 1
return q

###################################################################
# Kernels for + - * /.
###################################################################

DEFAULT_ROUND = _roundne

def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
assert p1 >= 0
assert p2 >= 0
assert p >= 0
# (n1/10**p1 + n2/10**p2) * 10**p ==
# (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p
max = p1 # until proven otherwise
if p1 < p2:
n1 = n1 * _tento(p2 - p1)
max = p2
elif p2 < p1:
n2 = n2 * _tento(p1 - p2)
n3 = n1 + n2
p3 = p - max
if p3 > 0:
n3 = n3 * _tento(p3)
elif p3 < 0:
n3 = round(n3, _tento(-p3))
return n3

def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
assert p1 >= 0
assert p2 >= 0
assert p >= 0
return _add(n1, p1, -n2, p2, p, round)

def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
assert p1 >= 0
assert p2 >= 0
assert p >= 0
# (n1/10**p1 * n2/10**p2) * 10**p ==
# (n1*n2)/10**(p1+p2) * 10**p
n3 = n1 * n2
p3 = p - p1 - p2
if p3 > 0:
n3 = n3 * _tento(p3)
elif p3 < 0:
n3 = round(n3, _tento(-p3))
return n3

def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND):
assert p1 >= 0
assert p2 >= 0
assert p >= 0
if n2 == 0:
raise ZeroDivisionError("scaled integer")
# (n1/10**p1 / n2/10**p2) * 10**p ==
# (n1/n2) * 10**(p2-p1+p)
p3 = p2 - p1 + p
if p3 > 0:
n1 = n1 * _tento(p3)
elif p3 < 0:
n2 = n2 * _tento(-p3)
if n2 < 0:
n1 = -n1
n2 = -n2
return round(n1, n2)

def _tento(i, _cache={}):
assert i >= 0
try:
return _cache[i]
except KeyError:
answer = _cache[i] = 10L ** i
return answer