Mailing List Archive

on floating-point numbers
Just sharing a case of floating-point numbers. Nothing needed to be
solved or to be figured out. Just bringing up conversation.

(*) An introduction to me

I don't understand floating-point numbers from the inside out, but I do
know how to work with base 2 and scientific notation. So the idea of
expressing a number as

mantissa * base^{power}

is not foreign to me. (If that helps you to perhaps instruct me on
what's going on here.)

(*) A presentation of the behavior

>>> import sys
>>> sys.version
'3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'

>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
39.599999999999994

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
39.60000000000001

All I did was to take the first number, 7.23, and move it to the last
position in the list. (So we have a violation of the commutativity of
addition.)

Let me try to reduce the example. It's not so easy. Although I could
display the violation of commutativity by moving just a single number in
the list, I also see that 7.23 commutes with every other number in the
list.

(*) My request

I would just like to get some clarity. I guess I need to translate all
these numbers into base 2 and perform the addition myself to see how the
situation comes about?
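
Maybe useful context for what I mean: if I read the docs right, the
standard library can already show me the stored value exactly (Decimal
of a float, and float.hex), so something like this sketch should save
the hand conversion:

from decimal import Decimal

for x in [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]:
    # Decimal(x) prints the exact decimal value of the stored double;
    # x.hex() prints the same value as a hexadecimal mantissa and exponent.
    print(x, Decimal(x), x.hex())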
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
>
>>>> import sys
>>>> sys.version
> '3.8.10 (tags/...
>
>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
> 39.599999999999994
>
>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> sum(ls)
> 39.60000000000001


Welcome to the exciting world of roundoff error:

Python 3.5.3 (default, Jul 9 2020, 13:00:10)
[GCC 6.3.0 20170516] on linux

>>> 0.1 + 0.2 + 9.3 == 0.1 + 9.3 + 0.2
False
>>>


--
To email me, substitute nowhere->runbox, invalid->com.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Am 02.09.21 um 15:51 schrieb Hope Rouselle:
> Just sharing a case of floating-point numbers. Nothing needed to be
> solved or to be figured out. Just bringing up conversation.
>
> (*) An introduction to me
>
> I don't understand floating-point numbers from the inside out, but I do
> know how to work with base 2 and scientific notation. So the idea of
> expressing a number as
>
> mantissa * base^{power}
>
> is not foreign to me. (If that helps you to perhaps instruct me on
> what's going on here.)
>
> (*) A presentation of the behavior
>
>>>> import sys
>>>> sys.version
> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'
>
>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
> 39.599999999999994
>
>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> sum(ls)
> 39.60000000000001
>
> All I did was to take the first number, 7.23, and move it to the last
> position in the list. (So we have a violation of the commutativity of
> addition.)

I believe it is not commutativity, but associativity, that is violated.
Even for floating point, a+b = b+a, except maybe for some extreme cases
like denormalized numbers etc.

But in general (a+b)+c != a+ (b+c)

Consider decimal floating point with 2 significant digits.
a=1
b=c=0.04

Then you get for the LHS (each intermediate result is rounded back to 2
significant digits, so 1.04 becomes 1.0):

(1 + 0.04) + 0.04 = 1.0 + 0.04 = 1.0

and for the RHS:

1 + (0.04 + 0.04) = 1 + 0.08 = 1.1
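
A quick check with the numbers from your post makes the distinction
concrete (just a sketch in plain Python):

ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]

# Every individual pair of floats commutes:
print(all(a + b == b + a for a in ls for b in ls))   # True

# But a left-to-right running sum depends on the order, i.e. on how the
# intermediate roundings group (these are the two sums from your post):
print(sum(ls) == sum(ls[1:] + ls[:1]))               # False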


Your sum is evaluated like (((a + b) + c) + ....) and hence, if you
permute the numbers, it can be unequal. If you need better accuracy,
there is the Kahan summation algorithm and other alternatives:
https://en.wikipedia.org/wiki/Kahan_summation_algorithm
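
A bare-bones sketch of the compensated (Kahan) loop, only to show the
idea -- not tuned, and not the only variant described on that page:

def kahan_sum(values):
    total = 0.0
    c = 0.0                    # running compensation for lost low-order bits
    for x in values:
        y = x - c              # apply the correction from the previous step
        t = total + y          # low-order bits of y may be lost here ...
        c = (t - total) - y    # ... and are recovered into c
        total = t
    return total

ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
# far less sensitive to the order of the data than a plain running sum:
print(kahan_sum(ls), kahan_sum(ls[1:] + ls[:1]))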


Christian
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Am 02.09.21 um 16:49 schrieb Julio Di Egidio:
> On Thursday, 2 September 2021 at 16:41:38 UTC+2, Peter Pearson wrote:
>> On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
>
>>> 39.60000000000001
>>
>> Welcome to the exciting world of roundoff error:
>
> Welcome to the exiting world of Usenet.
>
> *Plonk*

Pretty harsh, isn't it? He gave a concise example of the same inaccuracy
right afterwards.

Christian
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Hope Rouselle <hrouselle@jevedi.com> writes:

> Just sharing a case of floating-point numbers. Nothing needed to be
> solved or to be figured out. Just bringing up conversation.
>
> (*) An introduction to me
>
> I don't understand floating-point numbers from the inside out, but I do
> know how to work with base 2 and scientific notation. So the idea of
> expressing a number as
>
> mantissa * base^{power}
>
> is not foreign to me. (If that helps you to perhaps instruct me on
> what's going on here.)
>
> (*) A presentation of the behavior
>
>>>> import sys
>>>> sys.version
> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> bit (AMD64)]'
>
>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
> 39.599999999999994
>
>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> sum(ls)
> 39.60000000000001
>
> All I did was to take the first number, 7.23, and move it to the last
> position in the list. (So we have a violation of the commutativity of
> addition.)

Suppose these numbers are prices in dollar, never going beyond cents.
Would it be safe to multiply each one of them by 100 and therefore work
with cents only? For instance

--8<---------------cut here---------------start------------->8---
>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(map(lambda x: int(x*100), ls)) / 100
39.6

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(map(lambda x: int(x*100), ls)) / 100
39.6
--8<---------------cut here---------------end--------------->8---

Or multiplication by 100 isn't quite ``safe'' to do with floating-point
numbers either? (It worked in this case.)

I suppose that if I multiply it by a power of two, that would be an
operation that I can be sure will not bring about any precision loss
with floating-point numbers. Do you agree?
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, Sep 3, 2021 at 4:29 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
>
> Just sharing a case of floating-point numbers. Nothing needed to be
> solved or to be figured out. Just bringing up conversation.
>
> (*) An introduction to me
>
> I don't understand floating-point numbers from the inside out, but I do
> know how to work with base 2 and scientific notation. So the idea of
> expressing a number as
>
> mantissa * base^{power}
>
> is not foreign to me. (If that helps you to perhaps instruct me on
> what's going on here.)
>
> (*) A presentation of the behavior
>
> >>> import sys
> >>> sys.version
> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'
>
> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>> sum(ls)
> 39.599999999999994
>
> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>> sum(ls)
> 39.60000000000001
>
> All I did was to take the first number, 7.23, and move it to the last
> position in the list. (So we have a violation of the commutativity of
> addition.)
>

It's not about the commutativity of any particular pair of operands -
that's always guaranteed. What you're seeing here is the results of
intermediate rounding. Try this:

>>> def sum(stuff):
... total = 0
... for thing in stuff:
... total += thing
... print(thing, "-->", total)
... return total
...
>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
7.23 --> 7.23
8.41 --> 15.64
6.15 --> 21.79
2.31 --> 24.099999999999998
7.73 --> 31.83
7.77 --> 39.599999999999994
39.599999999999994
>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
8.41 --> 8.41
6.15 --> 14.56
2.31 --> 16.87
7.73 --> 24.6
7.77 --> 32.370000000000005
7.23 --> 39.60000000000001
39.60000000000001
>>>

Nearly all floating-point confusion stems from an assumption that the
input values are exact. They usually aren't. Consider:

>>> from fractions import Fraction
>>> for n in ls: print(n, Fraction(*n.as_integer_ratio()))
...
8.41 2367204554136617/281474976710656
6.15 3462142213541069/562949953421312
2.31 5201657569612923/2251799813685248
7.73 2175801569973371/281474976710656
7.77 2187060569041797/281474976710656
7.23 2035064081618043/281474976710656

Those are the ACTUAL values you're adding. Do the same exercise with
the partial sums, and see where the rounding happens. It's probably
happening several times, in fact.
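
For instance, something along these lines will show it (just a sketch;
Fraction keeps a running sum that is mathematically exact, so the drift
of the float total is visible at every step):

from fractions import Fraction

ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
exact = Fraction(0)   # exact running sum of the actual stored values
total = 0.0           # float running sum, rounded after every addition
for x in ls:
    exact += Fraction(*x.as_integer_ratio())
    total += x
    error = Fraction(*total.as_integer_ratio()) - exact
    print(x, total, float(error))   # error shows where rounding crept in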

The naive summation algorithm used by sum() is compatible with a
variety of different data types - even lists, although it's documented
as being intended for numbers - but if you know for sure that you're
working with floats, there's a more accurate algorithm available to
you.

>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
39.6
>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
39.6

It seeks to minimize loss to repeated rounding and is, I believe,
independent of data order.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
>
> Hope Rouselle <hrouselle@jevedi.com> writes:
>
> > Just sharing a case of floating-point numbers. Nothing needed to be
> > solved or to be figured out. Just bringing up conversation.
> >
> > (*) An introduction to me
> >
> > I don't understand floating-point numbers from the inside out, but I do
> > know how to work with base 2 and scientific notation. So the idea of
> > expressing a number as
> >
> > mantissa * base^{power}
> >
> > is not foreign to me. (If that helps you to perhaps instruct me on
> > what's going on here.)
> >
> > (*) A presentation of the behavior
> >
> >>>> import sys
> >>>> sys.version
> > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> > bit (AMD64)]'
> >
> >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>>> sum(ls)
> > 39.599999999999994
> >
> >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>>> sum(ls)
> > 39.60000000000001
> >
> > All I did was to take the first number, 7.23, and move it to the last
> > position in the list. (So we have a violation of the commutativity of
> > addition.)
>
> Suppose these numbers are prices in dollar, never going beyond cents.
> Would it be safe to multiply each one of them by 100 and therefore work
> with cents only? For instance

Yes and no. It absolutely *is* safe to always work with cents, but to
do that, you have to be consistent: ALWAYS work with cents, never with
floating point dollars.

(Or whatever other unit you choose to use. Most currencies have a
smallest-normally-used-unit, with other currency units (where present)
being whole number multiples of that minimal unit. Only in forex do
you need to concern yourself with fractional cents or fractional yen.)

But multiplying a set of floats by 100 won't necessarily solve your
problem; you may have already fallen victim to the flaw of assuming
that the numbers are represented accurately.

> --8<---------------cut here---------------start------------->8---
> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>> sum(map(lambda x: int(x*100), ls)) / 100
> 39.6
>
> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>> sum(map(lambda x: int(x*100), ls)) / 100
> 39.6
> --8<---------------cut here---------------end--------------->8---
>
> Or multiplication by 100 isn't quite ``safe'' to do with floating-point
> numbers either? (It worked in this case.)

You're multiplying and then truncating, which risks a round-down
error. Try adding a half onto them first:

int(x * 100 + 0.5)

But that's still not a perfect guarantee. Far safer would be to
consider monetary values to be a different type of value, not just a
raw number. For instance, the value $7.23 could be stored internally
as the integer 723, but you also know that it's a value in USD, not a
simple scalar. It makes perfect sense to add USD+USD, it makes perfect
sense to multiply USD*scalar, but it doesn't make sense to multiply
USD*USD.
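
A minimal sketch of that idea -- a hypothetical USD type holding integer
cents; the names and details are made up, it's only to show the shape:

from dataclasses import dataclass

@dataclass(frozen=True)
class USD:
    cents: int   # whole cents, so nothing is ever stored as a binary fraction

    @classmethod
    def from_string(cls, text):
        # parse "7.23" without ever going through a float
        dollars, _, cents = text.partition(".")
        return cls(int(dollars) * 100 + int(cents.ljust(2, "0")[:2]))

    def __add__(self, other):      # USD + USD makes sense
        return USD(self.cents + other.cents)

    def __mul__(self, scalar):     # USD * scalar makes sense; USD * USD doesn't
        return USD(round(self.cents * scalar))

    def __str__(self):
        return "${}.{:02d}".format(self.cents // 100, self.cents % 100)

total = USD(0)
for s in ["7.23", "8.41", "6.15", "2.31", "7.73", "7.77"]:
    total = total + USD.from_string(s)
print(total)   # $39.60, in either order

The point is just that the arithmetic happens on integers; how to round
USD * scalar then becomes an explicit policy decision instead of an
accident of the representation.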

> I suppose that if I multiply it by a power of two, that would be an
> operation that I can be sure will not bring about any precision loss
> with floating-point numbers. Do you agree?

Assuming you're nowhere near 2**53, yes, that would be safe. But so
would multiplying by a power of five. The problem isn't precision loss
from the multiplication - the problem is that your input numbers
aren't what you think they are. That number 7.23, for instance, is
really....

>>> 7.23.as_integer_ratio()
(2035064081618043, 281474976710656)

... the rational number 2035064081618043 / 281474976710656, which is
very close to 7.23, but not exactly so. (The numerator would have to
be ...8042.88 to be exactly correct.) There is nothing you can do at
this point to regain the precision, although a bit of multiplication
and rounding can cheat it and make it appear as if you did.

Floating point is a very useful approximation to real numbers, but
real numbers aren't the best way to represent financial data. Integers
are.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle <hrouselle@jevedi.com>
declaimed the following:


>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
>39.599999999999994
>
>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> sum(ls)
>39.60000000000001
>
>All I did was to take the first number, 7.23, and move it to the last
>position in the list. (So we have a violation of the commutativity of
>addition.)
>

https://www.amazon.com/Real-Computing-Made-Engineering-Calculations-dp-B01K0Q03AA/dp/B01K0Q03AA/ref=mt_other?_encoding=UTF8&me=&qid=


--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, 02 Sep 2021 12:08:21 -0300, Hope Rouselle <hrouselle@jevedi.com>
declaimed the following:


>Suppose these numbers are prices in dollar, never going beyond cents.
>Would it be safe to multiply each one of them by 100 and therefore work
>with cents only? For instance
>

A lot of software with a "monetary" data type uses scaled INTEGERS for
that... M$ Excel uses four decimal places, internally scaled.

The Ada language has both FIXED and FLOAT data types; for FIXED one
specifies the delta between adjacent values that must be met (the compiler
is free to use something with more resolution internally).

Money should never be treated as a floating value.

>I suppose that if I multiply it by a power of two, that would be an
>operation that I can be sure will not bring about any precision loss
>with floating-point numbers. Do you agree?

Are we talking IEEE floats? Or one of the ancient formats used on
computers that may not have had hardware floating-point units, or that
predate the IEEE standard?

Normalized with a suppressed leading bit? (If normalization always puts
the most significant bit at the binary point, why store that bit? Shift
it out and gain another bit at the small end.)

Xerox Sigma floats used an exponent based on radix 16. A normalized
mantissa could have up to three leading 0 bits.

Motorola Fast Floating Point (software float implementation used on
base Amiga systems -- the exponent was in the low byte)


--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <rosuav@gmail.com>
declaimed the following:

>
>The naive summation algorithm used by sum() is compatible with a
>variety of different data types - even lists, although it's documented
>as being intended for numbers - but if you know for sure that you're
>working with floats, there's a more accurate algorithm available to
>you.
>
>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
>39.6
>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
>39.6
>
>It seeks to minimize loss to repeated rounding and is, I believe,
>independent of data order.
>

Most likely it sorts the data so the smallest values get summed first,
and works its way up to the larger values. That way it minimizes the losses
that occur when denormalizing a value (to set the exponent equal to that of
the next larger value).


--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2021-09-02, Hope Rouselle <hrouselle@jevedi.com> wrote:

> Suppose these numbers are prices in dollar, never going beyond cents.
> Would it be safe to multiply each one of them by 100 and therefore work
> with cents only?

The _practical_ answer is that no, it's not safe to use floating point
when doing normal bookkeeping-type stuff with money. At least not if
you want everything to balance correctly at the end of the day (week,
month, quarter, year, etc.). Use integer cents, or mills or
whatever. If you have to use floating point to calculate a payment or
credit/debit amount, always round or truncate the result back to an
integer value in your chosen units before actually using that amount
for anything.

In theory, decimal floating point should be usable, but I've never
personally worked with it. Back in the day (1980's) microcomputers
didn't have floating point hardware, and many compilers allowed you to
choose between base-2 floating point and base-10 (BCD) floating
point. The idea was that if you were doing financial stuff, you could
use BCD floating point.

--
Grant







--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
>
> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <rosuav@gmail.com>
> declaimed the following:
>
> >
> >The naive summation algorithm used by sum() is compatible with a
> >variety of different data types - even lists, although it's documented
> >as being intended for numbers - but if you know for sure that you're
> >working with floats, there's a more accurate algorithm available to
> >you.
> >
> >>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
> >39.6
> >>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
> >39.6
> >
> >It seeks to minimize loss to repeated rounding and is, I believe,
> >independent of data order.
> >
>
> Most likely it sorts the data so the smallest values get summed first,
> and works its way up to the larger values. That way it minimizes the losses
> that occur when denormalizing a value (to set the exponent equal to that of
> the next larger value).
>

I'm not sure, but that sounds familiar. It doesn't really matter
though - the docs just say that it is an "accurate floating point
sum", so the precise algorithm is an implementation detail.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Op 2/09/2021 om 17:08 schreef Hope Rouselle:
> >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>>> sum(ls)
> > 39.599999999999994
> >
> >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>>> sum(ls)
> > 39.60000000000001
> >
> > All I did was to take the first number, 7.23, and move it to the last
> > position in the list. (So we have a violation of the commutativity of
> > addition.)
>
> Suppose these numbers are prices in dollar, never going beyond cents.
> Would it be safe to multiply each one of them by 100 and therefore work
> with cents only?
For working with monetary values, or any value that needs an accurate
correspondence to decimal (base-10) values, it's best to use Python's
Decimal; see the documentation: https://docs.python.org/3.8/library/decimal.html

Example:

from decimal import Decimal as D
ls1 = [D('7.23'), D('8.41'), D('6.15'), D('2.31'), D('7.73'), D('7.77')]
ls2 = [D('8.41'), D('6.15'), D('2.31'), D('7.73'), D('7.77'), D('7.23')]
print(sum(ls1), sum(ls2))

Output:
39.60 39.60

(Note that I initialized the values with strings instead of numbers, to
give Decimal access to the exact number, without it first being
converted to a float that doesn't necessarily correspond exactly to the
decimal value.)
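
If you also need to round a computed result back to whole cents (after
applying a tax rate, say), Decimal lets you state that explicitly. A
small sketch, with a made-up 8.25% rate:

from decimal import Decimal, ROUND_HALF_UP

price = Decimal('7.23')
taxed = (price * Decimal('1.0825')).quantize(Decimal('0.01'),
                                             rounding=ROUND_HALF_UP)
print(taxed)   # 7.83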

--
"Your scientists were so preoccupied with whether they could, they didn't
stop to think if they should"
-- Dr. Ian Malcolm

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Am 02.09.21 um 21:02 schrieb Julio Di Egidio:
> On Thursday, 2 September 2021 at 20:43:36 UTC+2, Chris Angelico wrote:
>> On Fri, Sep 3, 2021 at 4:29 AM Hope Rouselle <hrou...@jevedi.com> wrote:
>
>>> All I did was to take the first number, 7.23, and move it to the last
>>> position in the list. (So we have a violation of the commutativity of
>>> addition.)
>>>
>> It's not about the commutativity of any particular pair of operands -
>> that's always guaranteed.
>
> Nope, that is rather *not* guaranteed, as I have quite explained up thread.
>

No, you haven't explained that. You linked to the famous Goldberg paper.
Where in the paper does it say that operations on floats are not
commutative?

I'd be surprised, because that would be wrong in general.
Unless you have special numbers like NaN or signed zeros etc., a+b = b+a
and a*b = b*a hold for floats as well.

Christian
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Il 03/09/2021 09:07, Julio Di Egidio ha scritto:
> On Friday, 3 September 2021 at 01:22:28 UTC+2, Chris Angelico wrote:
>> On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
>>> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <ros...@gmail.com>
>>> declaimed the following:
>>>
>>>> The naive summation algorithm used by sum() is compatible with a
>>>> variety of different data types - even lists, although it's documented
>>>> as being intended for numbers - but if you know for sure that you're
>>>> working with floats, there's a more accurate algorithm available to
>>>> you.
>>>>
>>>>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
>>>> 39.6
>>>>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
>>>> 39.6
>>>>
>>>> It seeks to minimize loss to repeated rounding and is, I believe,
>>>> independent of data order.
>>>
>>> Most likely it sorts the data so the smallest values get summed first,
>>> and works its way up to the larger values. That way it minimizes the losses
>>> that occur when denormalizing a value (to set the exponent equal to that of
>>> the next larger value).
>>>
>> I'm not sure, but that sounds familiar. It doesn't really matter
>> though - the docs just say that it is an "accurate floating point
>> sum", so the precise algorithm is an implementation detail.
>
> The docs are quite misleading there, it is not accurate without further qualifications.
>
> <https://docs.python.org/3.8/library/math.html#math.fsum>
> <https://code.activestate.com/recipes/393090/>
>
> That said, fucking pathetic, when Dunning-Kruger is a compliment...
>
> *Plonk*
>
> Julio
>

https://en.wikipedia.org/wiki/IEEE_754
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, Sep 3, 2021 at 10:42 PM jak <nospam@please.ty> wrote:
>
> Il 03/09/2021 09:07, Julio Di Egidio ha scritto:
> > On Friday, 3 September 2021 at 01:22:28 UTC+2, Chris Angelico wrote:
> >> On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
> >>> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <ros...@gmail.com>
> >>> declaimed the following:
> >>>
> >>>> The naive summation algorithm used by sum() is compatible with a
> >>>> variety of different data types - even lists, although it's documented
> >>>> as being intended for numbers - but if you know for sure that you're
> >>>> working with floats, there's a more accurate algorithm available to
> >>>> you.
> >>>>
> >>>>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
> >>>> 39.6
> >>>>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
> >>>> 39.6
> >>>>
> >>>> It seeks to minimize loss to repeated rounding and is, I believe,
> >>>> independent of data order.
> >>>
> >>> Most likely it sorts the data so the smallest values get summed first,
> >>> and works its way up to the larger values. That way it minimizes the losses
> >>> that occur when denormalizing a value (to set the exponent equal to that of
> >>> the next larger value).
> >>>
> >> I'm not sure, but that sounds familiar. It doesn't really matter
> >> though - the docs just say that it is an "accurate floating point
> >> sum", so the precise algorithm is an implementation detail.
> >
> > The docs are quite misleading there, it is not accurate without further qualifications.
> >
> > <https://docs.python.org/3.8/library/math.html#math.fsum>
> > <https://code.activestate.com/recipes/393090/>
> >
>
> https://en.wikipedia.org/wiki/IEEE_754

I believe the definition of "accurate" here is that, if you take all
of the real numbers represented by those floats, add them all together
with mathematical accuracy, and then take the nearest representable
float, that will be the exact value that fsum will return. In other
words, its accuracy is exactly as good as the final result can be.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, 3 Sept 2021 at 13:48, Chris Angelico <rosuav@gmail.com> wrote:
>
> On Fri, Sep 3, 2021 at 10:42 PM jak <nospam@please.ty> wrote:
> >
> > Il 03/09/2021 09:07, Julio Di Egidio ha scritto:
> > > On Friday, 3 September 2021 at 01:22:28 UTC+2, Chris Angelico wrote:
> > >> On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
> > >>> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <ros...@gmail.com>
> > >>> declaimed the following:
> > >>>
> > >>>> The naive summation algorithm used by sum() is compatible with a
> > >>>> variety of different data types - even lists, although it's documented
> > >>>> as being intended for numbers - but if you know for sure that you're
> > >>>> working with floats, there's a more accurate algorithm available to
> > >>>> you.
> > >>>>
> > >>>>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
> > >>>> 39.6
> > >>>>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
> > >>>> 39.6
> > >>>>
> > >>>> It seeks to minimize loss to repeated rounding and is, I believe,
> > >>>> independent of data order.
> > >>>
> > >>> Most likely it sorts the data so the smallest values get summed first,
> > >>> and works its way up to the larger values. That way it minimizes the losses
> > >>> that occur when denormalizing a value (to set the exponent equal to that of
> > >>> the next larger value).
> > >>>
> > >> I'm not sure, but that sounds familiar. It doesn't really matter
> > >> though - the docs just say that it is an "accurate floating point
> > >> sum", so the precise algorithm is an implementation detail.
> > >
> > > The docs are quite misleading there, it is not accurate without further qualifications.
> > >
> > > <https://docs.python.org/3.8/library/math.html#math.fsum>
> > > <https://code.activestate.com/recipes/393090/>
> > >
> >
> > https://en.wikipedia.org/wiki/IEEE_754
>
> I believe the definition of "accurate" here is that, if you take all
> of the real numbers represented by those floats, add them all together
> with mathematical accuracy, and then take the nearest representable
> float, that will be the exact value that fsum will return. In other
> words, its accuracy is exactly as good as the final result can be.

It's as good as it can be if the result must fit into a single float.
Actually the algorithm itself maintains an exact result for the sum
internally using a list of floats whose exact sum is the same as that
of the input list. In essence it compresses a large list of floats to
a small list of say 2 or 3 floats while preserving the exact value of
the sum.

Unfortunately fsum does not give any way to access the internal exact
list, so using fsum repeatedly suffers from the same problems as plain
float arithmetic, e.g.:
>>> x = 10**20
>>> fsum([fsum([1, x]), -x])
0.0
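
If you really need to carry the exact running value across calls, you
can maintain the list of partials yourself. A rough sketch of that
accumulation (essentially the approach of the ActiveState recipe linked
earlier in the thread, so take it as illustrative, not as fsum's actual
implementation):

import math

def add_partials(xs, partials=None):
    # keep a list of non-overlapping floats whose exact sum equals the
    # exact sum of everything fed in so far
    if partials is None:
        partials = []
    for x in xs:
        i = 0
        for y in partials:
            if abs(x) < abs(y):
                x, y = y, x
            hi = x + y
            lo = y - (hi - x)    # exact error of the rounded addition
            if lo:
                partials[i] = lo
                i += 1
            x = hi
        partials[i:] = [x]
    return partials

p = add_partials([1.0, 10.0**20])
p = add_partials([-10.0**20], p)
print(math.fsum(p))   # 1.0 -- the 1.0 was never thrown away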

--
Oscar
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, Sep 2, 2021 at 2:27 PM Chris Angelico <rosuav@gmail.com> wrote:

> On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
> >
> > Hope Rouselle <hrouselle@jevedi.com> writes:
> >
> > > Just sharing a case of floating-point numbers. Nothing needed to be
> > > solved or to be figured out. Just bringing up conversation.
> > >
> > > (*) An introduction to me
> > >
> > > I don't understand floating-point numbers from the inside out, but I do
> > > know how to work with base 2 and scientific notation. So the idea of
> > > expressing a number as
> > >
> > > mantissa * base^{power}
> > >
> > > is not foreign to me. (If that helps you to perhaps instruct me on
> > > what's going on here.)
> > >
> > > (*) A presentation of the behavior
> > >
> > >>>> import sys
> > >>>> sys.version
> > > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> > > bit (AMD64)]'
> > >
> > >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> > >>>> sum(ls)
> > > 39.599999999999994
> > >
> > >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> > >>>> sum(ls)
> > > 39.60000000000001
> > >
> > > All I did was to take the first number, 7.23, and move it to the last
> > > position in the list. (So we have a violation of the commutativity of
> > > addition.)
> >
> > Suppose these numbers are prices in dollar, never going beyond cents.
> > Would it be safe to multiply each one of them by 100 and therefore work
> > with cents only? For instance
>
> Yes and no. It absolutely *is* safe to always work with cents, but to
> do that, you have to be consistent: ALWAYS work with cents, never with
> floating point dollars.
>
> (Or whatever other unit you choose to use. Most currencies have a
> smallest-normally-used-unit, with other currency units (where present)
> being whole number multiples of that minimal unit. Only in forex do
> you need to concern yourself with fractional cents or fractional yen.)
>
> But multiplying a set of floats by 100 won't necessarily solve your
> problem; you may have already fallen victim to the flaw of assuming
> that the numbers are represented accurately.
>
> > --8<---------------cut here---------------start------------->8---
> > >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> > >>> sum(map(lambda x: int(x*100), ls)) / 100
> > 39.6
> >
> > >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> > >>> sum(map(lambda x: int(x*100), ls)) / 100
> > 39.6
> > --8<---------------cut here---------------end--------------->8---
> >
> > Or multiplication by 100 isn't quite ``safe'' to do with floating-point
> > numbers either? (It worked in this case.)
>
> You're multiplying and then truncating, which risks a round-down
> error. Try adding a half onto them first:
>
> int(x * 100 + 0.5)
>
> But that's still not a perfect guarantee. Far safer would be to
> consider monetary values to be a different type of value, not just a
> raw number. For instance, the value $7.23 could be stored internally
> as the integer 723, but you also know that it's a value in USD, not a
> simple scalar. It makes perfect sense to add USD+USD, it makes perfect
> sense to multiply USD*scalar, but it doesn't make sense to multiply
> USD*USD.
>
> > I suppose that if I multiply it by a power of two, that would be an
> > operation that I can be sure will not bring about any precision loss
> > with floating-point numbers. Do you agree?
>
> Assuming you're nowhere near 2**53, yes, that would be safe. But so
> would multiplying by a power of five. The problem isn't precision loss
> from the multiplication - the problem is that your input numbers
> aren't what you think they are. That number 7.23, for instance, is
> really....
>
> >>> 7.23.as_integer_ratio()
> (2035064081618043, 281474976710656)
>
> ... the rational number 2035064081618043 / 281474976710656, which is
> very close to 7.23, but not exactly so. (The numerator would have to
> be ...8042.88 to be exactly correct.) There is nothing you can do at
> this point to regain the precision, although a bit of multiplication
> and rounding can cheat it and make it appear as if you did.
>
> Floating point is a very useful approximation to real numbers, but
> real numbers aren't the best way to represent financial data. Integers
> are.
>
>
Hmmmmmmm - - - I would suggest that you haven't looked into
taxation yet!
In taxation you get a rational number (the tax rate) that MUST be
multiplied by the amount in currency.
The error rate here is stupendous.
Some organizations track each transaction with its taxes rounded.
Others track the untaxed amounts and then calculate the taxes on the
whole (and when you have 2 or 3 or 4 taxes -- dunno about more, but who
knows, there are some seriously tax-loving jurisdictions out there) the
difference between adding the amounts and then calculating the taxes,
versus calculating the taxes on each amount and then adding all the
items together, can be 'interesting'.

So financial data MUST be able to handle rational numbers.
(I have been bitten by the differences enumerated above!)

Regards
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sat, Sep 4, 2021 at 12:08 AM o1bigtenor <o1bigtenor@gmail.com> wrote:
> Hmmmmmmm - - - ZI would suggest that you haven't looked into
> taxation yet!
> In taxation you get a rational number that MUST be multiplied by
> the amount in currency.

(You can, of course, multiply a currency amount by any scalar. Just
not by another currency amount.)

> The error rate here is stupendous.
> Some organizations track each transaction with its taxes rounded.
> Then some track using use untaxed and then calculate the taxes
> on the whole (when you have 2 or 3 or 4 (dunno about more but
> who knows there are some seriously tax loving jurisdictions out there))
> the differences between adding amounts and then calculating taxes
> and calculating taxes on each amount and then adding all items
> together can have some 'interesting' differences.
>
> So financial data MUST be able to handle rational numbers.
> (I have been bit by the differences enumerated in the previous!)

The worst problem is knowing WHEN to round. Sometimes you have to do
intermediate rounding in order to make something agree with something
else :(

But if you need finer resolution than the cent, I would still
recommend trying to use fixed-point arithmetic. The trouble is
figuring out exactly how much precision you need. Often, 1c precision
is actually sufficient.
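
To make the "when to round" problem concrete, here's a sketch using the
figures from earlier in the thread and a made-up 13% rate -- tax rounded
per line item versus tax computed once on the total:

from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
rate = Decimal("0.13")   # hypothetical rate, purely for illustration
prices = [Decimal(p) for p in ("7.23", "8.41", "6.15", "2.31", "7.73", "7.77")]

per_line = sum((p * rate).quantize(CENT, rounding=ROUND_HALF_UP) for p in prices)
on_total = (sum(prices) * rate).quantize(CENT, rounding=ROUND_HALF_UP)
print(per_line, on_total)   # 5.14 vs 5.15 -- a one-cent disagreement

Both answers are "correct"; which one the books require is the
finance/legal question, not the programming one.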

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2021-09-03 16:13, Chris Angelico wrote:
> On Sat, Sep 4, 2021 at 12:08 AM o1bigtenor <o1bigtenor@gmail.com> wrote:
>> Hmmmmmmm - - - ZI would suggest that you haven't looked into
>> taxation yet!
>> In taxation you get a rational number that MUST be multiplied by
>> the amount in currency.
>
> (You can, of course, multiply a currency amount by any scalar. Just
> not by another currency amount.)
>
>> The error rate here is stupendous.
>> Some organizations track each transaction with its taxes rounded.
>> Then some track using use untaxed and then calculate the taxes
>> on the whole (when you have 2 or 3 or 4 (dunno about more but
>> who knows there are some seriously tax loving jurisdictions out there))
>> the differences between adding amounts and then calculating taxes
>> and calculating taxes on each amount and then adding all items
>> together can have some 'interesting' differences.
>>
>> So financial data MUST be able to handle rational numbers.
>> (I have been bit by the differences enumerated in the previous!)
>
> The worst problem is knowing WHEN to round. Sometimes you have to do
> intermediate rounding in order to make something agree with something
> else :(
>
> But if you need finer resolution than the cent, I would still
> recommend trying to use fixed-point arithmetic. The trouble is
> figuring out exactly how much precision you need. Often, 1c precision
> is actually sufficient.
>
At some point, some finance/legal person has to specify how any
fractional currency should be handled.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, 2 Sep 2021 07:54:27 -0700 (PDT), Julio Di Egidio wrote:
> On Thursday, 2 September 2021 at 16:51:24 UTC+2, Christian Gollwitzer wrote:
>> Am 02.09.21 um 16:49 schrieb Julio Di Egidio:
>> > On Thursday, 2 September 2021 at 16:41:38 UTC+2, Peter Pearson wrote:
>> >> On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
>> >
>> >>> 39.60000000000001
>> >>
>> >> Welcome to the exciting world of roundoff error:
>> >
>> > Welcome to the exiting world of Usenet.
>> >
>> > *Plonk*
>>
>> Pretty harsh, isn't it? He gave a concise example of the same inaccuracy
>> right afterwards.
>
> And I thought you were not seeing my posts...
>
> Given that I have already given a full explanation, you guys, that you
> realise it or not, are simply adding noise for the usual pub-level
> discussion I must most charitably guess.
>
> Anyway, just my opinion. (EOD.)


Although we are in the world of Usenet, comp.lang.python is by
no means typical of Usenet. This is a positive, helpful, welcoming
community in which "Plonk", "EOD", and "RTFM" (appearing in another
post) are seldom seen, and in which I have never before seen the
suggestion that everybody else should be silent so that the silver
voice of the chosen one can be heard.


--
To email me, substitute nowhere->runbox, invalid->com.
--
https://mail.python.org/mailman/listinfo/python-list
RE: on floating-point numbers [ In reply to ]
What's really going on is that you are printing out more digits than you are entitled to. 39.60000000000001 shows 16 significant decimal digits; carrying 16 decimal digits exactly needs about 54 bits of mantissa (16 * log2(10) is roughly 53.2), at least as I calculate it.

Double precision floating point has 52 bits in the mantissa, plus one assumed due to normalization. So 53 bits.

The actual minor difference between the two sums is that changing the order of the additions changes which of the last few bits of the mantissa get rounded away at each step.

I recommend that you print out double precision values to at most 14 digits. Then you will never see this kind of issue. If you don't like that suggestion, you can create your own floating point representation using a Python integer as the mantissa, so it can grow as large as you have memory to represent the value; and a sign and an exponent. It would be slow, but it could have much more accuracy (if implemented to preserve accuracy).
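
For example (a quick check -- the exact format spec is your choice):

a = sum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
b = sum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
print(a, b)                        # 39.599999999999994 39.60000000000001
print(f"{a:.14g}", f"{b:.14g}")    # 39.6 39.6 -- the disagreement is below 14 digits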

By the way, this is why banks and other financial institutions use BCD (binary coded decimal). They cannot tolerate sums that have fraction of a cent errors.

I should also point out another float issue: subtractive cancellation. Try 1e14 + 0.1 - 1e14. The result clearly should be 0.1, but it won't be. That's because 0.1 cannot be represented exactly in binary, and in the intermediate sum it survives only in the bottom few bits of the mantissa. I just tried it: I got 0.09375. This is not a Python issue; it is a well-known issue when using binary floating point. So, when you sum a large array of data, to avoid these issues, you could either
1) sort the data smallest to largest ... may be helpful, but maybe not.
2) Create multiple sums of a few of the values. Next layer: Sum a few of the sums. Top layer: Sum the sum of sums to get the final sum. This is much more likely to work accurately than adding up all the values in one summation except the last, and then adding the last (which could be a relatively small value).
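
A sketch of option 2, a balanced "sum of sums" (pairwise) summation --
only to show the shape of it, not a vetted routine:

def pairwise_sum(values):
    vals = list(values)
    if not vals:
        return 0.0
    while len(vals) > 1:
        paired = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:          # odd element rides along to the next round
            paired.append(vals[-1])
        vals = paired
    return vals[0]

It won't rescue the subtractive-cancellation example above, but for long
lists of similar-magnitude values the accumulated roundoff grows with the
number of levels (about log2 n) rather than with n.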

--- Joseph S.







Teledyne Confidential; Commercially Sensitive Business Data

-----Original Message-----
From: Hope Rouselle <hrouselle@jevedi.com>
Sent: Thursday, September 2, 2021 9:51 AM
To: python-list@python.org
Subject: on floating-point numbers

Just sharing a case of floating-point numbers. Nothing needed to be solved or to be figured out. Just bringing up conversation.

(*) An introduction to me

I don't understand floating-point numbers from the inside out, but I do know how to work with base 2 and scientific notation. So the idea of expressing a number as

mantissa * base^{power}

is not foreign to me. (If that helps you to perhaps instruct me on what's going on here.)

(*) A presentation of the behavior

>>> import sys
>>> sys.version
'3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'

>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
39.599999999999994

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
39.60000000000001

All I did was to take the first number, 7.23, and move it to the last position in the list. (So we have a violation of the commutativity of
addition.)

Let me try to reduce the example. It's not so easy. Although I could display the violation of commutativity by moving just a single number in the list, I also see that 7.23 commutes with every other number in the list.

(*) My request

I would like to just get some clarity. I guess I need to translate all these numbers into base 2 and perform the addition myself to see the situation coming up?
--
https://mail.python.org/mailman/listinfo/python-list
RE: on floating-point numbers [ In reply to ]
Actually, Python has an fsum function meant to address this issue.

>>> math.fsum([1e14, 1, -1e14])
1.0
>>>

Wow it works.

--- Joseph S.

Teledyne Confidential; Commercially Sensitive Business Data

-----Original Message-----
From: Hope Rouselle <hrouselle@jevedi.com>
Sent: Thursday, September 2, 2021 9:51 AM
To: python-list@python.org
Subject: on floating-point numbers

Just sharing a case of floating-point numbers. Nothing needed to be solved or to be figured out. Just bringing up conversation.

(*) An introduction to me

I don't understand floating-point numbers from the inside out, but I do know how to work with base 2 and scientific notation. So the idea of expressing a number as

mantissa * base^{power}

is not foreign to me. (If that helps you to perhaps instruct me on what's going on here.)

(*) A presentation of the behavior

>>> import sys
>>> sys.version
'3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'

>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
39.599999999999994

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
39.60000000000001

All I did was to take the first number, 7.23, and move it to the last position in the list. (So we have a violation of the commutativity of
addition.)

Let me try to reduce the example. It's not so easy. Although I could display the violation of commutativity by moving just a single number in the list, I also see that 7.23 commutes with every other number in the list.

(*) My request

I would like to just get some clarity. I guess I need to translate all these numbers into base 2 and perform the addition myself to see the situation coming up?
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Il 03/09/2021 14:45, Chris Angelico ha scritto:
> I believe the definition of "accurate" here is that, if you take all
> of the real numbers represented by those floats, add them all together
> with mathematical accuracy, and then take the nearest representable
> float, that will be the exact value that fsum will return. In other
> words, its accuracy is exactly as good as the final result can be.

Yup, I agree, and that is the reason for the link.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 3/09/21 8:11 pm, Christian Gollwitzer wrote:
> Unless you have special numbers like NaN or signed zeros etc., a+b=b+a
> and a*b=b*a holds also for floats.

The only exception I'm aware of is for NaNs, and it's kind of pedantic:
you can't say that x + NaN == NaN + x, but only because NaNs never
compare equal. You still get a NaN either way, so for all practical
purposes it's commutative.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list
