Mailing List Archive

on floating-point numbers
Just sharing a case of floating-point numbers. Nothing needed to be
solved or to be figured out. Just bringing up conversation.

(*) An introduction to me

I don't understand floating-point numbers from the inside out, but I do
know how to work with base 2 and scientific notation. So the idea of
expressing a number as

mantissa * base^{power}

is not foreign to me. (If that helps you to perhaps instruct me on
what's going on here.)

(*) A presentation of the behavior

>>> import sys
>>> sys.version
'3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'

>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
39.599999999999994

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
39.60000000000001

All I did was to take the first number, 7.23, and move it to the last
position in the list. (So we have a violation of the commutativity of
addition.)

Let me try to reduce the example. It's not so easy. Although I could
display the violation of commutativity by moving just a single number in
the list, I also see that 7.23 commutes with every other number in the
list.

(*) My request

I would like to just get some clarity. I guess I need to translate all
these numbers into base 2 and perform the addition myself to see the
situation coming up?
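
(If it helps, maybe Python itself can do the base-2 translation for me.
A little sketch, just using float.as_integer_ratio, whose denominators
are always powers of two for finite floats:)

--8<---------------cut here---------------start------------->8---
ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
for x in ls:
    num, den = x.as_integer_ratio()
    # den is a power of two, so num/den is the exact base-2 value stored
    print(f"{x} = {num} / 2**{den.bit_length() - 1}")
--8<---------------cut here---------------end--------------->8---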
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
>
>>>> import sys
>>>> sys.version
> '3.8.10 (tags/...
>
>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
> 39.599999999999994
>
>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> sum(ls)
> 39.60000000000001


Welcome to the exciting world of roundoff error:

Python 3.5.3 (default, Jul 9 2020, 13:00:10)
[GCC 6.3.0 20170516] on linux

>>> 0.1 + 0.2 + 9.3 == 0.1 + 9.3 + 0.2
False
>>>


--
To email me, substitute nowhere->runbox, invalid->com.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 02.09.21 15:51, Hope Rouselle wrote:
> Just sharing a case of floating-point numbers. Nothing needed to be
> solved or to be figured out. Just bringing up conversation.
>
> (*) An introduction to me
>
> I don't understand floating-point numbers from the inside out, but I do
> know how to work with base 2 and scientific notation. So the idea of
> expressing a number as
>
> mantissa * base^{power}
>
> is not foreign to me. (If that helps you to perhaps instruct me on
> what's going on here.)
>
> (*) A presentation of the behavior
>
>>>> import sys
>>>> sys.version
> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'
>
>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
> 39.599999999999994
>
>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> sum(ls)
> 39.60000000000001
>
> All I did was to take the first number, 7.23, and move it to the last
> position in the list. (So we have a violation of the commutativity of
> addition.)

I believe it is not commutativity, but associativity, that is violated.
Even for floating point, a+b=b+a except for maybe some extreme cases
like denormalized numbers etc.

But in general (a+b)+c != a+ (b+c)

Consider decimal floating point with 2 digits.
a=1
b=c=0.04

Then you get the LHS:
(1 + 0.04) + 0.04 = 1 + 0.04 = 1

RHS:

1 + (0.04 + 0.04) = 1 + 0.08 = 1.1


Your sum is evaluated like (((a + b) + c) + ....) and hence, if you
permute the numbers, it can be unequal. If you need better accuracy,
there is the Kahan summation algorithm and other alternatives:
https://en.wikipedia.org/wiki/Kahan_summation_algorithm
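
For illustration, here is a minimal Python sketch of Kahan (compensated)
summation -- the textbook algorithm, not what sum() or math.fsum()
actually does internally:

def kahan_sum(numbers):
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in numbers:
        y = x - compensation            # re-inject what was lost earlier
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover exactly what was just lost
        total = t
    return total

It is not guaranteed to be order-independent, but it keeps the
accumulated rounding error much smaller than naive left-to-right
addition.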


Christian
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 02.09.21 16:49, Julio Di Egidio wrote:
> On Thursday, 2 September 2021 at 16:41:38 UTC+2, Peter Pearson wrote:
>> On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
>
>>> 39.60000000000001
>>
>> Welcome to the exciting world of roundoff error:
>
> Welcome to the exiting world of Usenet.
>
> *Plonk*

Pretty harsh, isn't it? He gave a concise example of the same inaccuracy
right afterwards.

Christian
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Hope Rouselle <hrouselle@jevedi.com> writes:

> Just sharing a case of floating-point numbers. Nothing needed to be
> solved or to be figured out. Just bringing up conversation.
>
> (*) An introduction to me
>
> I don't understand floating-point numbers from the inside out, but I do
> know how to work with base 2 and scientific notation. So the idea of
> expressing a number as
>
> mantissa * base^{power}
>
> is not foreign to me. (If that helps you to perhaps instruct me on
> what's going on here.)
>
> (*) A presentation of the behavior
>
>>>> import sys
>>>> sys.version
> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> bit (AMD64)]'
>
>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
> 39.599999999999994
>
>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> sum(ls)
> 39.60000000000001
>
> All I did was to take the first number, 7.23, and move it to the last
> position in the list. (So we have a violation of the commutativity of
> addition.)

Suppose these numbers are prices in dollars, never going beyond cents.
Would it be safe to multiply each one of them by 100 and therefore work
with cents only? For instance

--8<---------------cut here---------------start------------->8---
>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(map(lambda x: int(x*100), ls)) / 100
39.6

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(map(lambda x: int(x*100), ls)) / 100
39.6
--8<---------------cut here---------------end--------------->8---

Or multiplication by 100 isn't quite ``safe'' to do with floating-point
numbers either? (It worked in this case.)

I suppose that if I multiply it by a power of two, that would be an
operation that I can be sure will not bring about any precision loss
with floating-point numbers. Do you agree?
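
(For instance, a quick check of that intuition -- scaling by a power of
two and back should round-trip exactly, barring overflow, since it only
changes the exponent and not the mantissa:

>>> x = 7.23
>>> x * 2**10 / 2**10 == x
True

)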
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, Sep 3, 2021 at 4:29 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
>
> Just sharing a case of floating-point numbers. Nothing needed to be
> solved or to be figured out. Just bringing up conversation.
>
> (*) An introduction to me
>
> I don't understand floating-point numbers from the inside out, but I do
> know how to work with base 2 and scientific notation. So the idea of
> expressing a number as
>
> mantissa * base^{power}
>
> is not foreign to me. (If that helps you to perhaps instruct me on
> what's going on here.)
>
> (*) A presentation of the behavior
>
> >>> import sys
> >>> sys.version
> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'
>
> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>> sum(ls)
> 39.599999999999994
>
> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>> sum(ls)
> 39.60000000000001
>
> All I did was to take the first number, 7.23, and move it to the last
> position in the list. (So we have a violation of the commutativity of
> addition.)
>

It's not about the commutativity of any particular pair of operands -
that's always guaranteed. What you're seeing here is the results of
intermediate rounding. Try this:

>>> def sum(stuff):
... total = 0
... for thing in stuff:
... total += thing
... print(thing, "-->", total)
... return total
...
>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
7.23 --> 7.23
8.41 --> 15.64
6.15 --> 21.79
2.31 --> 24.099999999999998
7.73 --> 31.83
7.77 --> 39.599999999999994
39.599999999999994
>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
8.41 --> 8.41
6.15 --> 14.56
2.31 --> 16.87
7.73 --> 24.6
7.77 --> 32.370000000000005
7.23 --> 39.60000000000001
39.60000000000001
>>>

Nearly all floating-point confusion stems from an assumption that the
input values are exact. They usually aren't. Consider:

>>> from fractions import Fraction
>>> for n in ls: print(n, Fraction(*n.as_integer_ratio()))
...
8.41 2367204554136617/281474976710656
6.15 3462142213541069/562949953421312
2.31 5201657569612923/2251799813685248
7.73 2175801569973371/281474976710656
7.77 2187060569041797/281474976710656
7.23 2035064081618043/281474976710656

Those are the ACTUAL values you're adding. Do the same exercise with
the partial sums, and see where the rounding happens. It's probably
happening several times, in fact.
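
Here's a sketch of that exercise (my own illustration, using the stdlib
fractions module): keep an exact running total alongside the float one,
and flag every step where they diverge:

from fractions import Fraction

ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
total = 0.0          # what float addition actually accumulates
exact = Fraction(0)  # the mathematically exact sum of the stored values
for n in ls:
    total += n
    exact += Fraction(*n.as_integer_ratio())
    if Fraction(*total.as_integer_ratio()) != exact:
        print(n, "--> this step rounded")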

The naive summation algorithm used by sum() is compatible with a
variety of different data types - even lists, although it's documented
as being intended for numbers - but if you know for sure that you're
working with floats, there's a more accurate algorithm available to
you.

>>> import math
>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
39.6
>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
39.6

It seeks to minimize loss to repeated rounding and is, I believe,
independent of data order.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
>
> Hope Rouselle <hrouselle@jevedi.com> writes:
>
> > Just sharing a case of floating-point numbers. Nothing needed to be
> > solved or to be figured out. Just bringing up conversation.
> >
> > (*) An introduction to me
> >
> > I don't understand floating-point numbers from the inside out, but I do
> > know how to work with base 2 and scientific notation. So the idea of
> > expressing a number as
> >
> > mantissa * base^{power}
> >
> > is not foreign to me. (If that helps you to perhaps instruct me on
> > what's going on here.)
> >
> > (*) A presentation of the behavior
> >
> >>>> import sys
> >>>> sys.version
> > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> > bit (AMD64)]'
> >
> >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>>> sum(ls)
> > 39.599999999999994
> >
> >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>>> sum(ls)
> > 39.60000000000001
> >
> > All I did was to take the first number, 7.23, and move it to the last
> > position in the list. (So we have a violation of the commutativity of
> > addition.)
>
> Suppose these numbers are prices in dollar, never going beyond cents.
> Would it be safe to multiply each one of them by 100 and therefore work
> with cents only? For instance

Yes and no. It absolutely *is* safe to always work with cents, but to
do that, you have to be consistent: ALWAYS work with cents, never with
floating point dollars.

(Or whatever other unit you choose to use. Most currencies have a
smallest-normally-used-unit, with other currency units (where present)
being whole number multiples of that minimal unit. Only in forex do
you need to concern yourself with fractional cents or fractional yen.)

But multiplying a set of floats by 100 won't necessarily solve your
problem; you may have already fallen victim to the flaw of assuming
that the numbers are represented accurately.

> --8<---------------cut here---------------start------------->8---
> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>> sum(map(lambda x: int(x*100), ls)) / 100
> 39.6
>
> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>> sum(map(lambda x: int(x*100), ls)) / 100
> 39.6
> --8<---------------cut here---------------end--------------->8---
>
> Or multiplication by 100 isn't quite ``safe'' to do with floating-point
> numbers either? (It worked in this case.)

You're multiplying and then truncating, which risks a round-down
error. Try adding a half onto them first:

int(x * 100 + 0.5)

But that's still not a perfect guarantee. Far safer would be to
consider monetary values to be a different type of value, not just a
raw number. For instance, the value $7.23 could be stored internally
as the integer 723, but you also know that it's a value in USD, not a
simple scalar. It makes perfect sense to add USD+USD, it makes perfect
sense to multiply USD*scalar, but it doesn't make sense to multiply
USD*USD.
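
A minimal sketch of that idea (the class name and the rounding policy
are just placeholders, not a recommendation for production code):

class USD:
    """A dollar amount stored exactly as an integer number of cents."""
    def __init__(self, cents):
        self.cents = int(cents)
    def __add__(self, other):
        if not isinstance(other, USD):
            return NotImplemented  # refuse USD + bare number
        return USD(self.cents + other.cents)
    def __mul__(self, scalar):
        if isinstance(scalar, USD):
            raise TypeError("USD * USD has no meaning")
        return USD(round(self.cents * scalar))
    def __repr__(self):
        return "$%d.%02d" % divmod(self.cents, 100)

With this, USD(723) + USD(841) displays as $15.64, while USD(723) *
USD(841) raises TypeError.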

> I suppose that if I multiply it by a power of two, that would be an
> operation that I can be sure will not bring about any precision loss
> with floating-point numbers. Do you agree?

Assuming you're nowhere near 2**53, yes, that would be safe. But so
would multiplying by a power of five. The problem isn't precision loss
from the multiplication - the problem is that your input numbers
aren't what you think they are. That number 7.23, for instance, is
really....

>>> 7.23.as_integer_ratio()
(2035064081618043, 281474976710656)

... the rational number 2035064081618043 / 281474976710656, which is
very close to 7.23, but not exactly so. (The numerator would have to
be ...8042.88 to be exactly correct.) There is nothing you can do at
this point to regain the precision, although a bit of multiplication
and rounding can cheat it and make it appear as if you did.

Floating point is a very useful approximation to real numbers, but
real numbers aren't the best way to represent financial data. Integers
are.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle <hrouselle@jevedi.com>
declaimed the following:


>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
>39.599999999999994
>
>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>> sum(ls)
>39.60000000000001
>
>All I did was to take the first number, 7.23, and move it to the last
>position in the list. (So we have a violation of the commutativity of
>addition.)
>

https://www.amazon.com/Real-Computing-Made-Engineering-Calculations-dp-B01K0Q03AA/dp/B01K0Q03AA/ref=mt_other?_encoding=UTF8&me=&qid=


--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, 02 Sep 2021 12:08:21 -0300, Hope Rouselle <hrouselle@jevedi.com>
declaimed the following:


>Suppose these numbers are prices in dollar, never going beyond cents.
>Would it be safe to multiply each one of them by 100 and therefore work
>with cents only? For instance
>

A lot of software with a "monetary" data type uses scaled INTEGERS for
that... M$ Excel uses four decimal places, internally scaled.

The Ada language has both FIXED and FLOAT data types; for FIXED one
specifies the delta between adjacent values that must be met (the compiler
is free to use something with more resolution internally).

Money should never be treated as a floating value.

>I suppose that if I multiply it by a power of two, that would be an
>operation that I can be sure will not bring about any precision loss
>with floating-point numbers. Do you agree?

Are we talking IEEE floats? Or some of the ancient formats used for
computers that may not have had hardware floating point units, or predate
the IEEE standard.

Normalized with suppressed leading bit? (If normalization always puts
the most significant bit at the binary point, why store that bit? Shift
it out and gain another bit at the small end.)

Xerox Sigma floats used an exponent based on radix 16. A normalized
mantissa could have up to three leading 0 bits.

Motorola Fast Floating Point (software float implementation used on
base Amiga systems -- the exponent was in the low byte)


--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <rosuav@gmail.com>
declaimed the following:

>
>The naive summation algorithm used by sum() is compatible with a
>variety of different data types - even lists, although it's documented
>as being intended for numbers - but if you know for sure that you're
>working with floats, there's a more accurate algorithm available to
>you.
>
>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
>39.6
>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
>39.6
>
>It seeks to minimize loss to repeated rounding and is, I believe,
>independent of data order.
>

Most likely it sorts the data so the smallest values get summed first,
and works its way up to the larger values. That way it minimizes the losses
that occur when denormalizing a value (to set the exponent equal to that of
the next larger value).


--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2021-09-02, Hope Rouselle <hrouselle@jevedi.com> wrote:

> Suppose these numbers are prices in dollar, never going beyond cents.
> Would it be safe to multiply each one of them by 100 and therefore work
> with cents only?

The _practical_ answer is that no, it's not safe to use floating point
when doing normal bookkeeping-type stuff with money. At least not if
you want everything to balance correctly at the end of the day (week,
month, quarter, year or etc.). Use integer cents, or mills or
whatever. If you have to use floating point to calculate a payment or
credit/debit amount, always round or truncate the result back to an
integer value in your chosen units before actually using that amount
for anything.

In theory, decimal floating point should be usable, but I've never
personally worked with it. Back in the day (1980's) microcomputers
didn't have floating point hardware, and many compilers allowed you to
choose between base-2 floating point and base-10 (BCD) floating
point. The idea was that if you were doing financial stuff, you could
use BCD floating point.

--
Grant
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
>
> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <rosuav@gmail.com>
> declaimed the following:
>
> >
> >The naive summation algorithm used by sum() is compatible with a
> >variety of different data types - even lists, although it's documented
> >as being intended for numbers - but if you know for sure that you're
> >working with floats, there's a more accurate algorithm available to
> >you.
> >
> >>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
> >39.6
> >>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
> >39.6
> >
> >It seeks to minimize loss to repeated rounding and is, I believe,
> >independent of data order.
> >
>
> Most likely it sorts the data so the smallest values get summed first,
> and works its way up to the larger values. That way it minimizes the losses
> that occur when denormalizing a value (to set the exponent equal to that of
> the next larger value).
>

I'm not sure, but that sounds familiar. It doesn't really matter
though - the docs just say that it is an "accurate floating point
sum", so the precise algorithm is an implementation detail.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2/09/2021 17:08, Hope Rouselle wrote:
> >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>>> sum(ls)
> > 39.599999999999994
> >
> >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>>> sum(ls)
> > 39.60000000000001
> >
> > All I did was to take the first number, 7.23, and move it to the last
> > position in the list. (So we have a violation of the commutativity of
> > addition.)
>
> Suppose these numbers are prices in dollar, never going beyond cents.
> Would it be safe to multiply each one of them by 100 and therefore work
> with cents only?
For working with monetary values, or any value that needs accurate
correspondence to decimal (base-10) values, it's best to use Python's
Decimal; see the documentation: https://docs.python.org/3.8/library/decimal.html

Example:

from decimal import Decimal as D
ls1 = [D('7.23'), D('8.41'), D('6.15'), D('2.31'), D('7.73'), D('7.77')]
ls2 = [D('8.41'), D('6.15'), D('2.31'), D('7.73'), D('7.77'), D('7.23')]
print(sum(ls1), sum(ls2))

Output:
39.60 39.60

(Note that I initialized the values from strings instead of numbers, to
give Decimal access to the exact number without it first being
converted to a float that doesn't necessarily correspond exactly to the
decimal value.)
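
To see why the strings matter, compare (a small sketch; the second
print shows the long exact decimal expansion of the float nearest to
7.23, which I won't reproduce here):

from decimal import Decimal
print(Decimal('7.23'))  # exactly 7.23
print(Decimal(7.23))    # the exact value of the float 7.23: close to, but not, 7.23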

--
"Your scientists were so preoccupied with whether they could, they didn't
stop to think if they should"
-- Dr. Ian Malcolm

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 02.09.21 21:02, Julio Di Egidio wrote:
> On Thursday, 2 September 2021 at 20:43:36 UTC+2, Chris Angelico wrote:
>> On Fri, Sep 3, 2021 at 4:29 AM Hope Rouselle <hrou...@jevedi.com> wrote:
>
>>> All I did was to take the first number, 7.23, and move it to the last
>>> position in the list. (So we have a violation of the commutativity of
>>> addition.)
>>>
>> It's not about the commutativity of any particular pair of operands -
>> that's always guaranteed.
>
> Nope, that is rather *not* guaranteed, as I have quite explained up thread.
>

No, you haven't explained that. You linked to the famous Goldberg paper.
Where in the paper does it say that operations on floats are not
commutative?

I'd be surprised because it is generally wrong.
Unless you have special numbers like NaN or signed zeros etc., a+b=b+a
and a*b=b*a holds also for floats.

Christian
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 03/09/2021 09:07, Julio Di Egidio wrote:
> On Friday, 3 September 2021 at 01:22:28 UTC+2, Chris Angelico wrote:
>> On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
>>> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <ros...@gmail.com>
>>> declaimed the following:
>>>
>>>> The naive summation algorithm used by sum() is compatible with a
>>>> variety of different data types - even lists, although it's documented
>>>> as being intended for numbers - but if you know for sure that you're
>>>> working with floats, there's a more accurate algorithm available to
>>>> you.
>>>>
>>>>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
>>>> 39.6
>>>>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
>>>> 39.6
>>>>
>>>> It seeks to minimize loss to repeated rounding and is, I believe,
>>>> independent of data order.
>>>
>>> Most likely it sorts the data so the smallest values get summed first,
>>> and works its way up to the larger values. That way it minimizes the losses
>>> that occur when denormalizing a value (to set the exponent equal to that of
>>> the next larger value).
>>>
>> I'm not sure, but that sounds familiar. It doesn't really matter
>> though - the docs just say that it is an "accurate floating point
>> sum", so the precise algorithm is an implementation detail.
>
> The docs are quite misleading there, it is not accurate without further qualifications.
>
> <https://docs.python.org/3.8/library/math.html#math.fsum>
> <https://code.activestate.com/recipes/393090/>
>
> That said, fucking pathetic, when Dunning-Kruger is a compliment...
>
> *Plonk*
>
> Julio
>

https://en.wikipedia.org/wiki/IEEE_754
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, Sep 3, 2021 at 10:42 PM jak <nospam@please.ty> wrote:
>
> On 03/09/2021 09:07, Julio Di Egidio wrote:
> > On Friday, 3 September 2021 at 01:22:28 UTC+2, Chris Angelico wrote:
> >> On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
> >>> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <ros...@gmail.com>
> >>> declaimed the following:
> >>>
> >>>> The naive summation algorithm used by sum() is compatible with a
> >>>> variety of different data types - even lists, although it's documented
> >>>> as being intended for numbers - but if you know for sure that you're
> >>>> working with floats, there's a more accurate algorithm available to
> >>>> you.
> >>>>
> >>>>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
> >>>> 39.6
> >>>>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
> >>>> 39.6
> >>>>
> >>>> It seeks to minimize loss to repeated rounding and is, I believe,
> >>>> independent of data order.
> >>>
> >>> Most likely it sorts the data so the smallest values get summed first,
> >>> and works its way up to the larger values. That way it minimizes the losses
> >>> that occur when denormalizing a value (to set the exponent equal to that of
> >>> the next larger value).
> >>>
> >> I'm not sure, but that sounds familiar. It doesn't really matter
> >> though - the docs just say that it is an "accurate floating point
> >> sum", so the precise algorithm is an implementation detail.
> >
> > The docs are quite misleading there, it is not accurate without further qualifications.
> >
> > <https://docs.python.org/3.8/library/math.html#math.fsum>
> > <https://code.activestate.com/recipes/393090/>
> >
>
> https://en.wikipedia.org/wiki/IEEE_754

I believe the definition of "accurate" here is that, if you take all
of the real numbers represented by those floats, add them all together
with mathematical accuracy, and then take the nearest representable
float, that will be the exact value that fsum will return. In other
words, its accuracy is exactly as good as the final result can be.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Fri, 3 Sept 2021 at 13:48, Chris Angelico <rosuav@gmail.com> wrote:
>
> On Fri, Sep 3, 2021 at 10:42 PM jak <nospam@please.ty> wrote:
> >
> > On 03/09/2021 09:07, Julio Di Egidio wrote:
> > > On Friday, 3 September 2021 at 01:22:28 UTC+2, Chris Angelico wrote:
> > >> On Fri, Sep 3, 2021 at 8:15 AM Dennis Lee Bieber <wlf...@ix.netcom.com> wrote:
> > >>> On Fri, 3 Sep 2021 04:43:02 +1000, Chris Angelico <ros...@gmail.com>
> > >>> declaimed the following:
> > >>>
> > >>>> The naive summation algorithm used by sum() is compatible with a
> > >>>> variety of different data types - even lists, although it's documented
> > >>>> as being intended for numbers - but if you know for sure that you're
> > >>>> working with floats, there's a more accurate algorithm available to
> > >>>> you.
> > >>>>
> > >>>>>>> math.fsum([7.23, 8.41, 6.15, 2.31, 7.73, 7.77])
> > >>>> 39.6
> > >>>>>>> math.fsum([8.41, 6.15, 2.31, 7.73, 7.77, 7.23])
> > >>>> 39.6
> > >>>>
> > >>>> It seeks to minimize loss to repeated rounding and is, I believe,
> > >>>> independent of data order.
> > >>>
> > >>> Most likely it sorts the data so the smallest values get summed first,
> > >>> and works its way up to the larger values. That way it minimizes the losses
> > >>> that occur when denormalizing a value (to set the exponent equal to that of
> > >>> the next larger value).
> > >>>
> > >> I'm not sure, but that sounds familiar. It doesn't really matter
> > >> though - the docs just say that it is an "accurate floating point
> > >> sum", so the precise algorithm is an implementation detail.
> > >
> > > The docs are quite misleading there, it is not accurate without further qualifications.
> > >
> > > <https://docs.python.org/3.8/library/math.html#math.fsum>
> > > <https://code.activestate.com/recipes/393090/>
> > >
> >
> > https://en.wikipedia.org/wiki/IEEE_754
>
> I believe the definition of "accurate" here is that, if you take all
> of the real numbers represented by those floats, add them all together
> with mathematical accuracy, and then take the nearest representable
> float, that will be the exact value that fsum will return. In other
> words, its accuracy is exactly as good as the final result can be.

It's as good as it can be if the result must fit into a single float.
Actually the algorithm itself maintains an exact result for the sum
internally using a list of floats whose exact sum is the same as that
of the input list. In essence it compresses a large list of floats to
a small list of say 2 or 3 floats while preserving the exact value of
the sum.

Unfortunately fsum does not give any way to access the internal exact
list so using fsum repeatedly suffers the same problems as plain float
arithmetic e.g.:
>>> from math import fsum
>>> x = 10**20
>>> fsum([fsum([1, x]), -x])
0.0
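
For comparison, a sketch that stays exact by accumulating in Fraction
and converting to float only at the very end:

>>> from fractions import Fraction
>>> float(Fraction(1) + Fraction(x) - Fraction(x))
1.0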

--
Oscar
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, Sep 2, 2021 at 2:27 PM Chris Angelico <rosuav@gmail.com> wrote:

> On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
> >
> > Hope Rouselle <hrouselle@jevedi.com> writes:
> >
> > > Just sharing a case of floating-point numbers. Nothing needed to be
> > > solved or to be figured out. Just bringing up conversation.
> > >
> > > (*) An introduction to me
> > >
> > > I don't understand floating-point numbers from the inside out, but I do
> > > know how to work with base 2 and scientific notation. So the idea of
> > > expressing a number as
> > >
> > > mantissa * base^{power}
> > >
> > > is not foreign to me. (If that helps you to perhaps instruct me on
> > > what's going on here.)
> > >
> > > (*) A presentation of the behavior
> > >
> > >>>> import sys
> > >>>> sys.version
> > > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> > > bit (AMD64)]'
> > >
> > >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> > >>>> sum(ls)
> > > 39.599999999999994
> > >
> > >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> > >>>> sum(ls)
> > > 39.60000000000001
> > >
> > > All I did was to take the first number, 7.23, and move it to the last
> > > position in the list. (So we have a violation of the commutativity of
> > > addition.)
> >
> > Suppose these numbers are prices in dollar, never going beyond cents.
> > Would it be safe to multiply each one of them by 100 and therefore work
> > with cents only? For instance
>
> Yes and no. It absolutely *is* safe to always work with cents, but to
> do that, you have to be consistent: ALWAYS work with cents, never with
> floating point dollars.
>
> (Or whatever other unit you choose to use. Most currencies have a
> smallest-normally-used-unit, with other currency units (where present)
> being whole number multiples of that minimal unit. Only in forex do
> you need to concern yourself with fractional cents or fractional yen.)
>
> But multiplying a set of floats by 100 won't necessarily solve your
> problem; you may have already fallen victim to the flaw of assuming
> that the numbers are represented accurately.
>
> > --8<---------------cut here---------------start------------->8---
> > >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> > >>> sum(map(lambda x: int(x*100), ls)) / 100
> > 39.6
> >
> > >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> > >>> sum(map(lambda x: int(x*100), ls)) / 100
> > 39.6
> > --8<---------------cut here---------------end--------------->8---
> >
> > Or multiplication by 100 isn't quite ``safe'' to do with floating-point
> > numbers either? (It worked in this case.)
>
> You're multiplying and then truncating, which risks a round-down
> error. Try adding a half onto them first:
>
> int(x * 100 + 0.5)
>
> But that's still not a perfect guarantee. Far safer would be to
> consider monetary values to be a different type of value, not just a
> raw number. For instance, the value $7.23 could be stored internally
> as the integer 723, but you also know that it's a value in USD, not a
> simple scalar. It makes perfect sense to add USD+USD, it makes perfect
> sense to multiply USD*scalar, but it doesn't make sense to multiply
> USD*USD.
>
> > I suppose that if I multiply it by a power of two, that would be an
> > operation that I can be sure will not bring about any precision loss
> > with floating-point numbers. Do you agree?
>
> Assuming you're nowhere near 2**53, yes, that would be safe. But so
> would multiplying by a power of five. The problem isn't precision loss
> from the multiplication - the problem is that your input numbers
> aren't what you think they are. That number 7.23, for instance, is
> really....
>
> >>> 7.23.as_integer_ratio()
> (2035064081618043, 281474976710656)
>
> ... the rational number 2035064081618043 / 281474976710656, which is
> very close to 7.23, but not exactly so. (The numerator would have to
> be ...8042.88 to be exactly correct.) There is nothing you can do at
> this point to regain the precision, although a bit of multiplication
> and rounding can cheat it and make it appear as if you did.
>
> Floating point is a very useful approximation to real numbers, but
> real numbers aren't the best way to represent financial data. Integers
> are.
>
>
Hmmmmmmm - - - I would suggest that you haven't looked into
taxation yet!
In taxation you get a rational number that MUST be multiplied by
the amount in currency.
The error rate here is stupendous.
Some organizations track each transaction with its taxes rounded.
Then some track using untaxed amounts and then calculate the taxes
on the whole (when you have 2 or 3 or 4 tax rates (dunno about more, but
who knows, there are some seriously tax-loving jurisdictions out there));
the differences between adding amounts and then calculating taxes,
versus calculating taxes on each amount and then adding all items
together, can be 'interesting'.

So financial data MUST be able to handle rational numbers.
(I have been bit by the differences enumerated in the previous!)

Regards
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sat, Sep 4, 2021 at 12:08 AM o1bigtenor <o1bigtenor@gmail.com> wrote:
> Hmmmmmmm - - - I would suggest that you haven't looked into
> taxation yet!
> In taxation you get a rational number that MUST be multiplied by
> the amount in currency.

(You can, of course, multiply a currency amount by any scalar. Just
not by another currency amount.)

> The error rate here is stupendous.
> Some organizations track each transaction with its taxes rounded.
> Then some track using untaxed amounts and then calculate the taxes
> on the whole (when you have 2 or 3 or 4 (dunno about more but
> who knows there are some seriously tax loving jurisdictions out there))
> the differences between adding amounts and then calculating taxes
> and calculating taxes on each amount and then adding all items
> together can have some 'interesting' differences.
>
> So financial data MUST be able to handle rational numbers.
> (I have been bit by the differences enumerated in the previous!)

The worst problem is knowing WHEN to round. Sometimes you have to do
intermediate rounding in order to make something agree with something
else :(

But if you need finer resolution than the cent, I would still
recommend trying to use fixed-point arithmetic. The trouble is
figuring out exactly how much precision you need. Often, 1c precision
is actually sufficient.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2021-09-03 16:13, Chris Angelico wrote:
> On Sat, Sep 4, 2021 at 12:08 AM o1bigtenor <o1bigtenor@gmail.com> wrote:
>> Hmmmmmmm - - - I would suggest that you haven't looked into
>> taxation yet!
>> In taxation you get a rational number that MUST be multiplied by
>> the amount in currency.
>
> (You can, of course, multiply a currency amount by any scalar. Just
> not by another currency amount.)
>
>> The error rate here is stupendous.
>> Some organizations track each transaction with its taxes rounded.
>> Then some track using untaxed amounts and then calculate the taxes
>> on the whole (when you have 2 or 3 or 4 (dunno about more but
>> who knows there are some seriously tax loving jurisdictions out there))
>> the differences between adding amounts and then calculating taxes
>> and calculating taxes on each amount and then adding all items
>> together can have some 'interesting' differences.
>>
>> So financial data MUST be able to handle rational numbers.
>> (I have been bit by the differences enumerated in the previous!)
>
> The worst problem is knowing WHEN to round. Sometimes you have to do
> intermediate rounding in order to make something agree with something
> else :(
>
> But if you need finer resolution than the cent, I would still
> recommend trying to use fixed-point arithmetic. The trouble is
> figuring out exactly how much precision you need. Often, 1c precision
> is actually sufficient.
>
At some point, some finance/legal person has to specify how any
fractional currency should be handled.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Thu, 2 Sep 2021 07:54:27 -0700 (PDT), Julio Di Egidio wrote:
> On Thursday, 2 September 2021 at 16:51:24 UTC+2, Christian Gollwitzer wrote:
>> Am 02.09.21 um 16:49 schrieb Julio Di Egidio:
>> > On Thursday, 2 September 2021 at 16:41:38 UTC+2, Peter Pearson wrote:
>> >> On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
>> >
>> >>> 39.60000000000001
>> >>
>> >> Welcome to the exciting world of roundoff error:
>> >
>> > Welcome to the exiting world of Usenet.
>> >
>> > *Plonk*
>>
>> Pretty harsh, isn't it? He gave a concise example of the same inaccuracy
>> right afterwards.
>
> And I thought you were not seeing my posts...
>
> Given that I have already given a full explanation, you guys, whether you
> realise it or not, are simply adding noise for the usual pub-level
> discussion I must most charitably guess.
>
> Anyway, just my opinion. (EOD.)


Although we are in the world of Usenet, comp.lang.python is by
no means typical of Usenet. This is a positive, helpful, welcoming
community in which "Plonk", "EOD", and "RTFM" (appearing in another
post) are seldom seen, and in which I have never before seen the
suggestion that everybody else should be silent so that the silver
voice of the chosen one can be heard.


--
To email me, substitute nowhere->runbox, invalid->com.
--
https://mail.python.org/mailman/listinfo/python-list
RE: on floating-point numbers [ In reply to ]
What's really going on is that you are printing out more digits than you are entitled to. 39.60000000000001 : 16 significant decimal digits. Sixteen decimal digits require about 54 binary bits (in the mantissa) to represent, at least as I calculate it.

Double precision floating point has 52 bits in the mantissa, plus one assumed due to normalization. So 53 bits.

The actual minor difference in sums that you see is because when you put the largest value first, it makes a difference in the last few bits of the mantissa.

I recommend that you print out double precision values to at most 14 digits. Then you will never see this kind of issue. If you don't like that suggestion, you can create your own floating point representation using a Python integer as the mantissa, so it can grow as large as you have memory to represent the value; and a sign and an exponent. It would be slow, but it could have much more accuracy (if implemented to preserve accuracy).

By the way, this is why banks and other financial institutions use BCD (binary coded decimal). They cannot tolerate sums that have fraction of a cent errors.

I should also point out another float issue: subtractive cancellation. Try 1e14 + 0.1 - 1e14. The result clearly should be 0.1, but it won't be. That's because 0.1 cannot be accurately represented in binary, and it was only represented in the bottom few bits. I just tried it: I got 0.09375 This is not a Python issue. This is a well known issue when using binary floating point. So, when you sum a large array of data, to avoid these issues, you could either
1) sort the data smallest to largest ... may be helpful, but maybe not.
2) Create multiple sums of a few of the values. Next layer: Sum a few of the sums. Top layer: Sum the sum of sums to get the final sum. This is much more likely to work accurately than adding up all the values in one summation except the last, and then adding the last (which could be a relatively small value). A sketch of this idea follows.
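
A rough sketch of option 2, summing in a tree instead of strictly left to right (the function name is mine; the rounding error grows roughly like log2(n) rather than n):

def tree_sum(values):
    # sum pairs, then pairs of pairs, and so on up to the final total
    if len(values) <= 2:
        return sum(values)
    mid = len(values) // 2
    return tree_sum(values[:mid]) + tree_sum(values[mid:])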

--- Joseph S.

-----Original Message-----
From: Hope Rouselle <hrouselle@jevedi.com>
Sent: Thursday, September 2, 2021 9:51 AM
To: python-list@python.org
Subject: on floating-point numbers

Just sharing a case of floating-point numbers. Nothing needed to be solved or to be figured out. Just bringing up conversation.

(*) An introduction to me

I don't understand floating-point numbers from the inside out, but I do know how to work with base 2 and scientific notation. So the idea of expressing a number as

mantissa * base^{power}

is not foreign to me. (If that helps you to perhaps instruct me on what's going on here.)

(*) A presentation of the behavior

>>> import sys
>>> sys.version
'3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'

>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
39.599999999999994

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
39.60000000000001

All I did was to take the first number, 7.23, and move it to the last position in the list. (So we have a violation of the commutativity of
addition.)

Let me try to reduce the example. It's not so easy. Although I could display the violation of commutativity by moving just a single number in the list, I also see that 7.23 commutes with every other number in the list.

(*) My request

I would like to just get some clarity. I guess I need to translate all these numbers into base 2 and perform the addition myself to see the situation coming up?
--
https://mail.python.org/mailman/listinfo/python-list
RE: on floating-point numbers [ In reply to ]
Actually, Python has an fsum function meant to address this issue.

>>> import math
>>> math.fsum([1e14, 1, -1e14])
1.0
>>>

Wow it works.

--- Joseph S.

-----Original Message-----
From: Hope Rouselle <hrouselle@jevedi.com>
Sent: Thursday, September 2, 2021 9:51 AM
To: python-list@python.org
Subject: on floating-point numbers

Just sharing a case of floating-point numbers. Nothing needed to be solved or to be figured out. Just bringing up conversation.

(*) An introduction to me

I don't understand floating-point numbers from the inside out, but I do know how to work with base 2 and scientific notation. So the idea of expressing a number as

mantissa * base^{power}

is not foreign to me. (If that helps you to perhaps instruct me on what's going on here.)

(*) A presentation of the behavior

>>> import sys
>>> sys.version
'3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)]'

>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>> sum(ls)
39.599999999999994

>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>> sum(ls)
39.60000000000001

All I did was to take the first number, 7.23, and move it to the last position in the list. (So we have a violation of the commutativity of
addition.)

Let me try to reduce the example. It's not so easy. Although I could display the violation of commutativity by moving just a single number in the list, I also see that 7.23 commutes with every other number in the list.

(*) My request

I would like to just get some clarity. I guess I need to translate all these numbers into base 2 and perform the addition myself to see the situation coming up?
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Il 03/09/2021 14:45, Chris Angelico ha scritto:
> I believe the definition of "accurate" here is that, if you take all
> of the real numbers represented by those floats, add them all together
> with mathematical accuracy, and then take the nearest representable
> float, that will be the exact value that fsum will return. In other
> words, its accuracy is exactly as good as the final result can be.

Yup, I agree, and that is the reason for the link.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 3/09/21 8:11 pm, Christian Gollwitzer wrote:
> Unless you have special numbers like NaN or signed zeros etc., a+b=b+a
> and a*b=b*a holds also for floats.

The only exception I'm aware of is for NaNs, and it's kind of pedantic:
you can't say that x + NaN == NaN + x, but only because NaNs never
compare equal. You still get a NaN either way, so for all practical
purposes it's commutative.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Christian Gollwitzer <auriocus@gmx.de> writes:

> On 02.09.21 15:51, Hope Rouselle wrote:
>> Just sharing a case of floating-point numbers. Nothing needed to be
>> solved or to be figured out. Just bringing up conversation.
>> (*) An introduction to me
>> I don't understand floating-point numbers from the inside out, but I
>> do
>> know how to work with base 2 and scientific notation. So the idea of
>> expressing a number as
>> mantissa * base^{power}
>> is not foreign to me. (If that helps you to perhaps instruct me on
>> what's going on here.)
>> (*) A presentation of the behavior
>>
>>>>> import sys
>>>>> sys.version
>> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
>> bit (AMD64)]'
>>
>>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>>> sum(ls)
>> 39.599999999999994
>>
>>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>>> sum(ls)
>> 39.60000000000001
>> All I did was to take the first number, 7.23, and move it to the
>> last
>> position in the list. (So we have a violation of the commutativity of
>> addition.)
>
> I believe it is not commutativity, but associativity, that is
> violated.

Shall we take this seriously? (I will disagree, but that doesn't mean I
am not grateful for your post. Quite the contrary.) It in general
violates associativity too, but the example above couldn't be referring
to associativity because the second sum above could not be obtained from
associativity alone. Commutativity is required, applied to five pairs
of numbers. How can I go from

7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77

to

8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?

Perhaps only through various application of commutativity, namely the
ones below. (I omit the parentheses for less typing. I suppose that
does not create much trouble. There is no use of associativity below,
except for the intented omission of parentheses.)

7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
= 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77
= 8.41 + 6.15 + 7.23 + 2.31 + 7.73 + 7.77
= 8.41 + 6.15 + 2.31 + 7.23 + 7.73 + 7.77
= 8.41 + 6.15 + 2.31 + 7.73 + 7.23 + 7.77
= 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.

> Even for floating point, a+b=b+a except for maybe some extreme cases
> like denormalized numbers etc.
>
> But in general (a+b)+c != a+ (b+c)
>
> Consider decimal floating point with 2 digits.
> a=1
> b=c=0.04
>
> Then you get LHS;
> (1 + 0.04) + 0.04 = 1 + 0.04 = 1
>
> RHS:
>
> 1 + (0.04 + 0.04) = 1 + 0.08 = 1.1
>
> Your sum is evaluated like (((a + b) + c) + ....) and hence, if you
> permute the numbers, it can be unequal. If you need better accuracy,
> there is the Kahan summation algorithm and other alternatives:
> https://en.wikipedia.org/wiki/Kahan_summation_algorithm

Thanks!
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Chris Angelico <rosuav@gmail.com> writes:

> On Fri, Sep 3, 2021 at 4:29 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
>>
>> Just sharing a case of floating-point numbers. Nothing needed to be
>> solved or to be figured out. Just bringing up conversation.
>>
>> (*) An introduction to me
>>
>> I don't understand floating-point numbers from the inside out, but I do
>> know how to work with base 2 and scientific notation. So the idea of
>> expressing a number as
>>
>> mantissa * base^{power}
>>
>> is not foreign to me. (If that helps you to perhaps instruct me on
>> what's going on here.)
>>
>> (*) A presentation of the behavior
>>
>> >>> import sys
>> >>> sys.version
>> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
>> bit (AMD64)]'
>>
>> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>> >>> sum(ls)
>> 39.599999999999994
>>
>> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>> >>> sum(ls)
>> 39.60000000000001
>>
>> All I did was to take the first number, 7.23, and move it to the last
>> position in the list. (So we have a violation of the commutativity of
>> addition.)
>
> It's not about the commutativity of any particular pair of operands -
> that's always guaranteed.

Shall we take this seriously? It has to be about the commutativity of
at least one particular pair because it is involved with the
commutavitity of a set of pairs. If various pairs are involved, then at
least one is involved. IOW, it is about the commutativity of some pair
of operands and so it could not be the case that it's not about the
commutativity of any. (Lol. I hope that's not too insubordinate. I
already protested against a claim for associativity in this thread and
now I'm going for the king of the hill, for whom I have always been so
grateful!)

> What you're seeing here is the results of intermediate rounding. Try
> this:

Indeed. Your post here is really great, as usual. It offers a nice
tool for investigation. Thank you so much.

>>>> def sum(stuff):
> ... total = 0
> ... for thing in stuff:
> ... total += thing
> ... print(thing, "-->", total)
> ... return total
> ...
>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>> sum(ls)
> 7.23 --> 7.23
> 8.41 --> 15.64
> 6.15 --> 21.79
> 2.31 --> 24.099999999999998

Alright. Thanks so much for this example. Here's a new puzzle for me.
The REPL makes me think that both 21.79 and 2.31 *are* representable
exactly in Python's floating-point datatype because I see:

>>> 2.31
2.31
>>> 21.79
21.79

When I add them, the result obtained makes me think that the sum is
*not* representable exactly in Python's floating-point number:

>>> 21.79 + 2.31
24.099999999999998

However, when I type 24.10 explicitly, the REPL makes me think that
24.10 *is* representable exactly:

>>> 24.10
24.1

I suppose I cannot trust the appearance of the representation? What's
really going on there? (Perhaps the trouble appears while Python is
computing the sum of the numbers 21.79 and 2.31?) Thanks so much!
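
P.S. Perhaps I can ask Python for the exact stored values to check that
suspicion. A little sketch (Decimal of a float shows the float's exact
value; I omit the long outputs):

from decimal import Decimal
print(Decimal(21.79))         # the exact value stored for 21.79
print(Decimal(2.31))          # the exact value stored for 2.31
print(Decimal(21.79 + 2.31))  # the exact value of the rounded float sum
print(Decimal(24.10))         # the exact value stored for 24.10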
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Chris Angelico <rosuav@gmail.com> writes:

> On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
>>
>> Hope Rouselle <hrouselle@jevedi.com> writes:
>>
>> > Just sharing a case of floating-point numbers. Nothing needed to be
>> > solved or to be figured out. Just bringing up conversation.
>> >
>> > (*) An introduction to me
>> >
>> > I don't understand floating-point numbers from the inside out, but I do
>> > know how to work with base 2 and scientific notation. So the idea of
>> > expressing a number as
>> >
>> > mantissa * base^{power}
>> >
>> > is not foreign to me. (If that helps you to perhaps instruct me on
>> > what's going on here.)
>> >
>> > (*) A presentation of the behavior
>> >
>> >>>> import sys
>> >>>> sys.version
>> > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
>> > bit (AMD64)]'
>> >
>> >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>> >>>> sum(ls)
>> > 39.599999999999994
>> >
>> >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>> >>>> sum(ls)
>> > 39.60000000000001
>> >
>> > All I did was to take the first number, 7.23, and move it to the last
>> > position in the list. (So we have a violation of the commutativity of
>> > addition.)
>>
>> Suppose these numbers are prices in dollar, never going beyond cents.
>> Would it be safe to multiply each one of them by 100 and therefore work
>> with cents only? For instance
>
> Yes and no. It absolutely *is* safe to always work with cents, but to
> do that, you have to be consistent: ALWAYS work with cents, never with
> floating point dollars.
>
> (Or whatever other unit you choose to use. Most currencies have a
> smallest-normally-used-unit, with other currency units (where present)
> being whole number multiples of that minimal unit. Only in forex do
> you need to concern yourself with fractional cents or fractional yen.)
>
> But multiplying a set of floats by 100 won't necessarily solve your
> problem; you may have already fallen victim to the flaw of assuming
> that the numbers are represented accurately.

Hang on a second. I see it's always safe to work with cents, but I'm
only confident to say that when one gives me cents to start with. In
other words, if one gives me integers from the start. (Because then, of
course, I don't even have floats to worry about.) If I'm given 1.17,
say, I am not confident that I could turn this number into 117 by
multiplying it by 100. And that was the question. Can I always
multiply such IEEE 754 dollar amounts by 100?

Considering your last paragraph above, I should say: if one gives me an
accurate floating-point representation, can I assume a multiplication of
it by 100 remains accurately representable in IEEE 754?
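
(A sketch of how I might test that for any given amount, comparing the
float product against exact rational arithmetic; the function name is
my own invention:

from fractions import Fraction

def times_100_is_exact(x):
    # True only when the float x * 100 did not round
    exact = Fraction(*x.as_integer_ratio()) * 100
    return Fraction(*(x * 100).as_integer_ratio()) == exact

print(times_100_is_exact(1.17))

I won't guess the answer here; that's the question.)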

>> --8<---------------cut here---------------start------------->8---
>> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>> >>> sum(map(lambda x: int(x*100), ls)) / 100
>> 39.6
>>
>> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>> >>> sum(map(lambda x: int(x*100), ls)) / 100
>> 39.6
>> --8<---------------cut here---------------end--------------->8---
>>
>> Or multiplication by 100 isn't quite ``safe'' to do with floating-point
>> numbers either? (It worked in this case.)
>
> You're multiplying and then truncating, which risks a round-down
> error. Try adding a half onto them first:
>
> int(x * 100 + 0.5)
>
> But that's still not a perfect guarantee. Far safer would be to
> consider monetary values to be a different type of value, not just a
> raw number. For instance, the value $7.23 could be stored internally
> as the integer 723, but you also know that it's a value in USD, not a
> simple scalar. It makes perfect sense to add USD+USD, it makes perfect
> sense to multiply USD*scalar, but it doesn't make sense to multiply
> USD*USD.

Because of the units? That would be USD squared? (Nice analysis.)

>> I suppose that if I multiply it by a power of two, that would be an
>> operation that I can be sure will not bring about any precision loss
>> with floating-point numbers. Do you agree?
>
> Assuming you're nowhere near 2**53, yes, that would be safe. But so
> would multiplying by a power of five. The problem isn't precision loss
> from the multiplication - the problem is that your input numbers
> aren't what you think they are. That number 7.23, for instance, is
> really....

Hm, I think I see what you're saying. You're saying multiplication and
division in IEEE 754 is perfectly safe --- so long as the numbers you
start with are accurately representable in IEEE 754 and assuming no
overflow or underflow would occur. (Addition and subtraction are not
safe.)

>>>> 7.23.as_integer_ratio()
> (2035064081618043, 281474976710656)
>
> ... the rational number 2035064081618043 / 281474976710656, which is
> very close to 7.23, but not exactly so. (The numerator would have to
> be ...8042.88 to be exactly correct.) There is nothing you can do at
> this point to regain the precision, although a bit of multiplication
> and rounding can cheat it and make it appear as if you did.
>
> Floating point is a very useful approximation to real numbers, but
> real numbers aren't the best way to represent financial data. Integers
> are.

I'm totally persuaded. Thanks.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Julio Di Egidio <julio@diegidio.name> writes:

> On Thursday, 2 September 2021 at 15:52:02 UTC+2, Hope Rouselle wrote:
>
>> I don't understand floating-point numbers from the inside out, but I do
>> know how to work with base 2 and scientific notation. So the idea of
>> expressing a number as
>>
>> mantissa * base^{power}
>
> That's the basic idea, but the actual (ISO) floating-point *encoding*
> is more complicated than that.
>
>> is not foreign to me. (If that helps you to perhaps instruct me on
>> what's going on here.)
>
> This is the "classic":
DAVID GOLDBERG
> What Every Computer Scientist Should Know About Floating-Point Arithmetic
> <http://perso.ens-lyon.fr/jean-michel.muller/goldberg.pdf>
>
> Here is some more introductory stuff:
> <https://en.wikipedia.org/wiki/Floating-point_arithmetic>
> <https://www.phys.uconn.edu/~rozman/Courses/P2200_15F/downloads/floating-point-guide-2015-10-15.pdf>

Rozman's was pretty concise and nice. Thank you.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 9/4/21 9:40 AM, Hope Rouselle wrote:
> Chris Angelico <rosuav@gmail.com> writes:
>
>> On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
>>>
>>> Hope Rouselle <hrouselle@jevedi.com> writes:
>>>
>>>> Just sharing a case of floating-point numbers. Nothing needed to be
>>>> solved or to be figured out. Just bringing up conversation.
>>>>
>>>> (*) An introduction to me
>>>>
>>>> I don't understand floating-point numbers from the inside out, but I do
>>>> know how to work with base 2 and scientific notation. So the idea of
>>>> expressing a number as
>>>>
>>>> mantissa * base^{power}
>>>>
>>>> is not foreign to me. (If that helps you to perhaps instruct me on
>>>> what's going on here.)
>>>>
>>>> (*) A presentation of the behavior
>>>>
>>>>>>> import sys
>>>>>>> sys.version
>>>> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
>>>> bit (AMD64)]'
>>>>
>>>>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>>>>> sum(ls)
>>>> 39.599999999999994
>>>>
>>>>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>>>>> sum(ls)
>>>> 39.60000000000001
>>>>
>>>> All I did was to take the first number, 7.23, and move it to the last
>>>> position in the list. (So we have a violation of the commutativity of
>>>> addition.)
>>>
>>> Suppose these numbers are prices in dollar, never going beyond cents.
>>> Would it be safe to multiply each one of them by 100 and therefore work
>>> with cents only? For instance
>>
>> Yes and no. It absolutely *is* safe to always work with cents, but to
>> do that, you have to be consistent: ALWAYS work with cents, never with
>> floating point dollars.
>>
>> (Or whatever other unit you choose to use. Most currencies have a
>> smallest-normally-used-unit, with other currency units (where present)
>> being whole number multiples of that minimal unit. Only in forex do
>> you need to concern yourself with fractional cents or fractional yen.)
>>
>> But multiplying a set of floats by 100 won't necessarily solve your
>> problem; you may have already fallen victim to the flaw of assuming
>> that the numbers are represented accurately.
>
> Hang on a second. I see it's always safe to work with cents, but I'm
> only confident to say that when one gives me cents to start with. In
> other words, if one gives me integers from the start. (Because then, of
> course, I don't even have floats to worry about.) If I'm given 1.17,
> say, I am not confident that I could turn this number into 117 by
> multiplying it by 100. And that was the question. Can I always
> multiply such IEEE 754 dollar amounts by 100?
>
> Considering your last paragraph above, I should say: if one gives me an
> accurate floating-point representation, can I assume a multiplication of
> it by 100 remains accurately representable in IEEE 754?
>

Multiplication by 100 might not be accurate if the number you are
starting with is close to the limit of precision, because 100 is
1.1001 (binary) x 64, so multiplying by 100 adds about 5 more 'bits' to the
representation of the number. In your case, the numbers are well below
that point.
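
To see both halves of that in the REPL (this is plain IEEE 754
rounding, nothing Python-specific):

>>> float(723) * 100 == 72300          # small numbers: the product is exact
True
>>> x = float(2**53 - 1)               # still exactly representable (53 bits)
>>> x * 100 == (2**53 - 1) * 100       # the product needs more than 53 bits
False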

>>> --8<---------------cut here---------------start------------->8---
>>>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>>>> sum(map(lambda x: int(x*100), ls)) / 100
>>> 39.6
>>>
>>>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>>>> sum(map(lambda x: int(x*100), ls)) / 100
>>> 39.6
>>> --8<---------------cut here---------------end--------------->8---
>>>
>>> Or multiplication by 100 isn't quite ``safe'' to do with floating-point
>>> numbers either? (It worked in this case.)
>>
>> You're multiplying and then truncating, which risks a round-down
>> error. Try adding a half onto them first:
>>
>> int(x * 100 + 0.5)
>>
>> But that's still not a perfect guarantee. Far safer would be to
>> consider monetary values to be a different type of value, not just a
>> raw number. For instance, the value $7.23 could be stored internally
>> as the integer 723, but you also know that it's a value in USD, not a
>> simple scalar. It makes perfect sense to add USD+USD, it makes perfect
>> sense to multiply USD*scalar, but it doesn't make sense to multiply
>> USD*USD.
>
> Because of the units? That would be USD squared? (Nice analysis.)
>
>>> I suppose that if I multiply it by a power of two, that would be an
>>> operation that I can be sure will not bring about any precision loss
>>> with floating-point numbers. Do you agree?
>>
>> Assuming you're nowhere near 2**53, yes, that would be safe. But so
>> would multiplying by a power of five. The problem isn't precision loss
>> from the multiplication - the problem is that your input numbers
>> aren't what you think they are. That number 7.23, for instance, is
>> really....
>
> Hm, I think I see what you're saying. You're saying multiplication and
> division in IEEE 754 is perfectly safe --- so long as the numbers you
> start with are accurately representable in IEEE 754 and assuming no
> overflow or underflow would occur. (Addition and subtraction are not
> safe.)
>

Addition and Subtraction are just as safe, as long as you stay within
the precision limits. Multiplication and division by powers of two are
the safest, not needing to add any precision, until you hit the limits
of the magnitude of numbers that can be expressed.

The problem is that a number like 0.1 isn't precisely represented, so it
ends up using ALL available precision to get the closest value to it so
ANY operations on it run the danger of precision loss.
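
For instance:

>>> 0.5 + 0.25 == 0.75    # every value exactly representable: exact sum
True
>>> 0.1 + 0.2 == 0.3      # none exactly representable: the rounding shows
False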

>>>>> 7.23.as_integer_ratio()
>> (2035064081618043, 281474976710656)
>>
>> ... the rational number 2035064081618043 / 281474976710656, which is
>> very close to 7.23, but not exactly so. (The numerator would have to
>> be ...8042.88 to be exactly correct.) There is nothing you can do at
>> this point to regain the precision, although a bit of multiplication
>> and rounding can cheat it and make it appear as if you did.
>>
>> Floating point is a very useful approximation to real numbers, but
>> real numbers aren't the best way to represent financial data. Integers
>> are.
>
> I'm totally persuaded. Thanks.
>

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Julio Di Egidio <julio@diegidio.name> writes:

> On Thursday, 2 September 2021 at 16:51:24 UTC+2, Christian Gollwitzer wrote:
>> Am 02.09.21 um 16:49 schrieb Julio Di Egidio:
>> > On Thursday, 2 September 2021 at 16:41:38 UTC+2, Peter Pearson wrote:
>> >> On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
>> >
>> >>> 39.60000000000001
>> >>
>> >> Welcome to the exciting world of roundoff error:
>> >
>> > Welcome to the exiting world of Usenet.
>> >
>> > *Plonk*
>>
>> Pretty harsh, isn't it? He gave a concise example of the same inaccuracy
>> right afterwards.
>
> And I thought you were not seeing my posts...
>
> Given that I have already given a full explanation, you guys, that you
> realise it or not, are simply adding noise for the usual pub-level
> discussion I must most charitably guess.
>
> Anyway, just my opinion. (EOD.)

Which is certainly appreciated --- as a rule. Pub-level noise is pretty
much unavoidable in investigation, education. Being wrong is, too,
unavoidable in investigation, education. There is a point we eventually
publish at the most respected journals, but that's a whole other
interval of the time-line. IOW, chill out! :-D (Give us a C-k and meet
us up in the next thread. Oh, my, you're not a Gnus user: you are a
G2/1.0 user. That's pretty scary.)

By the way, how's sci.logic going? I, too, lost my patience there. :-)
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Richard Damon <Richard@Damon-Family.org> writes:

> On 9/4/21 9:40 AM, Hope Rouselle wrote:
>> Chris Angelico <rosuav@gmail.com> writes:
>>
>>> On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
>>>>
>>>> Hope Rouselle <hrouselle@jevedi.com> writes:
>>>>
>>>>> Just sharing a case of floating-point numbers. Nothing needed to be
>>>>> solved or to be figured out. Just bringing up conversation.
>>>>>
>>>>> (*) An introduction to me
>>>>>
>>>>> I don't understand floating-point numbers from the inside out, but I do
>>>>> know how to work with base 2 and scientific notation. So the idea of
>>>>> expressing a number as
>>>>>
>>>>> mantissa * base^{power}
>>>>>
>>>>> is not foreign to me. (If that helps you to perhaps instruct me on
>>>>> what's going on here.)
>>>>>
>>>>> (*) A presentation of the behavior
>>>>>
>>>>>>>> import sys
>>>>>>>> sys.version
>>>>> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
>>>>> bit (AMD64)]'
>>>>>
>>>>>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>>>>>> sum(ls)
>>>>> 39.599999999999994
>>>>>
>>>>>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>>>>>> sum(ls)
>>>>> 39.60000000000001
>>>>>
>>>>> All I did was to take the first number, 7.23, and move it to the last
>>>>> position in the list. (So we have a violation of the commutativity of
>>>>> addition.)
>>>>
>>>> Suppose these numbers are prices in dollar, never going beyond cents.
>>>> Would it be safe to multiply each one of them by 100 and therefore work
>>>> with cents only? For instance
>>>
>>> Yes and no. It absolutely *is* safe to always work with cents, but to
>>> do that, you have to be consistent: ALWAYS work with cents, never with
>>> floating point dollars.
>>>
>>> (Or whatever other unit you choose to use. Most currencies have a
>>> smallest-normally-used-unit, with other currency units (where present)
>>> being whole number multiples of that minimal unit. Only in forex do
>>> you need to concern yourself with fractional cents or fractional yen.)
>>>
>>> But multiplying a set of floats by 100 won't necessarily solve your
>>> problem; you may have already fallen victim to the flaw of assuming
>>> that the numbers are represented accurately.
>>
>> Hang on a second. I see it's always safe to work with cents, but I'm
>> only confident to say that when one gives me cents to start with. In
>> other words, if one gives me integers from the start. (Because then, of
>> course, I don't even have floats to worry about.) If I'm given 1.17,
>> say, I am not confident that I could turn this number into 117 by
>> multiplying it by 100. And that was the question. Can I always
>> multiply such IEEE 754 dollar amounts by 100?
>>
>> Considering your last paragraph above, I should say: if one gives me an
>> accurate floating-point representation, can I assume a multiplication of
>> it by 100 remains accurately representable in IEEE 754?
>>
>
> Multiplication by 100 might not be accurate if the number you are
> starting with is close to the limit of precision, because 100 is
> 1.1001 (binary) x 64, so multiplying by 100 adds about 5 more 'bits' to the
> representation of the number. In your case, the numbers are well below
> that point.

Alright. That's clear now. Thanks so much!

>>>> --8<---------------cut here---------------start------------->8---
>>>>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>>>>> sum(map(lambda x: int(x*100), ls)) / 100
>>>> 39.6
>>>>
>>>>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>>>>> sum(map(lambda x: int(x*100), ls)) / 100
>>>> 39.6
>>>> --8<---------------cut here---------------end--------------->8---
>>>>
>>>> Or multiplication by 100 isn't quite ``safe'' to do with floating-point
>>>> numbers either? (It worked in this case.)
>>>
>>> You're multiplying and then truncating, which risks a round-down
>>> error. Try adding a half onto them first:
>>>
>>> int(x * 100 + 0.5)
>>>
>>> But that's still not a perfect guarantee. Far safer would be to
>>> consider monetary values to be a different type of value, not just a
>>> raw number. For instance, the value $7.23 could be stored internally
>>> as the integer 723, but you also know that it's a value in USD, not a
>>> simple scalar. It makes perfect sense to add USD+USD, it makes perfect
>>> sense to multiply USD*scalar, but it doesn't make sense to multiply
>>> USD*USD.
>>
>> Because of the units? That would be USD squared? (Nice analysis.)
>>
>>>> I suppose that if I multiply it by a power of two, that would be an
>>>> operation that I can be sure will not bring about any precision loss
>>>> with floating-point numbers. Do you agree?
>>>
>>> Assuming you're nowhere near 2**53, yes, that would be safe. But so
>>> would multiplying by a power of five. The problem isn't precision loss
>>> from the multiplication - the problem is that your input numbers
>>> aren't what you think they are. That number 7.23, for instance, is
>>> really....
>>
>> Hm, I think I see what you're saying. You're saying multiplication and
>> division in IEEE 754 is perfectly safe --- so long as the numbers you
>> start with are accurately representable in IEEE 754 and assuming no
>> overflow or underflow would occur. (Addition and subtraction are not
>> safe.)
>>
>
> Addition and Subtraction are just as safe, as long as you stay within
> the precision limits. Multiplication and division by powers of two are
> the safest, not needing to add any precision, until you hit the limits
> of the magnitude of numbers that can be expressed.
>
> The problem is that a number like 0.1 isn't precisely represented, so it
> ends up using ALL available precision to get the closest value to it so
> ANY operations on it run the danger of precision loss.

Got it. That's clear now. It should've been before, but my attention
is that of a beginner, so some extra iterations turn up. As long as the
numbers involved are accurately representable, floating-points have no
other problems. I may, then, conclude that the whole difficulty with
floating-point is nothing but going beyond the reserved space for the
number.

However, I still lack an easy method to detect when a number is not
accurately representable by the floating-point datatype in use. For
instance, 0.1 is not representable accurately in IEEE 754. But I don't
know how to check that

>>> 0.1
0.1 # no clue
>>> 0.1 + 0.1
0.2 # no clue
>>> 0.1 + 0.1 + 0.1
0.30000000000000004 # there is the clue

How can I get clearer and quicker evidence that 0.1 is not accurately
representable --- using the REPL?

I know

0.1 = 1/10 = 1 * 10^-1

and in base 2 that would have to be represented as... Let me calculate
it with my sophisticated skills:

0.1 x 2 = 0.2 --> digit 0, remainder 0.2
0.2 x 2 = 0.4 --> digit 0, remainder 0.4
0.4 x 2 = 0.8 --> digit 0, remainder 0.8
0.8 x 2 = 1.6 --> digit 1, remainder 0.6
0.6 x 2 = 1.2 --> digit 1, remainder 0.2, closing a cycle.

So 0.1 is representable only poorly: 0.000110011001..., with the block
0011 repeating forever.  In other words, 1/10 in base 10 equals
1/2^4 + 1/2^5 + 1/2^8 + 1/2^9 + 1/2^12 + 1/2^13 + ...

The same question in other words --- what's a trivial way for the REPL
to show me such cycles occur?
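
For what it's worth, the doubling procedure above is easy enough to
script; a minimal sketch (the function name is made up):

--8<---------------cut here---------------start------------->8---
def binary_expansion(num, den, max_bits=16):
    """Base-2 digits of num/den, with 0 < num/den < 1, by repeated
    doubling.  Stops as soon as a remainder repeats, i.e. when the
    expansion has closed a cycle."""
    bits, seen = [], set()
    while num and num not in seen and len(bits) < max_bits:
        seen.add(num)
        num *= 2
        bits.append(num // den)   # the next binary digit
        num %= den                # the remainder carried forward
    return bits

# >>> binary_expansion(1, 10)
# [0, 0, 0, 1, 1]   -- the remainder 2 returns here, closing the cycle
--8<---------------cut here---------------end--------------->8---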

>>>>>> 7.23.as_integer_ratio()
>>> (2035064081618043, 281474976710656)

Here's what I did on this case. The REPL is telling me that

7.23 = 2035064081618043/281474976710656

If that were true, then 7.23 * 281474976710656 would have to equal
2035064081618043. So I typed:

>>> 7.23 * 281474976710656
2035064081618043.0

That agrees with the falsehood. I'm getting no evidence of the problem.

When I take control of my life out of the hands of misleading
computers, I calculate the sum by hand --- the partial products of
723 * 281474976710656, to compare with 100 * 2035064081618043:

       844424930131968
 +    5629499534213120
 +  197032483697459200
 =====================
    203506408161804288
=/= 203506408161804300

How can I save the energy spent on manual verification?
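
One way to save that energy is exact rational arithmetic from the
standard library:

>>> from fractions import Fraction
>>> Fraction(7.23) == Fraction(723, 100)   # exact comparison, no hand sums
False
>>> 100 * 2035064081618043 - 723 * 281474976710656
12

The difference of 12 is exactly what the manual sum above found.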

Thanks very much.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Julio Di Egidio <julio@diegidio.name> writes:

[...]

>> I, too, lost my patience there. :-)
>
> As if I didn't know who's trolling...

I never trolled you. When we had our conversations in sci.logic, I was
Boris Dorestand --- you would remember if you have a very good memory.  We
talked for just a few days, I guess. The people trolling you there were
not me. So, though that's no proof, *I* know you didn't know. :-)
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Am 04.09.21 um 14:48 schrieb Hope Rouselle:
> Christian Gollwitzer <auriocus@gmx.de> writes:
>
>> Am 02.09.21 um 15:51 schrieb Hope Rouselle:
>>> Just sharing a case of floating-point numbers. Nothing needed to be
>>> solved or to be figured out. Just bringing up conversation.
>>> (*) An introduction to me
>>> I don't understand floating-point numbers from the inside out, but I
>>> do
>>> know how to work with base 2 and scientific notation. So the idea of
>>> expressing a number as
>>> mantissa * base^{power}
>>> is not foreign to me. (If that helps you to perhaps instruct me on
>>> what's going on here.)
>>> (*) A presentation of the behavior
>>>
>>>>>> import sys
>>>>>> sys.version
>>> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
>>> bit (AMD64)]'
>>>
>>>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>>>> sum(ls)
>>> 39.599999999999994
>>>
>>>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>>>> sum(ls)
>>> 39.60000000000001
>>> All I did was to take the first number, 7.23, and move it to the
>>> last
>>> position in the list. (So we have a violation of the commutativity of
>>> addition.)
>>
>> I believe it is not commutativity, but associativity, that is
>> violated.
>
> Shall we take this seriously? (I will disagree, but that doesn't mean I
> am not grateful for your post. Quite the contrary.) It in general
> violates associativity too, but the example above couldn't be referring
> to associativity because the second sum above could not be obtained from
> associativity alone. Commutativity is required, applied to five pairs
> of numbers. How can I go from
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>
> to
>
> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?
>
> Perhaps only through various application of commutativity, namely the
> ones below. (I omit the parentheses for less typing. I suppose that
> does not create much trouble. There is no use of associativity below,
> except for the intended omission of parentheses.)

With the parens it will become more obvious.

>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
> = 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77

The sum is evaluated as

(((7.23 + 8.41) + 6.15 + ...)

For the first shift, you are correct that commutativity will result in

(((8.41 + 7.23) + 6.15 + ...)

But you can't go in one step to

(((8.41 + 6.15) + 7.23 + ...)

with the commutativity law alone. Instead, a sequence of associativity
and commutativity is required to move the 7.23 out of the first pair of
parentheses.

And what I was trying to say is that the commutative steps *are* equal
in floating-point arithmetic, whereas the associative steps are not.
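
Both claims are quick to check in the REPL:

>>> 0.1 + 0.2 == 0.2 + 0.1                    # a commutative step
True
>>> (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)    # an associative step
False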

Christian

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 5/09/21 2:42 am, Hope Rouselle wrote:
> Here's what I did on this case. The REPL is telling me that
>
> 7.23 = 2035064081618043/281474976710656

If 7.23 were exactly representable, you would have got
723/1000.

Contrast this with something that *is* exactly representable:

>>> 7.875.as_integer_ratio()
(63, 8)

and observe that 7875/1000 == 63/8:

>>> from fractions import Fraction
>>> Fraction(7875,1000)
Fraction(63, 8)

In general, to find out whether a decimal number is exactly
representable in binary, represent it as a ratio of integers
where the denominator is a power of 10, reduce that to lowest
terms, and compare with the result of as_integer_ratio().

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Christian Gollwitzer <auriocus@gmx.de> writes:

> Am 04.09.21 um 14:48 schrieb Hope Rouselle:
>> Christian Gollwitzer <auriocus@gmx.de> writes:
>>
>>> Am 02.09.21 um 15:51 schrieb Hope Rouselle:
>>>> Just sharing a case of floating-point numbers. Nothing needed to be
>>>> solved or to be figured out. Just bringing up conversation.
>>>> (*) An introduction to me
>>>> I don't understand floating-point numbers from the inside out, but I
>>>> do
>>>> know how to work with base 2 and scientific notation. So the idea of
>>>> expressing a number as
>>>> mantissa * base^{power}
>>>> is not foreign to me. (If that helps you to perhaps instruct me on
>>>> what's going on here.)
>>>> (*) A presentation of the behavior
>>>>
>>>>>>> import sys
>>>>>>> sys.version
>>>> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
>>>> bit (AMD64)]'
>>>>
>>>>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
>>>>>>> sum(ls)
>>>> 39.599999999999994
>>>>
>>>>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
>>>>>>> sum(ls)
>>>> 39.60000000000001
>>>> All I did was to take the first number, 7.23, and move it to the
>>>> last
>>>> position in the list. (So we have a violation of the commutativity of
>>>> addition.)
>>>
>>> I believe it is not commutativity, but associativity, that is
>>> violated.
>> Shall we take this seriously? (I will disagree, but that doesn't
>> mean I
>> am not grateful for your post. Quite the contrary.) It in general
>> violates associativity too, but the example above couldn't be referring
>> to associativity because the second sum above could not be obtained from
>> associativity alone. Commutativity is required, applied to five pairs
>> of numbers. How can I go from
>> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>> to
>> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?
>> Perhaps only through various application of commutativity, namely
>> the
>> ones below. (I omit the parentheses for less typing. I suppose that
>> does not create much trouble. There is no use of associativity below,
>> except for the intended omission of parentheses.)
>
> With the parens it will become more obvious.
>
>> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>> = 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77
>
> The sum is evaluated as
>
> (((7.23 + 8.41) + 6.15 + ...)
>
> For the first shift, you are correct that commutativity will result in
>
> (((8.41 + 7.23) + 6.15 + ...)
>
> But you can't go in one step to
>
> (((8.41 + 6.15) + 7.23 + ...)
>
> with the commutativity law alone. Instead, a sequence of
> associativity and commutativity is required to move the 7.23 out of
> the first pair of parentheses.
>
> And what I was trying to say is that the commutative steps *are* equal
> in floating-point arithmetic, whereas the associative steps are not.

Oh, I see it. Very good point! Lesson learned.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Greg Ewing <greg.ewing@canterbury.ac.nz> writes:

> On 5/09/21 2:42 am, Hope Rouselle wrote:
>> Here's what I did on this case. The REPL is telling me that
>> 7.23 = 2035064081618043/281474976710656
>
> If 7.23 were exactly representable, you would have got
> 723/1000.
>
> Contrast this with something that *is* exactly representable:
>
>>>> 7.875.as_integer_ratio()
> (63, 8)
>
> and observe that 7875/1000 == 63/8:
>
>>>> from fractions import Fraction
>>>> Fraction(7875,1000)
> Fraction(63, 8)
>
> In general, to find out whether a decimal number is exactly
> representable in binary, represent it as a ratio of integers
> where the denominator is a power of 10, reduce that to lowest
> terms, and compare with the result of as_integer_ratio().

That makes perfect sense and answers my question. I appreciate it.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Hope Rouselle <hrouselle@jevedi.com> writes:

> Greg Ewing <greg.ewing@canterbury.ac.nz> writes:
>
>> On 5/09/21 2:42 am, Hope Rouselle wrote:
>>> Here's what I did on this case. The REPL is telling me that
>>> 7.23 = 2035064081618043/281474976710656
>>
>> If 7.23 were exactly representable, you would have got
>> 723/1000.
>>
>> Contrast this with something that *is* exactly representable:
>>
>>>>> 7.875.as_integer_ratio()
>> (63, 8)
>>
>> and observe that 7875/1000 == 63/8:
>>
>>>>> from fractions import Fraction
>>>>> Fraction(7875,1000)
>> Fraction(63, 8)
>>
>> In general, to find out whether a decimal number is exactly
>> representable in binary, represent it as a ratio of integers
>> where the denominator is a power of 10, reduce that to lowest
>> terms, and compare with the result of as_integer_ratio().
>
> That makes perfect sense and answers my question. I appreciate it.

Here's my homework in high-precision. Thoughts? Thank you!

--8<---------------cut here---------------start------------->8---
def is_representable(s):
return in_lowest_terms(rat_power_of_10(s)) == float(s).as_integer_ratio()

# >>> is_representable("1.5")
# True
#
# >>> is_representable("0.1")
# False

def rat_power_of_10(s):
"""I assume s is a numeric string in the format <int>.<frac>"""
if "." not in s:
return int(s), 1
integral, fractional = s.split(".")
return int(integral + fractional), 10**(len(fractional))

# >>> rat_power_of_10("72.100")
# (72100, 1000)

def in_lowest_terms(rat):
    from math import gcd
    n, d = rat
    g = gcd(n, d)  # compute the gcd just once
    return n // g, d // g

# >>> in_lowest_terms( (72100, 1000) )
# (721, 10)
--8<---------------cut here---------------end--------------->8---
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sun, Sep 5, 2021 at 12:44 PM Hope Rouselle <hrouselle@jevedi.com> wrote:
>
> Chris Angelico <rosuav@gmail.com> writes:
>
> > On Fri, Sep 3, 2021 at 4:29 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
> >>
> >> Just sharing a case of floating-point numbers. Nothing needed to be
> >> solved or to be figured out. Just bringing up conversation.
> >>
> >> (*) An introduction to me
> >>
> >> I don't understand floating-point numbers from the inside out, but I do
> >> know how to work with base 2 and scientific notation. So the idea of
> >> expressing a number as
> >>
> >> mantissa * base^{power}
> >>
> >> is not foreign to me. (If that helps you to perhaps instruct me on
> >> what's going on here.)
> >>
> >> (*) A presentation of the behavior
> >>
> >> >>> import sys
> >> >>> sys.version
> >> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> >> bit (AMD64)]'
> >>
> >> >>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >> >>> sum(ls)
> >> 39.599999999999994
> >>
> >> >>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >> >>> sum(ls)
> >> 39.60000000000001
> >>
> >> All I did was to take the first number, 7.23, and move it to the last
> >> position in the list. (So we have a violation of the commutativity of
> >> addition.)
> >
> > It's not about the commutativity of any particular pair of operands -
> > that's always guaranteed.
>
> Shall we take this seriously? It has to be about the commutativity of
> at least one particular pair because it is involved with the
> commutavitity of a set of pairs. If various pairs are involved, then at
> least one is involved. IOW, it is about the commutativity of some pair
> of operands and so it could not be the case that it's not about the
> commutativity of any. (Lol. I hope that's not too insubordinate. I
> already protested against a claim for associativity in this thread and
> now I'm going for the king of the hill, for whom I have always been so
> grateful!)

No, that is not the case. Look at the specific pairs of numbers that get added.

ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]

>>> 7.23 + 8.41
15.64
>>> _ + 6.15
21.79
>>> _ + 2.31
24.099999999999998
>>> _ + 7.73
31.83
>>> _ + 7.77
39.599999999999994

And with the other list:

ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]

>>> 8.41 + 6.15
14.56
>>> _ + 2.31
16.87
>>> _ + 7.73
24.6
>>> _ + 7.77
32.370000000000005
>>> _ + 7.23
39.60000000000001

If commutativity is being violated, then there should be some
situation where you could have written "7.73 + _" instead of "_ +
7.73" or equivalent, and gotten a different result. But that is simply
not the case. What you are seeing is NOT commutativity, but the
consequences of internal rounding, which is a matter of associativity.

> Alright. Thanks so much for this example. Here's a new puzzle for me.
> The REPL makes me think that both 21.79 and 2.31 *are* representable
> exactly in Python's floating-point datatype because I see:
>
> >>> 2.31
> 2.31
> >>> 21.79
> 21.79
>
> When I add them, the result obtained makes me think that the sum is
> *not* representable exactly in Python's floating-point number:
>
> >>> 21.79 + 2.31
> 24.099999999999998
>
> However, when I type 24.10 explicitly, the REPL makes me think that
> 24.10 *is* representable exactly:
>
> >>> 24.10
> 24.1
>
> I suppose I cannot trust the appearance of the representation? What's
> really going on there? (Perhaps the trouble appears while Python is
> computing the sum of the numbers 21.79 and 2.31?) Thanks so much!

The representation is a conversion from the internal format into
decimal digits. It is rounded for convenience of display, because you
don't want it to look like this:

>>> print(Fraction(24.10))
3391773469363405/140737488355328

Since that's useless, the repr of a float rounds it to the shortest
plausible number as represented in decimal digits. This has nothing to
do with whether it is exactly representable, and everything to do with
displaying things usefully in as many situations as possible :)
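
You can always ask for more digits than the repr chooses to show:

>>> format(24.10, '.17g')
'24.100000000000001'
>>> 24.10 == 24.099999999999998
False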

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sun, Sep 5, 2021 at 12:48 PM Hope Rouselle <hrouselle@jevedi.com> wrote:
>
> Chris Angelico <rosuav@gmail.com> writes:
>
> > On Fri, Sep 3, 2021 at 4:58 AM Hope Rouselle <hrouselle@jevedi.com> wrote:
> >>
> >> Hope Rouselle <hrouselle@jevedi.com> writes:
> >>
> >> > Just sharing a case of floating-point numbers. Nothing needed to be
> >> > solved or to be figured out. Just bringing up conversation.
> >> >
> >> > (*) An introduction to me
> >> >
> >> > I don't understand floating-point numbers from the inside out, but I do
> >> > know how to work with base 2 and scientific notation. So the idea of
> >> > expressing a number as
> >> >
> >> > mantissa * base^{power}
> >> >
> >> > is not foreign to me. (If that helps you to perhaps instruct me on
> >> > what's going on here.)
> >> >
> >> > (*) A presentation of the behavior
> >> >
> >> >>>> import sys
> >> >>>> sys.version
> >> > '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> >> > bit (AMD64)]'
> >> >
> >> >>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >> >>>> sum(ls)
> >> > 39.599999999999994
> >> >
> >> >>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >> >>>> sum(ls)
> >> > 39.60000000000001
> >> >
> >> > All I did was to take the first number, 7.23, and move it to the last
> >> > position in the list. (So we have a violation of the commutativity of
> >> > addition.)
> >>
> >> Suppose these numbers are prices in dollar, never going beyond cents.
> >> Would it be safe to multiply each one of them by 100 and therefore work
> >> with cents only? For instance
> >
> > Yes and no. It absolutely *is* safe to always work with cents, but to
> > do that, you have to be consistent: ALWAYS work with cents, never with
> > floating point dollars.
> >
> > (Or whatever other unit you choose to use. Most currencies have a
> > smallest-normally-used-unit, with other currency units (where present)
> > being whole number multiples of that minimal unit. Only in forex do
> > you need to concern yourself with fractional cents or fractional yen.)
> >
> > But multiplying a set of floats by 100 won't necessarily solve your
> > problem; you may have already fallen victim to the flaw of assuming
> > that the numbers are represented accurately.
>
> Hang on a second. I see it's always safe to work with cents, but I'm
> only confident to say that when one gives me cents to start with. In
> other words, if one gives me integers from the start. (Because then, of
> course, I don't even have floats to worry about.) If I'm given 1.17,
> say, I am not confident that I could turn this number into 117 by
> multiplying it by 100. And that was the question. Can I always
> multiply such IEEE 754 dollar amounts by 100?
>
> Considering your last paragraph above, I should say: if one gives me an
> accurate floating-point representation, can I assume a multiplication of
> it by 100 remains accurately representable in IEEE 754?

Humans usually won't give you IEEE 754 floats. What they'll usually
give you is a text string. Let's say you ask someone to type in the
prices of various items, the quantities thereof, and the shipping. You
take strings like "1.17" (or "$1.17"), and you parse that into the
integer 117.
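
With the decimal module that parse is a one-liner (note that the
*string* goes into the constructor, never a float):

>>> from decimal import Decimal
>>> int(Decimal("1.17") * 100)
117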

> Hm, I think I see what you're saying. You're saying multiplication and
> division in IEEE 754 is perfectly safe --- so long as the numbers you
> start with are accurately representable in IEEE 754 and assuming no
> overflow or underflow would occur. (Addition and subtraction are not
> safe.)
>

All operations are equally valid. Anything that causes rounding can
cause loss of data, and that can happen with multiplication/division
as well as addition/subtraction. But yes, with the caveats you give,
everything is safe.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sun, Sep 5, 2021 at 12:50 PM Hope Rouselle <hrouselle@jevedi.com> wrote:
>
> Christian Gollwitzer <auriocus@gmx.de> writes:
>
> > Am 02.09.21 um 15:51 schrieb Hope Rouselle:
> >> Just sharing a case of floating-point numbers. Nothing needed to be
> >> solved or to be figured out. Just bringing up conversation.
> >> (*) An introduction to me
> >> I don't understand floating-point numbers from the inside out, but I
> >> do
> >> know how to work with base 2 and scientific notation. So the idea of
> >> expressing a number as
> >> mantissa * base^{power}
> >> is not foreign to me. (If that helps you to perhaps instruct me on
> >> what's going on here.)
> >> (*) A presentation of the behavior
> >>
> >>>>> import sys
> >>>>> sys.version
> >> '3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64
> >> bit (AMD64)]'
> >>
> >>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>>>> sum(ls)
> >> 39.599999999999994
> >>
> >>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>>>> sum(ls)
> >> 39.60000000000001
> >> All I did was to take the first number, 7.23, and move it to the
> >> last
> >> position in the list. (So we have a violation of the commutativity of
> >> addition.)
> >
> > I believe it is not commutativity, but associativity, that is
> > violated.
>
> Shall we take this seriously? (I will disagree, but that doesn't mean I
> am not grateful for your post. Quite the contrary.) It in general
> violates associativity too, but the example above couldn't be referring
> to associativity because the second sum above could not be obtained from
> associativity alone. Commutativity is required, applied to five pairs
> of numbers. How can I go from
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>
> to
>
> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?
>
> Perhaps only through various application of commutativity, namely the
> ones below. (I omit the parentheses for less typing. I suppose that
> does not create much trouble. There is no use of associativity below,
> except for the intended omission of parentheses.)
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
> = 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77
> = 8.41 + 6.15 + 7.23 + 2.31 + 7.73 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.23 + 7.73 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.73 + 7.23 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.
>

Show me the pairs of numbers. You'll find that they are not the same
numbers. Commutativity is specifically that a+b == b+a and you won't
find any situation where that is violated.

As soon as you go to three or more numbers, what you're doing is
changing which numbers get added first, which is this:

a + (b + c) != (a + b) + c

and this can most certainly be violated due to intermediate rounding.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sun, Sep 5, 2021 at 12:55 PM Hope Rouselle <hrouselle@jevedi.com> wrote:
>
> Julio Di Egidio <julio@diegidio.name> writes:
>
> > On Thursday, 2 September 2021 at 16:51:24 UTC+2, Christian Gollwitzer wrote:
> >> Am 02.09.21 um 16:49 schrieb Julio Di Egidio:
> >> > On Thursday, 2 September 2021 at 16:41:38 UTC+2, Peter Pearson wrote:
> >> >> On Thu, 02 Sep 2021 10:51:03 -0300, Hope Rouselle wrote:
> >> >
> >> >>> 39.60000000000001
> >> >>
> >> >> Welcome to the exciting world of roundoff error:
> >> >
> >> > Welcome to the exiting world of Usenet.
> >> >
> >> > *Plonk*
> >>
> >> Pretty harsh, isn't it? He gave a concise example of the same inaccuracy
> >> right afterwards.
> >
> > And I thought you were not seeing my posts...
> >
> > Given that I have already given a full explanation, you guys, that you
> > realise it or not, are simply adding noise for the usual pub-level
> > discussion I must most charitably guess.
> >
> > Anyway, just my opinion. (EOD.)
>
> Which is certainly appreciated --- as a rule. Pub-level noise is pretty
> much unavoidable in investigation, education. Being wrong is, too,
> unavoidable in investigation, education. There is a point we eventually
> publish at the most respected journals, but that's a whole other
> interval of the time-line. IOW, chill out! :-D (Give us a C-k and meet
> us up in the next thread. Oh, my, you're not a Gnus user: you are a
> G2/1.0 user. That's pretty scary.)
>

I'm not a fan of the noise level in a pub, but I have absolutely no
problem with arguing these points out. And everyone (mostly) in this
thread is being respectful. I don't mind when someone else is wrong,
especially since - a lot of the time - I'm wrong too (or maybe I'm the
only one who's wrong).

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sun, Sep 5, 2021 at 12:58 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
>
> On 5/09/21 2:42 am, Hope Rouselle wrote:
> > Here's what I did on this case. The REPL is telling me that
> >
> > 7.23 = 2035064081618043/281474976710656
>
> If 7.23 were exactly representable, you would have got
> 723/1000.
>
> Contrast this with something that *is* exactly representable:
>
> >>> 7.875.as_integer_ratio()
> (63, 8)
>
> and observe that 7875/1000 == 63/8:
>
> >>> from fractions import Fraction
> >>> Fraction(7875,1000)
> Fraction(63, 8)
>
> In general, to find out whether a decimal number is exactly
> representable in binary, represent it as a ratio of integers
> where the denominator is a power of 10, reduce that to lowest
> terms, and compare with the result of as_integer_ratio().
>

Or let Python do that work for you!

>>> from fractions import Fraction
>>> Fraction("7.875") == Fraction(7.875)
True
>>> Fraction("7.8") == Fraction(7.8)
False

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sun, Sep 5, 2021 at 1:04 PM Hope Rouselle <hrouselle@jevedi.com> wrote:
> The same question in other words --- what's a trivial way for the REPL
> to show me such cycles occur?
>
> >>>>>> 7.23.as_integer_ratio()
> >>> (2035064081618043, 281474976710656)
>
> Here's what I did on this case. The REPL is telling me that
>
> 7.23 = 2035064081618043/281474976710656
>
> If that were true, then 7.23 * 281474976710656 would have to equal
> 2035064081618043. So I typed:
>
> >>> 7.23 * 281474976710656
> 2035064081618043.0
>
> That agrees with the falsehood. I'm getting no evidence of the problem.
>
> When take control of my life out of the hands of misleading computers, I
> calculate the sum:
>
> 844424930131968
> + 5629499534213120
> 197032483697459200
> ==================
> 203506408161804288
> =/= 203506408161804300
>
> How I can save the energy spent on manual verification?
>

What you've stumbled upon here is actually a neat elegance of
floating-point, and an often-forgotten fundamental of it: rounding
occurs exactly the same regardless of the scale. The number 7.23 is
represented with a certain mantissa, and multiplying it by some power
of two doesn't change the mantissa, only the exponent. So the rounding
happens exactly the same, and it comes out looking equal!
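
You can watch that in the REPL with math.frexp, which splits a float
into its mantissa and power-of-two exponent:

>>> import math
>>> math.frexp(7.23)                     # (mantissa, exponent)
(0.90375, 3)
>>> math.frexp(7.23 * 281474976710656)   # times 2**48: same mantissa
(0.90375, 51)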

The easiest way, in Python, to probe this sort of thing is to use
either fractions.Fraction or decimal.Decimal. I prefer Fraction, since
a float is fundamentally a rational number, and you can easily see
what's happening. You can construct a Fraction from a string, and
it'll do what you would expect; or you can construct one from a float,
and it'll show you what that float truly represents.

It's often cleanest to print fractions out rather than just dumping
them to the console, since the str() of a fraction looks like a
fraction, but the repr() looks like a constructor call.

>>> Fraction(0.25)
Fraction(1, 4)
>>> Fraction(0.1)
Fraction(3602879701896397, 36028797018963968)

If it looks like the number you put in, it was perfectly
representable. If it looks like something of roughly that many digits,
it's probably not the number you started with.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On Sat, 04 Sep 2021 10:40:35 -0300, Hope Rouselle <hrouselle@jevedi.com>
declaimed the following:

>course, I don't even have floats to worry about.) If I'm given 1.17,
>say, I am not confident that I could turn this number into 117 by
>multiplying it by 100. And that was the question. Can I always
>multiply such IEEE 754 dollar amounts by 100?
>

HOW are you "given" that 1.17? If that is coming from some
user-readable source (keyboard entry, text-based file [CSV/TSV, even SYLK])
you do NOT have a number -- you have a string, which needs to be converted
by some algorithm.

For money, the best solution, again, is to use the Decimal module and
feed the /string/ to the initialization call. If you want to do it
yourself, to get a scaled integer, you will have to code a
parser/converter (a sketch follows the list below).

* strip extraneous punctuation ($, etc -- but not comma, decimal
point, or + and -)
* strip any grouping separators (commas, but beware, some countries
group using a period -- "1.234,56" vs "1,234.56"). "1,234.56" => "1234.56"
* ensure there is a decimal point (again, you may have to convert a
comma to decimal point), if not append a "." to the input
* append enough 0s to the end to ensure you have whatever scale
factor you are using behind the decimal point (as mentioned M$ Excel money
type uses four places) "1234.56" => "1234.5600"
* remove the decimal marker. "1234.5600" => "12345600"
* convert to native integer. int("12345600") => 12345600 [as integer]
{may fail if +/- are not in the required position for Python}
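
Something along these lines, as a rough sketch of the steps above
(parse_money is a made-up name; it assumes '.' as the decimal marker
and ',' for grouping):

def parse_money(text, scale=4):
    s = text.strip().lstrip("$")        # strip extraneous punctuation
    s = s.replace(",", "")              # strip grouping separators
    if "." not in s:                    # ensure there is a decimal point
        s += "."
    whole, frac = s.split(".")
    frac = frac.ljust(scale, "0")       # pad 0s out to the scale factor
    return int(whole + frac)            # drop the marker, convert to int

# parse_money("1,234.56") => 12345600 (four places, like the Excel type)
# parse_money("-7.23") => -72300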

If the number is coming from some binary file format, it is already too
late. And you might need to study any DBMS being used. Some transfer values
as text strings, and reconvert to DBMS numeric types on receipt. Make sure
the DBMS is using a decimal number type and not a floating type for the
field (which leaves out SQLite3 <G> which will store anything in any field,
but uses some slightly obscure logic to determine what conversion is done)


--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2021-09-04 09:48:40 -0300, Hope Rouselle wrote:
> Christian Gollwitzer <auriocus@gmx.de> writes:
> > Am 02.09.21 um 15:51 schrieb Hope Rouselle:
> >>>>> ls = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
> >>>>> sum(ls)
> >> 39.599999999999994
> >>
> >>>>> ls = [8.41, 6.15, 2.31, 7.73, 7.77, 7.23]
> >>>>> sum(ls)
> >> 39.60000000000001
> >> All I did was to take the first number, 7.23, and move it to the
> >> last
> >> position in the list. (So we have a violation of the commutativity of
> >> addition.)
> >
> > I believe it is not commutativity, but associativity, that is
> > violated.

I agree.


> Shall we take this seriously? (I will disagree, but that doesn't mean I
> am not grateful for your post. Quite the contrary.) It in general
> violates associativity too, but the example above couldn't be referring
> to associativity because the second sum above could not be obtained from
> associativity alone. Commutativity is required, applied to five pairs
> of numbers. How can I go from
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>
> to
>
> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?

Simple:

>>> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
39.599999999999994
>>> 7.23 + (8.41 + 6.15 + 2.31 + 7.73 + 7.77)
39.60000000000001

Due to commutativity, this is the same as

>>> (8.41 + 6.15 + 2.31 + 7.73 + 7.77) + 7.23
39.60000000000001

So commutativity is preserved but associativity is lost. (Of course a
single example doesn't prove that this is always the case, but it can be
seen from the guarantees that IEEE-754 arithmetic gives you that this is
actually the case).

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Re: on floating-point numbers [ In reply to ]
On 2021-09-05 03:38:55 +1200, Greg Ewing wrote:
> If 7.23 were exactly representable, you would have got
> 723/1000.
>
> Contrast this with something that *is* exactly representable:
>
> >>> 7.875.as_integer_ratio()
> (63, 8)
>
> and observe that 7875/1000 == 63/8:
>
> >>> from fractions import Fraction
> >>> Fraction(7875,1000)
> Fraction(63, 8)
>
> In general, to find out whether a decimal number is exactly
> representable in binary, represent it as a ratio of integers
> where the denominator is a power of 10, reduce that to lowest
> terms,

... and check if the denominator is a power of two. If it isn't (e.g.
1000 == 2**3 * 5**3) then the number is not exactly representable as a
binary floating point number.

More generally, if the prime factorization of the denominator only
contains prime factors which are also prime factors of your base, then
the number can be exactly represented (unless either the denominator or
the numerator gets too big). So, for base 10 (2*5), all numbers which
have only powers of 2 and 5 in the denominator (e.g. 1/10 == 1/(2*5),
1/8192 == 1/2**13, 1/1024000 == 1/(2**13 * 5**3)) can be represented
exactly, but those with other prime factors (e.g. 1/3, 1/7,
1/24576 == 1/(2**13 * 3), 1/1024001 == 1/(11 * 127 * 733)) cannot.
Similarly, for base 12 (2*2*3) numbers with 2 and 3 in the denominator
can be represented and for base 60 (2*2*3*5), numbers with 2, 3 and 5.
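
That criterion is mechanical enough to script; a sketch (the function
name is invented):

from math import gcd

def finite_in_base(num, den, base):
    # Reduce to lowest terms, then strip every prime factor that the
    # denominator shares with the base; what survives must be 1.
    den //= gcd(num, den)
    g = gcd(den, base)
    while g > 1:
        den //= g
        g = gcd(den, base)
    return den == 1

# finite_in_base(1, 10, 2)   -> False  (1/10 is infinite in base 2)
# finite_in_base(1, 8192, 2) -> True   (1/2**13)
# finite_in_base(1, 3, 60)   -> True   (base 60 contains the factor 3)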

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Re: on floating-point numbers [ In reply to ]
On 2021-09-04 10:01:23 -0400, Richard Damon wrote:
> On 9/4/21 9:40 AM, Hope Rouselle wrote:
> > Hm, I think I see what you're saying. You're saying multiplication and
> > division in IEEE 754 is perfectly safe --- so long as the numbers you
> > start with are accurately representable in IEEE 754 and assuming no
> > overflow or underflow would occur. (Addition and subtraction are not
> > safe.)
> >
>
> Addition and Subtraction are just as safe, as long as you stay within
> the precision limits.

That depends a lot on what you call "safe":

a * b / a will always be very close to b (unless there's an over- or
underflow), but a + b - a can be quite different from b.

In general when analyzing a numerical algorithm you have to pay a lot
more attention to addition and subtraction than to multiplication and
division.
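
For example, picking a power of two so that the multiplication and
division are themselves exact:

>>> a, b = 2.0**52, 3.14
>>> a * b / a == b     # scaling by a power of two loses nothing
True
>>> a + b - a          # most of the 3.14 was rounded away
3.0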

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Re: on floating-point numbers [ In reply to ]
On 2021-09-05, Peter J. Holzer <hjp-python@hjp.at> wrote:
> On 2021-09-05 03:38:55 +1200, Greg Ewing wrote:
>> If 7.23 were exactly representable, you would have got
>> 723/1000.
>>
>> Contrast this with something that *is* exactly representable:
>>
>> >>> 7.875.as_integer_ratio()
>> (63, 8)
>>
>> and observe that 7875/1000 == 63/8:
>>
>> >>> from fractions import Fraction
>> >>> Fraction(7875,1000)
>> Fraction(63, 8)
>>
>> In general, to find out whether a decimal number is exactly
>> representable in binary, represent it as a ratio of integers where
>> the denominator is a power of 10, reduce that to lowest terms,
>
> ... and check if the denominator is a power of two. If it isn't
> (e.g. 1000 == 2**3 * 5**3) then the number is not exactly
> representable as a binary floating point number.
>
> More generally, if the prime factorization of the denominator only
> contains prime factors which are also prime factors of your base,
> then the number can be exactle represented (unless either the
> denominator or the enumerator get too big).

And once you understand that, ignore it and write code under the
assumumption that nothing can be exactly represented in floating
point.

If you like, you can assume that 0 can be exactly represented without
getting into too much trouble as long as it's a literal constant value
and not the result of any run-time FP operations.

If you want to live dangerously, you can assume that integers with
magnitude less than a million can be exactly represented. That
assumption is true for all the FP representations I've ever used, but
once you start depending on it, you're one stumble from the edge of
the cliff.

--
Grant




--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
> On Sep 5, 2021, at 6:22 PM, Peter J. Holzer <hjp-python@hjp.at> wrote:
>
> On 2021-09-04 10:01:23 -0400, Richard Damon wrote:
>>> On 9/4/21 9:40 AM, Hope Rouselle wrote:
>>> Hm, I think I see what you're saying. You're saying multiplication and
>>> division in IEEE 754 is perfectly safe --- so long as the numbers you
>>> start with are accurately representable in IEEE 754 and assuming no
>>> overflow or underflow would occur. (Addition and subtraction are not
>>> safe.)
>>>
>>
>> Addition and Subtraction are just as safe, as long as you stay within
>> the precision limits.
>
> That depends a lot on what you call "safe",
>
> a * b / a will always be very close to b (unless there's an over- or
> underflow), but a + b - a can be quite different from b.
>
> In general when analyzing a numerical algorithm you have to pay a lot
> more attention to addition and subtraction than to multiplication and
> division.
>
> hp
>
> --
Yes, it depends on your definition of safe. If ‘close’ is good enough,
then multiplication is probably safer, as its problems show up only in
more extreme cases. If EXACT is the question, addition tends to be
better. To have any chance, the numbers need to be of somewhat low
‘precision’, which means avoiding arbitrary decimals. Once past that,
as long as the numbers are of roughly the same magnitude, and are the
sort of numbers you are apt to just write, you can add a lot of them
before enough error bits accumulate to cause a problem. With
multiplication, every multiply roughly adds together the operands' bits
of precision, so you quickly run out, and a single divide can end the
process at once.

Remember, the question came up because the sum wasn’t associative
because of fractional bits. That points to thinking of exact
operations, and addition does better at that.
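
You can watch the bits add up by letting exact Fractions referee (a
sketch):

>>> from fractions import Fraction
>>> 0.25 * 0.5 == 0.125                   # few bits each: product exact
True
>>> x = (2**27 + 1) / 2**27               # 28 significant bits, exact
>>> Fraction(x) ** 2 == Fraction(x * x)   # the square needs 55: it rounds
False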
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
"Peter J. Holzer" <hjp-python@hjp.at> writes:

> On 2021-09-05 03:38:55 +1200, Greg Ewing wrote:
>> If 7.23 were exactly representable, you would have got
>> 723/1000.
>>
>> Contrast this with something that *is* exactly representable:
>>
>> >>> 7.875.as_integer_ratio()
>> (63, 8)
>>
>> and observe that 7875/1000 == 63/8:
>>
>> >>> from fractions import Fraction
>> >>> Fraction(7875,1000)
>> Fraction(63, 8)
>>
>> In general, to find out whether a decimal number is exactly
>> representable in binary, represent it as a ratio of integers
>> where the denominator is a power of 10, reduce that to lowest
>> terms,
>
> ... and check if the denominator is a power of two. If it isn't (e.g.
> 1000 == 2**3 * 5**3) then the number is not exactly representable as a
> binary floating point number.
>
> More generally, if the prime factorization of the denominator only
> contains prime factors which are also prime factors of your base, then
> the number can be exactly represented (unless either the denominator or
> the numerator gets too big). So, for base 10 (2*5), all numbers which
> have only powers of 2 and 5 in the denominator (e.g. 1/10 == 1/(2*5),
> 1/8192 == 1/2**13, 1/1024000 == 1/(2**13 * 5**3)) can be represented
> exactly, but those with other prime factors (e.g. 1/3, 1/7,
> 1/24576 == 1/(2**13 * 3), 1/1024001 == 1/(11 * 127 * 733)) cannot.
> Similarly, for base 12 (2*2*3) numbers with 2 and 3 in the denominator
> can be represented and for base 60 (2*2*3*5), numbers with 2, 3 and 5.

Very grateful to these paragraphs. They destroy all the mystery.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Chris Angelico <rosuav@gmail.com> writes:

> On Sun, Sep 5, 2021 at 1:04 PM Hope Rouselle <hrouselle@jevedi.com> wrote:
>> The same question in other words --- what's a trivial way for the REPL
>> to show me such cycles occur?
>>
>> >>>>>> 7.23.as_integer_ratio()
>> >>> (2035064081618043, 281474976710656)
>>
>> Here's what I did on this case. The REPL is telling me that
>>
>> 7.23 = 2035064081618043/281474976710656
>>
>> If that were true, then 7.23 * 281474976710656 would have to equal
>> 2035064081618043. So I typed:
>>
>> >>> 7.23 * 281474976710656
>> 2035064081618043.0
>>
>> That agrees with the falsehood. I'm getting no evidence of the problem.
>>
>> When take control of my life out of the hands of misleading computers, I
>> calculate the sum:
>>
>> 844424930131968
>> + 5629499534213120
>> 197032483697459200
>> ==================
>> 203506408161804288
>> =/= 203506408161804300
>>
>> How I can save the energy spent on manual verification?
>
> What you've stumbled upon here is actually a neat elegance of
> floating-point, and an often-forgotten fundamental of it: rounding
> occurs exactly the same regardless of the scale. The number 7.23 is
> represented with a certain mantissa, and multiplying it by some power
> of two doesn't change the mantissa, only the exponent. So the rounding
> happens exactly the same, and it comes out looking equal!

That's insightful. Thanks!

> The easiest way, in Python, to probe this sort of thing is to use
> either fractions.Fraction or decimal.Decimal. I prefer Fraction, since
> a float is fundamentally a rational number, and you can easily see
> what's happening. You can construct a Fraction from a string, and
> it'll do what you would expect; or you can construct one from a float,
> and it'll show you what that float truly represents.
>
> It's often cleanest to print fractions out rather than just dumping
> them to the console, since the str() of a fraction looks like a
> fraction, but the repr() looks like a constructor call.
>
>>>> Fraction(0.25)
> Fraction(1, 4)
>>>> Fraction(0.1)
> Fraction(3602879701896397, 36028797018963968)
>
> If it looks like the number you put in, it was perfectly
> representable. If it looks like something of roughly that many digits,
> it's probably not the number you started with.

That's pretty, pretty nice. It was really what I was looking for.

--
You're the best ``little lord of local nonsense'' I've ever met! :-D
(Lol. The guy is kinda stressed out! Plonk, plonk, plonk. EOD.)
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Hope Rouselle <hrouselle@jevedi.com> writes:
> Christian Gollwitzer <auriocus@gmx.de> writes:
>>
>> I believe it is not commutativity, but associativity, that is
>> violated.
>
> Shall we take this seriously? (I will disagree, but that doesn't mean I
> am not grateful for your post. Quite the contrary.) It in general
> violates associativity too, but the example above couldn't be referring
> to associativity because the second sum above could not be obtained from
> associativity alone. Commutativity is required, applied to five pairs
> of numbers. How can I go from
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>
> to
>
> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?
>
> Perhaps only through various application of commutativity, namely the
> ones below. (I omit the parentheses for less typing. I suppose that
> does not create much trouble. There is no use of associativity below,
> except for the intended omission of parentheses.)
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
> = 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77
> = 8.41 + 6.15 + 7.23 + 2.31 + 7.73 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.23 + 7.73 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.73 + 7.23 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.

But these transformations depend on both commutativity and
associativity, precisely due to those omitted parentheses. When you
transform

7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77

into

8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.

it isn't just assuming commutativity, it's also assuming associativity
since it is changing from

(7.23 + 8.41 + 6.15 + 2.31 + 7.73) + 7.77

to

(8.41 + 6.15 + 2.31 + 7.73 + 7.77) + 7.23.

If I use parentheses to modify the order of operations of the first line
to match that of the last, I get

7.23 + (8.41 + 6.15 + 2.31 + 7.73 + 7.77)

Now, I get 39.60000000000001 evaluating either of them.
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
Joe Pfeiffer <pfeiffer@cs.nmsu.edu> writes:

> Hope Rouselle <hrouselle@jevedi.com> writes:
>> Christian Gollwitzer <auriocus@gmx.de> writes:
>>>
>>> I believe it is not commutativity, but associativity, that is
>>> violated.
>>
>> Shall we take this seriously? (I will disagree, but that doesn't mean I
>> am not grateful for your post. Quite the contrary.) In general it
>> violates associativity too, but the example above couldn't be referring
>> to associativity because the second sum above could not be obtained from
>> associativity alone. Commutativity is required, applied to five pairs
>> of numbers. How can I go from
>>
>> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>>
>> to
>>
>> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?
>>
>> Perhaps only through various applications of commutativity, namely the
>> ones below. (I omit the parentheses for less typing. I suppose that
>> does not create much trouble. There is no use of associativity below,
>> except for the intended omission of parentheses.)
>>
>> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>> = 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77
>> = 8.41 + 6.15 + 7.23 + 2.31 + 7.73 + 7.77
>> = 8.41 + 6.15 + 2.31 + 7.23 + 7.73 + 7.77
>> = 8.41 + 6.15 + 2.31 + 7.73 + 7.23 + 7.77
>> = 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.
>
> But these transformations depend on both commutativity and
> associativity, precisely due to those omitted parentheses. When you
> transform
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>
> into
>
> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.
>
> it isn't just assuming commutativity, it's also assuming associativity
> since it is changing from
>
> (7.23 + 8.41 + 6.15 + 2.31 + 7.73) + 7.77
>
> to
>
> (8.41 + 6.15 + 2.31 + 7.73 + 7.77) + 7.23.
>
> If I use parentheses to modify the order of operations of the first line
> to match that of the last, I get
>
> 7.23 + (8.41 + 6.15 + 2.31 + 7.73 + 7.77)
>
> Now, I get 39.60000000000001 evaluating either of them.

I need to go slow. If I have just two numbers, then I don't need to
talk about associativity: I can send 7.23 to the rightmost place with a
single application of commutativity. In symbols,

7.23 + 8.41 = 8.41 + 7.23.

But if I have three numbers and I want to send the leftmost to the
rightmost place, I need to apply associativity

7.23 + 8.41 + 6.15
= (7.23 + 8.41) + 6.15 -- clarifying that I go left to right
= 7.23 + (8.41 + 6.15) -- associativity
= (8.41 + 6.15) + 7.23 -- commutativity

I see it. Cool. Thanks.
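
For the record, the whole distinction fits in one REPL session (a
sketch reusing the values reported earlier in the thread):

>>> 7.23 + 8.41 == 8.41 + 7.23                      # commutativity holds
True
>>> a = 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77     # left to right
>>> b = 7.23 + (8.41 + 6.15 + 2.31 + 7.73 + 7.77)   # same order, regrouped
>>> a, b                                            # associativity fails
(39.599999999999994, 39.60000000000001)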
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2021-09-05 22:32:51 -0000, Grant Edwards wrote:
> On 2021-09-05, Peter J. Holzer <hjp-python@hjp.at> wrote:

[on the representability of fractional numbers as floating point
numbers]

> And once you understand that, ignore it and write code under the
> assumption that nothing can be exactly represented in floating
> point.

In almost all cases even the input values aren't exact.


> If you like, you can assume that 0 can be exactly represented without
> getting into too much trouble as long as it's a literal constant value
> and not the result of any run-time FP operations.
>
> If you want to live dangerously, you can assume that integers with
> magnitude less than a million can be exactly represented. That
> assumption is true for all the FP representations I've ever used,

If you know nothing about the FP representation you use, you could do
that (however, there is half-precision (16-bit) floating-point, which
has an even shorter mantissa). But if you are that conservative, you
should be equally conservative with your integers, which probably means
you can't depend on more than 16 bits (±32767).
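
For a float that is an IEEE-754 double, though, the exact-integer range
is enormous: every integer up to 2**53 is exact, and the first failure
sits just above it. A quick check:

>>> 2.0**52 + 1 == 2.0**52   # spacing between doubles here is still 1
False
>>> 2.0**53 + 1 == 2.0**53   # 2**53 + 1 isn't representable; it rounds back
True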

However, we are using Python here, which means we have at least 9
decimal digits of usable mantissa
(https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex
somewhat unhelpfully states that "[f]loating point numbers are usually
implemented using double in C", but refers to
https://docs.python.org/3/library/sys.html#sys.float_info which in turn
refers directly to the DBL_* constants from C99. So DBL_EPSILON is at
most 1E-9; in practice it is almost certainly less than 1E-15).
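
On a typical IEEE-754 build that works out to:

>>> import sys
>>> sys.float_info.epsilon   # the actual DBL_EPSILON
2.220446049250313e-16
>>> sys.float_info.dig       # decimal digits representable faithfully
15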

> but once you start depending on it, you're one stumble from the edge
> of the cliff.

I think this attitude will prevent you from using floating point numbers
when you could, reinventing the wheel, probably badly.

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Re: on floating-point numbers [ In reply to ]
On 2021-09-05 23:21:14 -0400, Richard Damon wrote:
> > On Sep 5, 2021, at 6:22 PM, Peter J. Holzer <hjp-python@hjp.at> wrote:
> > On 2021-09-04 10:01:23 -0400, Richard Damon wrote:
> >>> On 9/4/21 9:40 AM, Hope Rouselle wrote:
> >>> Hm, I think I see what you're saying. You're saying multiplication and
> >>> division in IEEE 754 is perfectly safe --- so long as the numbers you
> >>> start with are accurately representable in IEEE 754 and assuming no
> >>> overflow or underflow would occur. (Addition and subtraction are not
> >>> safe.)
> >>>
> >>
> >> Addition and Subtraction are just as safe, as long as you stay within
> >> the precision limits.
> >
> > That depends a lot on what you call "safe",
> >
> > a * b / a will always be very close to b (unless there's an over- or
> > underflow), but a + b - a can be quite different from b.
> >
> > In general when analyzing a numerical algorithm you have to pay a lot
> > more attention to addition and subtraction than to multiplication and
> > division.
> >
> Yes, it depends on your definition of safe. If ‘close’ is good enough
> then multiplication is probably safer as the problems are in more
> extreme cases. If EXACT is the question, addition tends to be better.
> To have any chance, the numbers need to be somewhat low ‘precision’,
> which means you need to avoid arbitrary decimals.

If you have any "decimals" (i.e. decimal digits to the right of your
decimal point) then the input values won't be exactly representable and
the nearest representation will use all available bits, thus losing some
precision with most additions.

> Once past that, as long as the numbers are of roughly the same
> magnitude, and are the sort of numbers you are apt to just write, you
> can add a lot of them before enough error bits accumulate to cause a
> problem.

But they won't be exact. You may not care about rounding errors in the
tenth digit after the point, but you are only close, not exact. So if
you are fine with a tiny rounding error here, why are you upset about
equally tiny rounding errors on multiplication?
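
The classic way to watch those tiny addition errors pile up:

>>> sum([0.1] * 10)          # ten additions, each rounding a little
0.9999999999999999
>>> sum([0.1] * 10) == 1.0
False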

> With multiplication, every multiply roughly adds the number of bits of
> precision, so you quickly run out, and one divide will have a chance
> to just end the process.

Nope. The relative error stays the same, unlike for addition, where it
can get very large very quickly.
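
A small example (assuming IEEE-754 doubles) that shows both effects at
once:

>>> a, b = 1e16, 7.23
>>> a * b / a   # multiplication/division: relative error stays tiny
7.23
>>> a + b - a   # addition: the absolute rounding error swamps b
8.0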

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Re: on floating-point numbers [ In reply to ]
On Sun, Sep 12, 2021 at 1:07 AM Peter J. Holzer <hjp-python@hjp.at> wrote:
> If you have any "decimals" (i.e. decimal digits to the right of your
> decimal point) then the input values won't be exactly representable and
> the nearest representation will use all available bits, thus losing some
> precision with most additions.

That's an oversimplification, though - numbers like 12345.03125 can be
perfectly accurately represented, since the fractional part is a
(negative) power of two.

The perceived inaccuracy of floating point numbers comes from an
assumption that a string of decimal digits is exact, and the
computer's representation of it is not. If I put this in my code:

ONE_THIRD = 0.33333

then you know full well that it's not accurate, and that's nothing to
do with IEEE floating-point! The confusion comes from the fact that
one fifth (0.2) can be represented precisely in decimal, and not in
binary.
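
Both halves of that are easy to check:

>>> 12345.03125 == 12345 + 1/32   # fractional part is 2**-5, so exact
True
>>> (0.2).as_integer_ratio()      # what 0.2 actually stores in binary
(3602879701896397, 18014398509481984)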

Once you accept that "perfectly representable numbers" aren't
necessarily the ones you expect them to be, 64-bit floats become
adequate for a huge number of tasks. Even 32-bit floats are pretty
reliable for most tasks, although I suspect that there's little reason
to use them now - would be curious to see if there's any performance
benefit from restricting to the smaller format, given that most FPUs
probably have 80-bit or wider internal registers.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2021-09-11, Chris Angelico <rosuav@gmail.com> wrote:

> Once you accept that "perfectly representable numbers" aren't
> necessarily the ones you expect them to be, 64-bit floats become
> adequate for a huge number of tasks. Even 32-bit floats are pretty
> reliable for most tasks, although I suspect that there's little reason
> to use them now - I'd be curious to see if there's any performance
> benefit from restricting to the smaller format, given that most FPUs
> probably have 80-bit or wider internal registers.

Not all CPUs have FPUs. Most of my development time is spent writing
code for processors without FPUs. A soft implementation of 32-bit FP
on a 32-bit processor is way, way faster than one for 64-bit FP. Not to
mention the fact that 32-bit FP data takes up half the memory of
64-bit.

There are probably not many people using Python on 32-bit CPUs w/o FP.
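
Even from within Python (whose float is a double) you can see what the
32-bit format saves and costs, using struct as a stand-in for the
packed storage:

>>> import struct
>>> len(struct.pack('<f', 7.23)), len(struct.pack('<d', 7.23))  # bytes used
(4, 8)
>>> struct.unpack('<f', struct.pack('<f', 7.23))[0]  # 32-bit round trip
7.230000019073486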

--
Grant


--
https://mail.python.org/mailman/listinfo/python-list
Re: on floating-point numbers [ In reply to ]
On 2021-09-12 01:40:12 +1000, Chris Angelico wrote:
> On Sun, Sep 12, 2021 at 1:07 AM Peter J. Holzer <hjp-python@hjp.at> wrote:
> > If you have any "decimals" (i.e. decimal digits to the right of your
> > decimal point) then the input values won't be exactly representable and
> > the nearest representation will use all available bits, thus losing some
> > precision with most additions.
>
> That's an oversimplification, though - numbers like 12345.03125 can be
> perfectly accurately represented, since the fractional part is a
> (negative) power of two.

Yes. I had explained that earlier in this thread.

> The perceived inaccuracy of floating point numbers comes from an
> assumption that a string of decimal digits is exact, and the
> computer's representation of it is not. If I put this in my code:
>
> ONE_THIRD = 0.33333
>
> then you know full well that it's not accurate, and that's nothing to
> do with IEEE floating-point! The confusion comes from the fact that
> one fifth (0.2) can be represented precisely in decimal, and not in
> binary.

Exactly.


> Once you accept that "perfectly representable numbers" aren't
> necessarily the ones you expect them to be, 64-bit floats become
> adequate for a huge number of tasks.

Yep. That's what I was trying to convey.


> Even 32-bit floats are pretty reliable for most tasks, although I
> suspect that there's little reason to use them now - I'd be curious
> to see if there's any performance benefit from restricting to the
> smaller format, given that most FPUs probably have 80-bit or wider
> internal registers.

AFAIK C compilers on 64-bit AMD/Intel architecture don't use the x87 ABI
any more, they use the various vector extensions (SSE, etc.) instead.
Those have hardware support for 64 and 32 bit FP values, so 32 bit are
probably faster, if only because you can cram more of them into a
register. Modern GPUs now have 16-bit FP numbers - those are perfectly
adequate for neural networks and also some graphics tasks and you can
transfer twice as many per memory cycle ...
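
Python can even show how coarse those 16-bit values are; struct has
supported the IEEE binary16 format ('e') since Python 3.6:

>>> import struct
>>> struct.unpack('<e', struct.pack('<e', 7.23))[0]   # nearest binary16
7.23046875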

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"