Mailing List Archive

Re: on floating-point numbers [ In reply to ]
"Peter J. Holzer" <hjp-python@hjp.at> writes:

> On 2021-09-05 03:38:55 +1200, Greg Ewing wrote:
>> If 7.23 were exactly representable, you would have got
>> 723/100.
>>
>> Contrast this with something that *is* exactly representable:
>>
>> >>> 7.875.as_integer_ratio()
>> (63, 8)
>>
>> and observe that 7875/1000 == 63/8:
>>
>> >>> from fractions import Fraction
>> >>> Fraction(7875,1000)
>> Fraction(63, 8)
>>
>> In general, to find out whether a decimal number is exactly
>> representable in binary, represent it as a ratio of integers
>> where the denominator is a power of 10, reduce that to lowest
>> terms,
>
> ... and check if the denominator is a power of two. If it isn't (e.g.
> 1000 == 2**3 * 5**3) then the number is not exactly representable as a
> binary floating point number.
>
> More generally, if the prime factorization of the denominator only
> contains prime factors which are also prime factors of your base, then
> the number can be exactly represented (unless either the denominator or
> the numerator gets too big). So, for base 10 (2*5), all numbers which
> have only powers of 2 and 5 in the denominator (e.g. 1/10 == 1/(2*5),
> 1/8192 == 1/2**13, 1/1024000 == 1/(2**13 * 5**3)) can be represented
> exactly, but those with other prime factors (e.g. 1/3, 1/7,
> 1/24576 == 1/(2**13 * 3), 1/1024001 == 1/(11 * 127 * 733)) cannot.
> Similarly, for base 12 (2*2*3) numbers with 2 and 3 in the denominator
> can be represented and for base 60 (2*2*3*5), numbers with 2, 3 and 5.

I'm very grateful for these paragraphs. They dispel all the mystery.
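
To make that concrete, here is a small sketch of the check in Python (the
helper name exactly_representable is made up for illustration, and it
ignores the size limits mentioned above):

from fractions import Fraction

def exactly_representable(literal, base=2):
    # Fraction("7.23") == Fraction(723, 100), already in lowest terms.
    denom = Fraction(literal).denominator
    # Divide out every prime factor that the base and the denominator share.
    p, b = 2, base
    while b > 1:
        if b % p == 0:
            while b % p == 0:
                b //= p
            while denom % p == 0:
                denom //= p
        p += 1
    # Exactly representable iff only factors of the base were present.
    return denom == 1

print(exactly_representable("7.23"))           # False: a factor of 5**2 remains
print(exactly_representable("7.875"))          # True:  63/8, denominator 2**3
print(exactly_representable("7.23", base=10))  # True:  100 == 2**2 * 5**2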
Re: on floating-point numbers [ In reply to ]
Chris Angelico <rosuav@gmail.com> writes:

> On Sun, Sep 5, 2021 at 1:04 PM Hope Rouselle <hrouselle@jevedi.com> wrote:
>> The same question in other words --- what's a trivial way for the REPL
>> to show me such cycles occur?
>>
>> >>> 7.23.as_integer_ratio()
>> (2035064081618043, 281474976710656)
>>
>> Here's what I did on this case. The REPL is telling me that
>>
>> 7.23 = 2035064081618043/281474976710656
>>
>> If that were true, then 7.23 * 281474976710656 would have to equal
>> 2035064081618043. So I typed:
>>
>> >>> 7.23 * 281474976710656
>> 2035064081618043.0
>>
>> That agrees with the falsehood. I'm getting no evidence of the problem.
>>
>> When I take control of my life out of the hands of misleading computers, I
>> calculate the sum:
>>
>>         844424930131968   (  3 * 281474976710656)
>> +      5629499534213120   ( 20 * 281474976710656)
>> +    197032483697459200   (700 * 281474976710656)
>>     ===================
>>      203506408161804288   (723 * 281474976710656)
>> =/=  203506408161804300   (100 * 2035064081618043)
>>
>> How can I save the energy spent on manual verification?
>
> What you've stumbled upon here is actually a neat elegance of
> floating-point, and an often-forgotten fundamental of it: rounding
> occurs exactly the same regardless of the scale. The number 7.23 is
> represented with a certain mantissa, and multiplying it by some power
> of two doesn't change the mantissa, only the exponent. So the rounding
> happens exactly the same, and it comes out looking equal!

That's insightful. Thanks!
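
A quick way to see that at the REPL is math.frexp, which splits a float into
mantissa and exponent (my own illustration, not from the post above):

import math
m1, e1 = math.frexp(7.23)
m2, e2 = math.frexp(7.23 * 281474976710656)   # 281474976710656 == 2**48
print(m1 == m2, e2 - e1)   # True 48: same mantissa, exponent shifted by 48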

> The easiest way, in Python, to probe this sort of thing is to use
> either fractions.Fraction or decimal.Decimal. I prefer Fraction, since
> a float is fundamentally a rational number, and you can easily see
> what's happening. You can construct a Fraction from a string, and
> it'll do what you would expect; or you can construct one from a float,
> and it'll show you what that float truly represents.
>
> It's often cleanest to print fractions out rather than just dumping
> them to the console, since the str() of a fraction looks like a
> fraction, but the repr() looks like a constructor call.
>
> >>> Fraction(0.25)
> Fraction(1, 4)
> >>> Fraction(0.1)
> Fraction(3602879701896397, 36028797018963968)
>
> If it looks like the number you put in, it was perfectly
> representable. If it looks like something of roughly that many digits,
> it's probably not the number you started with.

That's pretty, pretty nice. It was really what I was looking for.
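
For the record, here is what that looks like with the 7.23 from earlier in
the thread (my own illustration):

from fractions import Fraction
print(Fraction("7.23"))   # 723/100: the number you meant
print(Fraction(7.23))     # 2035064081618043/281474976710656: the float you got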

--
You're the best ``little lord of local nonsense'' I've ever met! :-D
(Lol. The guy is kinda stressed out! Plonk, plonk, plonk. EOD.)
Re: on floating-point numbers [ In reply to ]
Hope Rouselle <hrouselle@jevedi.com> writes:
> Christian Gollwitzer <auriocus@gmx.de> writes:
>>
>> I believe it is not commutativity, but associativity, that is
>> violated.
>
> Shall we take this seriously? (I will disagree, but that doesn't mean I
> am not grateful for your post. Quite the contrary.) In general it
> violates associativity too, but the example above couldn't be referring
> to associativity because the second sum above could not be obtained from
> associativity alone. Commutativity is required, applied to five pairs
> of numbers. How can I go from
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>
> to
>
> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?
>
> Perhaps only through various applications of commutativity, namely the
> ones below. (I omit the parentheses for less typing. I suppose that
> does not create much trouble. There is no use of associativity below,
> except for the intended omission of parentheses.)
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
> = 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77
> = 8.41 + 6.15 + 7.23 + 2.31 + 7.73 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.23 + 7.73 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.73 + 7.23 + 7.77
> = 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.

But these transformations depend on both commutativity and
associativity, precisely due to those omitted parentheses. When you
transform

7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77

into

8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.

it isn't just assuming commutativity, it's also assuming associativity
since it is changing from

(7.23 + 8.41 + 6.15 + 2.31 + 7.73) + 7.77

to

(8.41 + 6.15 + 2.31 + 7.73 + 7.77) + 7.23.

If I use parentheses to modify the order of operations of the first line
to match that of the last, I get
7.23 + (8.41 + 6.15 + 2.31 + 7.73 + 7.77)

Now, I get 39.60000000000001 evaluating either of them.
Re: on floating-point numbers [ In reply to ]
Joe Pfeiffer <pfeiffer@cs.nmsu.edu> writes:

> Hope Rouselle <hrouselle@jevedi.com> writes:
>> Christian Gollwitzer <auriocus@gmx.de> writes:
>>>
>>> I believe it is not commutativity, but associativity, that is
>>> violated.
>>
>> Shall we take this seriously? (I will disagree, but that doesn't mean I
>> am not grateful for your post. Quite the contrary.) In general it
>> violates associativity too, but the example above couldn't be referring
>> to associativity because the second sum above could not be obtained from
>> associativity alone. Commutativity is required, applied to five pairs
>> of numbers. How can I go from
>>
>> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>>
>> to
>>
>> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23?
>>
>> Perhaps only through various applications of commutativity, namely the
>> ones below. (I omit the parentheses for less typing. I suppose that
>> does not create much trouble. There is no use of associativity below,
>> except for the intended omission of parentheses.)
>>
>> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>> = 8.41 + 7.23 + 6.15 + 2.31 + 7.73 + 7.77
>> = 8.41 + 6.15 + 7.23 + 2.31 + 7.73 + 7.77
>> = 8.41 + 6.15 + 2.31 + 7.23 + 7.73 + 7.77
>> = 8.41 + 6.15 + 2.31 + 7.73 + 7.23 + 7.77
>> = 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.
>
> But these transformations depend on both commutativity and
> associativity, precisely due to those omitted parentheses. When you
> transform
>
> 7.23 + 8.41 + 6.15 + 2.31 + 7.73 + 7.77
>
> into
>
> 8.41 + 6.15 + 2.31 + 7.73 + 7.77 + 7.23.
>
> it isn't just assuming commutativity, it's also assuming associativity
> since it is changing from
>
> (7.23 + 8.41 + 6.15 + 2.31 + 7.73) + 7.77
>
> to
>
> (8.41 + 6.15 + 2.31 + 7.73 + 7.77) + 7.23.
>
> If I use parentheses to modify the order of operations of the first line
> to match that of the last, I get
> 7.23 + (8.41 + 6.15 + 2.31 + 7.73 + 7.77)
>
> Now, I get 39.60000000000001 evaluating either of them.

I need to go slow. If I have just two numbers, then I don't need to
talk about associativity: I can send 7.23 to the rightmost place with a
single application of commutativity. In symbols,

7.23 + 8.41 = 8.41 + 7.23.

But if I have three numbers and I want to send the leftmost to the
rightmost place, I need to apply associativity:

7.23 + 8.41 + 6.15
= (7.23 + 8.41) + 6.15 -- clarifying that I go left to right
= 7.23 + (8.41 + 6.15) -- associativity
= (8.41 + 6.15) + 7.23 -- commutativity

I see it. Cool. Thanks.
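
Here is a short way to see that order dependence at the REPL (my own sketch;
the exact digits can vary with the order of the terms):

import math
xs = [7.23, 8.41, 6.15, 2.31, 7.73, 7.77]
print(sum(xs))          # 39.60000000000001 when added left to right
print(sum(sorted(xs)))  # a different order may round to a different last digit
print(math.fsum(xs))    # fsum() adds the floats with a single final rounding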
Re: on floating-point numbers [ In reply to ]
On 2021-09-05 22:32:51 -0000, Grant Edwards wrote:
> On 2021-09-05, Peter J. Holzer <hjp-python@hjp.at> wrote:

[on the representability of fractional numbers as floating point
numbers]

> And once you understand that, ignore it and write code under the
> assumption that nothing can be exactly represented in floating
> point.

In almost all cases even the input values aren't exact.


> If you like, you can assume that 0 can be exactly represented without
> getting into too much trouble as long as it's a literal constant value
> and not the result of any run-time FP operations.
>
> If you want to live dangerously, you can assume that integers with
> magnitude less than a million can be exactly represented. That
> assumption is true for all the FP representations I've ever used,

If you know nothing about the FP representation you use, you could do
that (however, there is half-precision (16-bit) floating-point, which has
an even shorter mantissa). But if you are that conservative, you should
be equally conservative with your integers, which probably means you
can't depend on more than 16 bits (±32767).
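
For a standard IEEE 754 double, the actual cutoff for exact integers is 2**53.
A quick check (assuming the usual double build):

print(float(2**53) == 2**53)          # True
print(float(2**53 + 1) == 2**53 + 1)  # False: 2**53 + 1 rounds to 2**53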

However, we are using Python here, which means we have at least 9 decimal
digits of usable mantissa
(https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex
somewhat unhelpfully states that "[f]loating point numbers are usually
implemented using double in C", but refers to
https://docs.python.org/3/library/sys.html#sys.float_info which in turn
refers directly to the DBL_* constants from C99. So DBL_EPSILON is at
most 1E-9; in practice it is almost certainly less than 1E-15.)
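
On any given build, sys.float_info shows what you actually get (a quick check;
the values in the comments assume the usual IEEE 754 double):

import sys
print(sys.float_info.mant_dig)  # 53 bits of mantissa
print(sys.float_info.dig)       # 15 decimal digits survive a round trip
print(sys.float_info.epsilon)   # about 2.22e-16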

> but once you start depending on it, you're one stumble from the edge
> of the cliff.

I think this attitude will prevent you from using floating-point numbers
when you could, and lead you to reinvent the wheel, probably badly.

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Re: on floating-point numbers [ In reply to ]
On 2021-09-05 23:21:14 -0400, Richard Damon wrote:
> > On Sep 5, 2021, at 6:22 PM, Peter J. Holzer <hjp-python@hjp.at> wrote:
> > On 2021-09-04 10:01:23 -0400, Richard Damon wrote:
> >>> On 9/4/21 9:40 AM, Hope Rouselle wrote:
> >>> Hm, I think I see what you're saying. You're saying multiplication and
> >>> division in IEEE 754 is perfectly safe --- so long as the numbers you
> >>> start with are accurately representable in IEEE 754 and assuming no
> >>> overflow or underflow would occur. (Addition and subtraction are not
> >>> safe.)
> >>>
> >>
> >> Addition and Subtraction are just as safe, as long as you stay within
> >> the precision limits.
> >
> > That depends a lot on what you call "safe",
> >
> > a * b / a will always be very close to b (unless there's an over- or
> > underflow), but a + b - a can be quite different from b.
> >
> > In general when analyzing a numerical algorithm you have to pay a lot
> > more attention to addition and subtraction than to multiplication and
> > division.
> >
> Yes, it depends on your definition of safe. If ‘close’ is good enough
> then multiplication is probably safer as the problems are in more
> extreme cases. If EXACT is the question, addition tends to be better.
> To have any chance, the numbers need to be somewhat low ‘precision’,
> which means they need to avoid arbitrary decimals.

If you have any "decimals" (i.e. decimal digits to the right of your
decimal point) then the input values won't be exactly representable and
the nearest representation will use all available bits, thus losing some
precision with most additions.

> Once past that, as long as the numbers are of roughly the same
> magnitude, and are the sort of numbers you are apt to just write, you
> can tend to add a lot of them before you get enough bits to accumulate
> to have a problem.

But they won't be exact. You may not care about rounding errors in the
tenth digit after the point, but you are only close, not exact. So if
you are fine with a tiny rounding error here, why are you upset about
equally tiny rounding errors on multiplication?

> With multiplication, every multiply roughly adds the number of bits of
> precision, so you quickly run out, and one divide will have a chance
> to just end the process.

Nope. The relative error stays the same, unlike for addition, where it
can get very large very quickly.
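
A tiny demonstration of the difference (my own sketch):

a = 1e16
b = 7.23
print(a + b - a)   # 8.0: the absolute error is comparable to b itself
print(a * b / a)   # essentially 7.23 again; the relative error stays tiny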

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Re: on floating-point numbers [ In reply to ]
On Sun, Sep 12, 2021 at 1:07 AM Peter J. Holzer <hjp-python@hjp.at> wrote:
> If you have any "decimals" (i.e. decimal digits to the right of your
> decimal point) then the input values won't be exactly representable and
> the nearest representation will use all available bits, thus losing some
> precision with most additions.

That's an oversimplification, though - numbers like 12345.03125 can be
perfectly accurately represented, since the fractional part is a
(negative) power of two.
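
A quick check of that (my own illustration):

from fractions import Fraction
print(Fraction(12345.03125))           # 395041/32: exact, denominator is 2**5
print(12345.03125.as_integer_ratio())  # (395041, 32)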

The perceived inaccuracy of floating point numbers comes from an
assumption that a string of decimal digits is exact, and the
computer's representation of it is not. If I put this in my code:

ONE_THIRD = 0.33333

then you know full well that it's not accurate, and that's nothing to
do with IEEE floating-point! The confusion comes from the fact that
one fifth (0.2) can be represented precisely in decimal, and not in
binary.
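
And the same check for one fifth (my own illustration):

print(0.2.as_integer_ratio())   # (3602879701896397, 18014398509481984), not (1, 5)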

Once you accept that "perfectly representable numbers" aren't
necessarily the ones you expect them to be, 64-bit floats become
adequate for a huge number of tasks. Even 32-bit floats are pretty
reliable for most tasks, although I suspect that there's little reason
to use them now - would be curious to see if there's any performance
benefit from restricting to the smaller format, given that most FPUs
probably have 80-bit or wider internal registers.

ChrisA
Re: on floating-point numbers [ In reply to ]
On 2021-09-11, Chris Angelico <rosuav@gmail.com> wrote:

> Once you accept that "perfectly representable numbers" aren't
> necessarily the ones you expect them to be, 64-bit floats become
> adequate for a huge number of tasks. Even 32-bit floats are pretty
> reliable for most tasks, although I suspect that there's little reason
> to use them now - would be curious to see if there's any performance
> benefit from restricting to the smaller format, given that most FPUs
> probably have 80-bit or wider internal registers.

Not all CPUs have FPUs. Most of my development time is spent writing
code for processors without FPUs. A soft implementation of 32-bit FP
on a 32-bit processor is way, way faster than one for 64-bit FP. Not to
mention the fact that 32-bit FP data takes up half the memory of
64-bit.

There are probably not many people using Python on 32-bit CPUs w/o FP.

--
Grant


Re: on floating-point numbers [ In reply to ]
On 2021-09-12 01:40:12 +1000, Chris Angelico wrote:
> On Sun, Sep 12, 2021 at 1:07 AM Peter J. Holzer <hjp-python@hjp.at> wrote:
> > If you have any "decimals" (i.e. decimal digits to the right of your
> > decimal point) then the input values won't be exactly representable and
> > the nearest representation will use all available bits, thus losing some
> > precision with most additions.
>
> That's an oversimplification, though - numbers like 12345.03125 can be
> perfectly accurately represented, since the fractional part is a
> (negative) power of two.

Yes. I had explained that earlier in this thread.

> The perceived inaccuracy of floating point numbers comes from an
> assumption that a string of decimal digits is exact, and the
> computer's representation of it is not. If I put this in my code:
>
> ONE_THIRD = 0.33333
>
> then you know full well that it's not accurate, and that's nothing to
> do with IEEE floating-point! The confusion comes from the fact that
> one fifth (0.2) can be represented precisely in decimal, and not in
> binary.

Exactly.


> Once you accept that "perfectly representable numbers" aren't
> necessarily the ones you expect them to be, 64-bit floats become
> adequate for a huge number of tasks.

Yep. That's what I was trying to convey.


> Even 32-bit floats are pretty reliable for most tasks, although I
> suspect that there's little reason to use them now - would be curious
> to see if there's any performance benefit from restricting to the
> smaller format, given that most FPUs probably have 80-bit or wider
> internal registers.

AFAIK C compilers on 64-bit AMD/Intel architecture don't use the x87 ABI
any more, they use the various vector extensions (SSE, etc.) instead.
Those have hardware support for 64 and 32 bit FP values, so 32 bit are
probably faster, if only because you can cram more of them into a
register. Modern GPUs now have 16 bit FP numbers - those are perfectly
adequate for neural networks and also some graphics tasks and you can
transfer twice as many per memory cycle ...

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
