Mailing List Archive

Non-deterministic set ordering
I was shocked to discover that when repeatedly running the following
program (condensed from a "real" program) under Python 3.8.3

for p in { ('x','y'), ('y','x') }:
    print(p)

the output was sometimes

('y', 'x')
('x', 'y')

and sometimes

('x', 'y')
('y', 'x')

Can anyone explain why running identical code should result in
traversing a set in a different order?
Thanks
Rob Cliffe
--
https://mail.python.org/mailman/listinfo/python-list
Re: Non-deterministic set ordering [ In reply to ]
On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list <
python-list@python.org> wrote:

> I was shocked to discover that when repeatedly running the following
> program (condensed from a "real" program) under Python 3.8.3
>
> for p in { ('x','y'), ('y','x') }:
> print(p)
>
> the output was sometimes
>
> ('y', 'x')
> ('x', 'y')
>
> and sometimes
>
> ('x', 'y')
> ('y', 'x')
>
> Can anyone explain why running identical code should result in
> traversing a set in a different order?
>

Sets are defined as unordered so that they can be hashed internally to give
O(1) operations for many tasks.

It wouldn't be unreasonable for sets to use a fixed-by-arbitrary ordering
for a given group of set operations, but being unpredictable deters
developers from mistakenly assuming they are ordered.

If you need order, you should use a tuple, list, or something like
https://grantjenks.com/docs/sortedcontainers/sortedset.html
--
https://mail.python.org/mailman/listinfo/python-list
Re: Non-deterministic set ordering [ In reply to ]
On 16/05/2022 04:13, Dan Stromberg wrote:
>
> On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list
> <python-list@python.org> wrote:
>
> I was shocked to discover that when repeatedly running the following
> program (condensed from a "real" program) under Python 3.8.3
>
> for p in { ('x','y'), ('y','x') }:
>      print(p)
>
> the output was sometimes
>
> ('y', 'x')
> ('x', 'y')
>
> and sometimes
>
> ('x', 'y')
> ('y', 'x')
>
> Can anyone explain why running identical code should result in
> traversing a set in a different order?
>
>
> Sets are defined as unordered so that they can be hashed internally to
> give O(1) operations for many tasks.
>
> It wouldn't be unreasonable for sets to use a fixed-by-arbitrary
> ordering for a given group of set operations, but being unpredictable
> deters developers from mistakenly assuming they are ordered.
>
> If you need order, you should use a tuple, list, or something like
> https://grantjenks.com/docs/sortedcontainers/sortedset.html
Thanks, I can work round this behaviour.
But I'm curious: where does the variability come from?  Is it deliberate
(as your answer seems to imply)?  AFAIK the same code within the *same
run* of a program does produce identical results.
Best wishes
Rob Cliffe
--
https://mail.python.org/mailman/listinfo/python-list
Re: Non-deterministic set ordering [ In reply to ]
This may explain it:
https://stackoverflow.com/questions/27522626/hash-function-in-python-3-3-returns-different-results-between-sessions

On Mon, 2022-05-16 at 04:20 +0100, Rob Cliffe via Python-list wrote:
>
>
> On 16/05/2022 04:13, Dan Stromberg wrote:
> >
> > On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list
> > <python-list@python.org> wrote:
> >
> >     I was shocked to discover that when repeatedly running the
> > following
> >     program (condensed from a "real" program) under Python 3.8.3
> >
> >     for p in { ('x','y'), ('y','x') }:
> >          print(p)
> >
> >     the output was sometimes
> >
> >     ('y', 'x')
> >     ('x', 'y')
> >
> >     and sometimes
> >
> >     ('x', 'y')
> >     ('y', 'x')
> >
> >     Can anyone explain why running identical code should result in
> >     traversing a set in a different order?
> >
> >
> > Sets are defined as unordered so that they can be hashed internally
> > to
> > give O(1) operations for many tasks.
> >
> > It wouldn't be unreasonable for sets to use a fixed-by-arbitrary
> > ordering for a given group of set operations, but being
> > unpredictable
> > deters developers from mistakenly assuming they are ordered.
> >
> > If you need order, you should use a tuple, list, or something like
> > https://grantjenks.com/docs/sortedcontainers/sortedset.html
> Thanks, I can work round this behaviour.
> But I'm curious: where does the variability come from?  Is it
> deliberate
> (as your answer seems to imply)?  AFAIK the same code within the
> *same
> run* of a program does produce identical results.
> Best wishes
> Rob Cliffe

--
https://mail.python.org/mailman/listinfo/python-list
Re: Non-deterministic set ordering [ In reply to ]
On 2022-05-16 04:20, Rob Cliffe via Python-list wrote:
>
>
> On 16/05/2022 04:13, Dan Stromberg wrote:
>>
>> On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list
>> <python-list@python.org> wrote:
>>
>> I was shocked to discover that when repeatedly running the following
>> program (condensed from a "real" program) under Python 3.8.3
>>
>> for p in { ('x','y'), ('y','x') }:
>>      print(p)
>>
>> the output was sometimes
>>
>> ('y', 'x')
>> ('x', 'y')
>>
>> and sometimes
>>
>> ('x', 'y')
>> ('y', 'x')
>>
>> Can anyone explain why running identical code should result in
>> traversing a set in a different order?
>>
>>
>> Sets are defined as unordered so that they can be hashed internally to
>> give O(1) operations for many tasks.
>>
>> It wouldn't be unreasonable for sets to use a fixed-by-arbitrary
>> ordering for a given group of set operations, but being unpredictable
>> deters developers from mistakenly assuming they are ordered.
>>
>> If you need order, you should use a tuple, list, or something like
>> https://grantjenks.com/docs/sortedcontainers/sortedset.html
> Thanks, I can work round this behaviour.
> But I'm curious: where does the variability come from?  Is it deliberate
> (as your answer seems to imply)?  AFAIK the same code within the *same
> run* of a program does produce identical results.

Basically, Python uses hash randomisation in order to protect it against
denial-of-service attacks. (Search for "PYTHONHASHSEED" in the docs.)

It also applied to dicts (the code for sets was based on that for
dicts), but dicts now remember their insertion order.
--
https://mail.python.org/mailman/listinfo/python-list
Re: Non-deterministic set ordering [ In reply to ]
Thanks, Paul.  Question answered!
Rob Cliffe

On 16/05/2022 04:36, Paul Bryan wrote:
> This may explain it:
> https://stackoverflow.com/questions/27522626/hash-function-in-python-3-3-returns-different-results-between-sessions
>
> On Mon, 2022-05-16 at 04:20 +0100, Rob Cliffe via Python-list wrote:
>>
>>
>> On 16/05/2022 04:13, Dan Stromberg wrote:
>>>
>>> On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list
>>> <python-list@python.org> wrote:
>>>
>>>     I was shocked to discover that when repeatedly running the following
>>>     program (condensed from a "real" program) under Python 3.8.3
>>>
>>>     for p in { ('x','y'), ('y','x') }:
>>>          print(p)
>>>
>>>     the output was sometimes
>>>
>>>     ('y', 'x')
>>>     ('x', 'y')
>>>
>>>     and sometimes
>>>
>>>     ('x', 'y')
>>>     ('y', 'x')
>>>
>>>     Can anyone explain why running identical code should result in
>>>     traversing a set in a different order?
>>>
>>>
>>> Sets are defined as unordered so that they can be hashed internally to
>>> give O(1) operations for many tasks.
>>>
>>> It wouldn't be unreasonable for sets to use a fixed-by-arbitrary
>>> ordering for a given group of set operations, but being unpredictable
>>> deters developers from mistakenly assuming they are ordered.
>>>
>>> If you need order, you should use a tuple, list, or something like
>>> https://grantjenks.com/docs/sortedcontainers/sortedset.html
>> Thanks, I can work round this behaviour.
>> But I'm curious: where does the variability come from?  Is it deliberate
>> (as your answer seems to imply)?  AFAIK the same code within the *same
>> run* of a program does produce identical results.
>> Best wishes
>> Rob Cliffe
>
--
https://mail.python.org/mailman/listinfo/python-list