Mailing List Archive

PERL_PERTURB_KEYS=2
I have an application that I believe is otherwise deterministic, but had
a heisenbug caused by an unintentional dependency on hash ordering.

Before managing to find the bug by other means, I tried unsuccessfully
to create a reproducible testcase by experimentally setting PERL_HASH_SEED
to $h > 0, and PERL_PERTURB_KEYS to $k in (0, 1 (default), 2).

I found that with $k=1 the application failed randomly for each $h;
with $k=0 it consistently succeeded for each $h; and with $k=2 it
again failed randomly for each $h. That last ("DETERMINISTIC") feels
like a bug.

I tried unsuccessfully to reproduce the issue with a simple test:
$ENV{PERL_PERTURB_KEYS} = 2;
for my $h (1 .. 10) {
    for (1 .. 3) {
        local $ENV{PERL_HASH_SEED} = $h;
        print "$h: ";
        system(q{perl}, q{-e}, q{
            %x = map +($_ => 1), ("a".."z");
            print join("", keys %x), "\n";
        })
    }
}
.. but saw no non-deterministic behaviour there (and in fact $h does
not seem to cause any variation in results either).

Before I dig further to try and cut down the larger application to a
test case, does anyone have suggestions as to:
- what other (non-obvious) non-deterministic behaviour I might have?
- circumstances in which PERL_PERTURB_KEYS=2 might fail to give consistent
behaviour between runs of a program that is deterministic other than
for reliance on hash order?
- how to dump perl internals of a hash at runtime? Devel::Peek does not
appear to show the structure of buckets and chains explicitly, unless
I'm misreading it.
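
On the last question, one possibility (not raised elsewhere in this thread) is the core Hash::Util module, whose bucket_array function reports which keys share a bucket. The following is only a minimal sketch: bucket_layout is a hypothetical helper name, and the Hash::Util documentation warns that the structure returned by bucket_array may change between perl versions.

```perl
use strict;
use warnings;
use Hash::Util qw(bucket_array);

# Hypothetical helper: return a list of [bucket_index, keys...] rows,
# one row for every non-empty bucket of the hash referenced by $hash.
sub bucket_layout {
    my ($hash) = @_;
    my @rows;
    my $index = 0;
    # bucket_array() yields either an integer (a run of that many
    # empty buckets) or an array ref holding the keys in one bucket.
    for my $slot (@{ bucket_array($hash) || [] }) {
        if (ref $slot) {
            push @rows, [ $index++, @$slot ];
        }
        else {
            $index += $slot;
        }
    }
    return \@rows;
}

my %x = map +($_ => 1), ("a" .. "z");
for my $row (@{ bucket_layout(\%x) }) {
    my ($bucket, @keys) = @$row;
    print "bucket $bucket: @keys\n";
}
```

Buckets holding more than one key are the chains; the order of keys within a row is the order reported by bucket_array.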

Thanks in advance for any clues,

Hugo
Re: PERL_PERTURB_KEYS=2
Everybody had to start doing a foreach over sorted keys to get deterministic order.
That happened 3-4 years ago, I think.
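
A minimal sketch of that idiom, for readers who have not met it; stable_pairs is a hypothetical helper name used only for illustration:

```perl
use strict;
use warnings;

# Return "key=value" strings in sorted key order, so the result does
# not depend on the hash seed or on PERL_PERTURB_KEYS.
sub stable_pairs {
    my ($hash) = @_;
    return map { "$_=$hash->{$_}" } sort keys %$hash;
}

my %x = (b => 2, a => 1, c => 3);
print join(" ", stable_pairs(\%x)), "\n";    # prints "a=1 b=2 c=3"
```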

On Mon, Jul 27, 2020 at 6:57 AM <hv@crypt.org> wrote:

> I have an application that I believe is otherwise deterministic, but had
> a heisenbug caused by an unintentional dependency on hash ordering.
>
> Before managing to find the bug by other means, I tried unsuccessfully
> to create a reproducible testcase by experimentally setting PERL_HASH_SEED
> to $h > 0, and PERL_PERTURB_KEYS to $k in (0, 1 (default), 2).
>
> I found that with $k=1 the application failed randomly for each $h;
> with $k=0 it consistently succeeded for each $h; and with $k=2 it
> again failed randomly for each $h. That last ("DETERMINISTIC") feels
> like a bug.
>
> I tried unsuccessfully to reproduce the issue with a simple test:
> $ENV{PERL_PERTURB_KEYS} = 2;
> for my $h (1 .. 10) {
> for (1 .. 3) {
> local $ENV{PERL_HASH_SEED} = $h;
> print "$h: ";
> system(q{perl}, q{-e}, q{
> %x = map +($_ => 1), ("a".."z");
> print join("", keys %x), "\n";
> })
> }
> }
> .. but saw no non-deterministic behaviour there (and in fact $h does
> not seem to cause any variation in results either).
>
> Before I dig further to try and cut down the larger application to a
> test case, does anyone have suggestions as to:
> - what other (non-obvious) non-deterministic behaviour I might have?
> - circumstances in which PERL_PERTURB_KEYS=2 might fail to give consistent
> behaviour between runs of a program that is deterministic other than
> for reliance on hash order?
> - how to dump perl internals of a hash at runtime? Devel::Peek does not
> appear to show the structure of buckets and chains explicitly, unless
> I'm misreading it.
>
> Thanks in advance for any clues,
>
> Hugo
>
Re: PERL_PERTURB_KEYS=2
On Mon, 27 Jul 2020 at 15:57, <hv@crypt.org> wrote:
>
> I have an application that I believe is otherwise deterministic, but had
> a heisenbug caused by an unintentional dependency on hash ordering.
>
> Before managing to find the bug by other means, I tried unsuccessfully
> to create a reproducible testcase by experimentally setting PERL_HASH_SEED
> to $h > 0, and PERL_PERTURB_KEYS to $k in (0, 1 (default), 2).
>
> I found that with $k=1 the application failed randomly for each $h;
> with $k=0 it consistently succeeded for each $h; and with $k=2 it
> again failed randomly for each $h. That last ("DETERMINISTIC") feels
> like a bug.
>
> I tried unsuccessfully to reproduce the issue with a simple test:
> $ENV{PERL_PERTURB_KEYS} = 2;

> for my $h (1 .. 10) {
> for (1 .. 3) {
> local $ENV{PERL_HASH_SEED} = $h;
> print "$h: ";
> system(q{perl}, q{-e}, q{
> %x = map +($_ => 1), ("a".."z");
> print join("", keys %x), "\n";
> })
> }
> }
> .. but saw no non-deterministic behaviour there (and in fact $h does
> not seem to cause any variation in results either).

I do not really understand the question. What I see, below, is what I
expect, 10 triplets of the same thing.

$ perl t.pl
1: ynxpwfzakiceqbjlhutvsodgmr
1: ynxpwfzakiceqbjlhutvsodgmr
1: ynxpwfzakiceqbjlhutvsodgmr
2: xhipulwjymensrbvkotfdzcagq
2: xhipulwjymensrbvkotfdzcagq
2: xhipulwjymensrbvkotfdzcagq
3: koxvyjzlhitdanpfqcgurswebm
3: koxvyjzlhitdanpfqcgurswebm
3: koxvyjzlhitdanpfqcgurswebm
4: esjbtnlaifpxgryvmowcqukdhz
4: esjbtnlaifpxgryvmowcqukdhz
4: esjbtnlaifpxgryvmowcqukdhz
5: ksdveiytjquhrozfxmbcwglnpa
5: ksdveiytjquhrozfxmbcwglnpa
5: ksdveiytjquhrozfxmbcwglnpa
6: hasownfcejpymxrdukvzbgtliq
6: hasownfcejpymxrdukvzbgtliq
6: hasownfcejpymxrdukvzbgtliq
7: yaodewbtcxqhjgmipukvlsnrfz
7: yaodewbtcxqhjgmipukvlsnrfz
7: yaodewbtcxqhjgmipukvlsnrfz
8: ycbktxleufsqhzamjovnipdgwr
8: ycbktxleufsqhzamjovnipdgwr
8: ycbktxleufsqhzamjovnipdgwr
9: idsgnalxtqobrhmcuypjfzvwke
9: idsgnalxtqobrhmcuypjfzvwke
9: idsgnalxtqobrhmcuypjfzvwke
10: ynxpwfzakiceqbjlhutvsodgmr
10: ynxpwfzakiceqbjlhutvsodgmr
10: ynxpwfzakiceqbjlhutvsodgmr

and if I run it twice I see the same thing

$ perl t.pl > t.out1
$ perl t.pl > t.out2
$ diff t.out1 t.out2
$

You should see the same. What do you see instead and what do you expect to see?

BTW, PERL_PERTURB_KEYS = 2 does not stop the perturbing, it makes the
perturbing deterministic; when PERL_PERTURB_KEYS = 1 it mixes in data
which can vary from run to run. PERL_PERTURB_KEYS = 0 stops the perturbing
entirely.

Also PERL_PERTURB_KEYS is orthogonal to PERL_HASH_SEED. And yes,
setting the seed changes the order, which is why there are 10 sets of
the same order.

I think what you want is to do PERL_HASH_SEED=0 which is magic, and
sets PERL_PERTURB_KEYS=0 and sets the seed to a standard default, so
from the point of view of the hash engine it is totally deterministic.
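
The settings described above can be compared with a small driver. This is a sketch only: child_key_order is a hypothetical helper, it runs the child under the same binary as the parent via $^X, and it assumes a perl of 5.18 or later, where perlrun documents PERL_PERTURB_KEYS.

```perl
use strict;
use warnings;

# Code run by each child: print the keys "a" .. "z" in hash order.
my $child = q{
    my %x = map +($_ => 1), ("a" .. "z");
    print join("", keys %x);
};

# Hypothetical helper: run the child under the same perl binary ($^X)
# with the given seed and perturb mode, and return what it printed.
sub child_key_order {
    my ($seed, $perturb) = @_;
    local $ENV{PERL_HASH_SEED}    = $seed;
    local $ENV{PERL_PERTURB_KEYS} = $perturb;
    open(my $fh, '-|', $^X, '-e', $child) or die "cannot run $^X: $!";
    my $out = do { local $/; <$fh> };
    close $fh;
    return $out;
}

# Mode 0 (no perturbing), mode 2 (deterministic perturbing) and
# mode 1 (random perturbing), each run twice with the same seed.
for my $mode (0, 2, 1) {
    print "seed 1, mode $mode: ", child_key_order(1, $mode), "\n" for 1 .. 2;
}
```

With modes 0 and 2 the two lines for a given seed are expected to match; with mode 1 they need not.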

Yves




--
perl -Mre=debug -e "/just|another|perl|hacker/"
Re: PERL_PERTURB_KEYS=2
demerphq <demerphq@gmail.com> wrote:
:On Mon, 27 Jul 2020 at 15:57, <hv@crypt.org> wrote:
:>
:> I have an application that I believe is otherwise deterministic, but had
:> a heisenbug caused by an unintentional dependency on hash ordering.
:>
:> Before managing to find the bug by other means, I tried unsuccessfully
:> to create a reproducible testcase by experimentally setting PERL_HASH_SEED
:> to $h > 0, and PERL_PERTURB_KEYS to $k in (0, 1 (default), 2).
:>
:> I found that with $k=1 the application failed randomly for each $h;
:> with $k=0 it consistently succeeded for each $h; and with $k=2 it
:> again failed randomly for each $h. That last ("DETERMINISTIC") feels
:> like a bug.
:>
:> I tried unsuccessfully to reproduce the issue with a simple test:
:> $ENV{PERL_PERTURB_KEYS} = 2;
:
:> for my $h (1 .. 10) {
:> for (1 .. 3) {
:> local $ENV{PERL_HASH_SEED} = $h;
:> print "$h: ";
:> system(q{perl}, q{-e}, q{
:> %x = map +($_ => 1), ("a".."z");
:> print join("", keys %x), "\n";
:> })
:> }
:> }
:> .. but saw no non-deterministic behaviour there (and in fact $h does
:> not seem to cause any variation in results either).
:
:I do not really understand the question. What I see, below, is what I
:expect, 10 triplets of the same thing.
:
:$ perl t.pl
:1: ynxpwfzakiceqbjlhutvsodgmr
:1: ynxpwfzakiceqbjlhutvsodgmr
:1: ynxpwfzakiceqbjlhutvsodgmr
:2: xhipulwjymensrbvkotfdzcagq
[...]

Apologies if I've been unclear.

With 5.32 I see all 10 triplets being the same as each other:
1: fmpdvjxiezlghkrotnyasuqwbc
1: fmpdvjxiezlghkrotnyasuqwbc
1: fmpdvjxiezlghkrotnyasuqwbc
2: fmpdvjxiezlghkrotnyasuqwbc
2: fmpdvjxiezlghkrotnyasuqwbc
2: fmpdvjxiezlghkrotnyasuqwbc
3: fmpdvjxiezlghkrotnyasuqwbc
3: fmpdvjxiezlghkrotnyasuqwbc
3: fmpdvjxiezlghkrotnyasuqwbc
4: fmpdvjxiezlghkrotnyasuqwbc
4: fmpdvjxiezlghkrotnyasuqwbc
4: fmpdvjxiezlghkrotnyasuqwbc
5: fmpdvjxiezlghkrotnyasuqwbc
5: fmpdvjxiezlghkrotnyasuqwbc
5: fmpdvjxiezlghkrotnyasuqwbc
6: fmpdvjxiezlghkrotnyasuqwbc
6: fmpdvjxiezlghkrotnyasuqwbc
6: fmpdvjxiezlghkrotnyasuqwbc
7: fmpdvjxiezlghkrotnyasuqwbc
7: fmpdvjxiezlghkrotnyasuqwbc
7: fmpdvjxiezlghkrotnyasuqwbc
8: fmpdvjxiezlghkrotnyasuqwbc
8: fmpdvjxiezlghkrotnyasuqwbc
8: fmpdvjxiezlghkrotnyasuqwbc
9: fmpdvjxiezlghkrotnyasuqwbc
9: fmpdvjxiezlghkrotnyasuqwbc
9: fmpdvjxiezlghkrotnyasuqwbc
10: fmpdvjxiezlghkrotnyasuqwbc
10: fmpdvjxiezlghkrotnyasuqwbc
10: fmpdvjxiezlghkrotnyasuqwbc

:You should see the same. What do you see instead and what do you expect to see?

I expected either to see the same as you (PERL_PERTURB_KEYS=2 working as
documented) or variation within triplets (reproducing the non-determinism
I appear to get with my larger application).

:BTW, PERL_PERTURB_KEYS = 2 does not stop the perturbing, it makes the
:perturbing deterministic; when PERL_PERTURB_KEYS = 1 it mixes in data
:which can vary from run to run. PERL_PERTURB_KEYS = 0 stops the perturbing
:entirely.
:
:Also PERL_PERTURB_KEYS is orthogonal to PERL_HASH_SEED. And yes,
:setting the seed changes the order, which is why there are 10 sets of
:the same order.
:
:I think what you want is to do PERL_HASH_SEED=0 which is magic, and
:sets PERL_PERTURB_KEYS=0 and sets the seed to a standard default, so
:from the point of view of the hash engine it is totally deterministic.

What I _was_ trying to do was deterministically to reproduce my heisenbug.
But I found the bug by other means (examining all uses of C<keys> and
C<values> in my application), so that's no longer the issue.

PERL_PERTURB_KEYS=0 did not help me, because for all hash seeds it failed
to reproduce the bug: the test case passed every time.

I expect PERL_PERTURB_KEYS=2 to give me deterministic behaviour for a
given hash seed, but in my larger application I fail to see that - the
bug remains a heisenbug.

I'm in the process of trying to cut it down to the point I can either
see what I've done wrong or post a real test case here, but it'll likely
take a few days.

Hugo
Re: PERL_PERTURB_KEYS=2
demerphq <demerphq@gmail.com> wrote:
:Can you tell me what PERL_HASH_SEED_DEBUG=1 reports for you?
:
:I have a feeling your perl is built with hash seed randomization disabled.

% PERL_HASH_SEED_DEBUG=1 /opt/v5.32.0-d/bin/perl -e 1
HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x986401ef56ab149158c74ee4062846bd8b8504d92918d2beaf137d90 PERTURB_KEYS = 1 (RANDOM)
% /opt/v5.32.0-d/bin/perl -V
Summary of my perl5 (revision 5 version 32 subversion 0) configuration:
Commit id: 0cf01644e9e0460386db76c3546d69b15e0806df
Platform:
osname=linux
osvers=5.3.0-51-generic
archname=x86_64-linux
uname='linux zen2 5.3.0-51-generic #44~18.04.2-ubuntu smp thu apr 23 14:27:18 utc 2020 x86_64 x86_64 x86_64 gnulinux '
config_args='-des -Dcc=gcc -Dprefix=/opt/v5.32.0-d -Doptimize=-g -O6 -DDEBUGGING -Dusedevel -Uversiononly'
hint=recommended
useposix=true
d_sigaction=define
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='gcc'
ccflags ='-fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
optimize='-g -O6'
cppflags='-fwrapv -DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='7.5.0'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='gcc'
ldflags =' -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=libc-2.27.so
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version='2.27'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E'
cccdlflags='-fPIC'
lddlflags='-shared -g -O6 -L/usr/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl):
Compile-time options:
DEBUGGING
HAS_TIMES
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
PERL_USE_DEVEL
USE_64_BIT_ALL
USE_64_BIT_INT
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
Built under linux
Compiled at Jun 25 2020 16:54:58
@INC:
/opt/v5.32.0-d/lib/perl5/site_perl/5.32.0/x86_64-linux
/opt/v5.32.0-d/lib/perl5/site_perl/5.32.0
/opt/v5.32.0-d/lib/perl5/5.32.0/x86_64-linux
/opt/v5.32.0-d/lib/perl5/5.32.0
%

Hugo
Re: PERL_PERTURB_KEYS=2
Can you tell me what PERL_HASH_SEED_DEBUG=1 reports for you?

I have a feeling your perl is built with hash seed randomization disabled.

Yves

On Tue, 28 Jul 2020, 22:09 , <hv@crypt.org> wrote:

> demerphq <demerphq@gmail.com> wrote:
> :On Mon, 27 Jul 2020 at 15:57, <hv@crypt.org> wrote:
> :>
> :> I have an application that I believe is otherwise deterministic, but had
> :> a heisenbug caused by an unintentional dependency on hash ordering.
> :>
> :> Before managing to find the bug by other means, I tried unsuccessfully
> :> to create a reproducible testcase by experimentally setting
> PERL_HASH_SEED
> :> to $h > 0, and PERL_PERTURB_KEYS to $k in (0, 1 (default), 2).
> :>
> :> I found that with $k=1 the application failed randomly for each $h;
> :> with $k=0 it consistently succeeded for each $h; and with $k=2 it
> :> again failed randomly for each $h. That last ("DETERMINISTIC") feels
> :> like a bug.
> :>
> :> I tried unsuccessfully to reproduce the issue with a simple test:
> :> $ENV{PERL_PERTURB_KEYS} = 2;
> :
> :> for my $h (1 .. 10) {
> :> for (1 .. 3) {
> :> local $ENV{PERL_HASH_SEED} = $h;
> :> print "$h: ";
> :> system(q{perl}, q{-e}, q{
> :> %x = map +($_ => 1), ("a".."z");
> :> print join("", keys %x), "\n";
> :> })
> :> }
> :> }
> :> .. but saw no non-deterministic behaviour there (and in fact $h does
> :> not seem to cause any variation in results either).
> :
> :I do not really understand the question. What I see, below, is what I
> :expect, 10 triplets of the same thing.
> :
> :$ perl t.pl
> :1: ynxpwfzakiceqbjlhutvsodgmr
> :1: ynxpwfzakiceqbjlhutvsodgmr
> :1: ynxpwfzakiceqbjlhutvsodgmr
> :2: xhipulwjymensrbvkotfdzcagq
> [...]
>
> Apologies if I've been unclear.
>
> With 5.32 I see all 10 triplets being the same as each other:
> 1: fmpdvjxiezlghkrotnyasuqwbc
> 1: fmpdvjxiezlghkrotnyasuqwbc
> 1: fmpdvjxiezlghkrotnyasuqwbc
> 2: fmpdvjxiezlghkrotnyasuqwbc
> 2: fmpdvjxiezlghkrotnyasuqwbc
> 2: fmpdvjxiezlghkrotnyasuqwbc
> 3: fmpdvjxiezlghkrotnyasuqwbc
> 3: fmpdvjxiezlghkrotnyasuqwbc
> 3: fmpdvjxiezlghkrotnyasuqwbc
> 4: fmpdvjxiezlghkrotnyasuqwbc
> 4: fmpdvjxiezlghkrotnyasuqwbc
> 4: fmpdvjxiezlghkrotnyasuqwbc
> 5: fmpdvjxiezlghkrotnyasuqwbc
> 5: fmpdvjxiezlghkrotnyasuqwbc
> 5: fmpdvjxiezlghkrotnyasuqwbc
> 6: fmpdvjxiezlghkrotnyasuqwbc
> 6: fmpdvjxiezlghkrotnyasuqwbc
> 6: fmpdvjxiezlghkrotnyasuqwbc
> 7: fmpdvjxiezlghkrotnyasuqwbc
> 7: fmpdvjxiezlghkrotnyasuqwbc
> 7: fmpdvjxiezlghkrotnyasuqwbc
> 8: fmpdvjxiezlghkrotnyasuqwbc
> 8: fmpdvjxiezlghkrotnyasuqwbc
> 8: fmpdvjxiezlghkrotnyasuqwbc
> 9: fmpdvjxiezlghkrotnyasuqwbc
> 9: fmpdvjxiezlghkrotnyasuqwbc
> 9: fmpdvjxiezlghkrotnyasuqwbc
> 10: fmpdvjxiezlghkrotnyasuqwbc
> 10: fmpdvjxiezlghkrotnyasuqwbc
> 10: fmpdvjxiezlghkrotnyasuqwbc
>
> :You should see the same. What do you see instead and what do you expect
> to see?
>
> I expected either to see the same as you (PERL_PERTURB_KEYS=2 working as
> documented) or variation within triplets (reproducing the non-determinism
> I appear to get with my larger application.
>
> :BTW, PERL_PERTURB_KEYS = 2 does not stop the perturbing, it makes the
> :perturbing deterministic; when PERL_PERTURB_KEYS = 1 it mixes in data
> :which can vary from run to run. PERL_PERTURB_KEYS = 0 stops the perturbing
> :entirely.
> :
> :Also PERL_PERTURB_KEYS is orthogonal to PERL_HASH_SEED. And yes,
> :setting the seed changes the order, which is why there are 10 sets of
> :the same order.
> :
> :I think what you want is to do PERL_HASH_SEED=0 which is magic, and
> :sets PERL_PERTURB_KEYS=0 and sets the seed to a standard default, so
> :from the point of view of the hash engine it is totally deterministic.
>
> What I _was_ trying to do was deterministically to reproduce my heisenbug.
> But I found the bug by other means (examining all uses of C<keys> and
> C<values> in my application), so that's no longer the issue.
>
> PERL_PERTURB_KEYS=0 did not help me, because for all hash keys that failed
> to reproduce the bug - the test case passed every time.
>
> I expect PERL_PERTURB_KEYS=2 to give me deterministic behaviour for a
> given hash seed, but in my larger application I fail to see that - the
> bug remains a heisenbug.
>
> I'm in the process of trying to cut it down to the point I can either
> see what I've done wrong or post a real test case here, but it'll likely
> take a few days.
>
> Hugo
>
Re: PERL_PERTURB_KEYS=2
On Wed, 29 Jul 2020 at 00:00, <hv@crypt.org> wrote:
>
> demerphq <demerphq@gmail.com> wrote:
> :Can you tell me what PERL_HASH_SEED_DEBUG=1 reports for you?
> :
> :I have a feeling your perl is built with hash seed randomization disabled.
>
> % PERL_HASH_SEED_DEBUG=1 /opt/v5.32.0-d/bin/perl -e 1
> HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x986401ef56ab149158c74ee4062846bd8b8504d92918d2beaf137d90 PERTURB_KEYS = 1 (RANDOM)

Ok, I get it now. Your perl is /opt/v5.32.0-d/bin/perl. But your
script references the system perl in your $PATH. I think if you change
your test script to use $^X instead of q{perl} you will get the
desired results. I suggest you have the inner perl print $], and
enable PERL_HASH_SEED_DEBUG=1 when you run the test script so you can
see details of both the outer and inner perl separately. My guess is
you have an old perl, maybe pre hash randomization in your system path
and you simply aren't testing the perl you think you are.
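
A sketch of what such an augmented test script might look like. It is an illustration of the suggestion above rather than the script actually used later in this thread, and run_triplets is a hypothetical name:

```perl
use strict;
use warnings;

$| = 1;    # keep the "$h: " prefix ahead of the child's output

# Code run by each child: report its own version, then the key order.
my $child = q{
    my %x = map +($_ => 1), ("a" .. "z");
    print $], ":", join("", keys %x), "\n";
};

# Hypothetical wrapper around the original loop. $^X is the perl that
# is running this script, so parent and child use the same binary.
sub run_triplets {
    my (@seeds) = @_;
    local $ENV{PERL_PERTURB_KEYS}    = 2;
    local $ENV{PERL_HASH_SEED_DEBUG} = 1;    # each child reports its seed on STDERR
    print "this is $^X version $]\n";
    my $failures = 0;
    for my $h (@seeds) {
        for (1 .. 3) {
            local $ENV{PERL_HASH_SEED} = $h;
            print "$h: ";
            $failures++ if system($^X, '-e', $child) != 0;
        }
    }
    return $failures;    # number of children that did not exit cleanly
}

run_triplets(1 .. 10);
```

Running the script itself with PERL_HASH_SEED_DEBUG=1 in the environment additionally shows the outer perl's settings.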


cheers,
Yves
Re: PERL_PERTURB_KEYS=2
demerphq <demerphq@gmail.com> wrote:
:On Wed, 29 Jul 2020 at 00:00, <hv@crypt.org> wrote:
:>
:> demerphq <demerphq@gmail.com> wrote:
:> :Can you tell me what PERL_HASH_SEED_DEBUG=1 reports for you?
:> :
:> :I have a feeling your perl is built with hash seed randomization disabled.
:>
:> % PERL_HASH_SEED_DEBUG=1 /opt/v5.32.0-d/bin/perl -e 1
:> HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x986401ef56ab149158c74ee4062846bd8b8504d92918d2beaf137d90 PERTURB_KEYS = 1 (RANDOM)
:
:Ok, I get it now. Your perl is /opt/v5.32.0-d/bin/perl. But your
:script references the system perl in your $PATH.

Ah no, the script I used referenced the same perl explicitly; I reduced
that to 'perl' to share it in the email.

Just to confirm I ran it using $^X instead, and again got 30 copies of
'fmpdvjxiezlghkrotnyasuqwbc'. I get the same result (30 copies of some
string, not necessarily the same one) with several other locally installed
perls:

% PERL_HASH_SEED_DEBUG=1 /opt/v5.28.1/bin/perl -e 1
HASH_FUNCTION = SBOX32_WITH_STATDX HASH_SEED = 0x5afc389d7cd92b2d24cf55677e77eabe7f13d65bea502c5734b45a5b PERTURB_KEYS = 1 (RANDOM)
% PERL_HASH_SEED_DEBUG=1 /opt/v5.30.0-d/bin/perl -e 1
HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x8f9a8d9d90e3a4c7e2c7aee1296da214c6a69686f38126244ab41750 PERTURB_KEYS = 1 (RANDOM)
% PERL_HASH_SEED_DEBUG=1 /opt/v5.31.10-d/bin/perl -e 1
HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x3f531474e73fd3e0e083390f9893801ef1072cf2a5512da51961da41 PERTURB_KEYS = 1 (RANDOM)
% PERL_HASH_SEED_DEBUG=1 /opt/v5.32.0-d/bin/perl -e 1
HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x252cfa4768d343caaa2ea7a313eab33b4f8bab52e7b4993d44fd0bf1 PERTURB_KEYS = 1 (RANDOM)

System perl is 5.26.1 with 67 registered patches (Ubuntu 18.04.4), and
that gives 10 triples similar to your output:

% PERL_HASH_SEED_DEBUG=1 perl -e 1
HASH_FUNCTION = HYBRID_OAATHU_SIPHASH_1_3 HASH_SEED = 0x645353b08bcb9b82a48ecf87816cffb48a58faeaa1c191c8 PERTURB_KEYS = 1 (RANDOM)

This all feels like a distraction though: it all focuses on some
unexpectedly deterministic results, when the bit I was asking about
related to the unexpectedly _non-deterministic_ results.

Hugo
Re: PERL_PERTURB_KEYS=2
On Wed, 29 Jul 2020 at 12:30, <hv@crypt.org> wrote:
>
> demerphq <demerphq@gmail.com> wrote:
> :On Wed, 29 Jul 2020 at 00:00, <hv@crypt.org> wrote:
> :>
> :> demerphq <demerphq@gmail.com> wrote:
> :> :Can you tell me what PERL_HASH_SEED_DEBUG=1 reports for you?
> :> :
> :> :I have a feeling your perl is built with hash seed randomization disabled.
> :>
> :> % PERL_HASH_SEED_DEBUG=1 /opt/v5.32.0-d/bin/perl -e 1
> :> HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x986401ef56ab149158c74ee4062846bd8b8504d92918d2beaf137d90 PERTURB_KEYS = 1 (RANDOM)
> :
> :Ok, I get it now. Your perl is /opt/v5.32.0-d/bin/perl. But your
> :script references the system perl in your $PATH.
>
> Ah no, the script I used referenced the same perl explicitly; I reduced
> that to 'perl' to share it in the email.
>
> Just to confirm I ran it using $^X instead, and again got 30 copies of
> 'fmpdvjxiezlghkrotnyasuqwbc'. I get the same result (30 copies of some
> string, not necessarily the same one) with several other locally installed
> perls:
>
> % PERL_HASH_SEED_DEBUG=1 /opt/v5.28.1/bin/perl -e 1
> HASH_FUNCTION = SBOX32_WITH_STATDX HASH_SEED = 0x5afc389d7cd92b2d24cf55677e77eabe7f13d65bea502c5734b45a5b PERTURB_KEYS = 1 (RANDOM)
> % PERL_HASH_SEED_DEBUG=1 /opt/v5.30.0-d/bin/perl -e 1
> HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x8f9a8d9d90e3a4c7e2c7aee1296da214c6a69686f38126244ab41750 PERTURB_KEYS = 1 (RANDOM)
> % PERL_HASH_SEED_DEBUG=1 /opt/v5.31.10-d/bin/perl -e 1
> HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x3f531474e73fd3e0e083390f9893801ef1072cf2a5512da51961da41 PERTURB_KEYS = 1 (RANDOM)
> % PERL_HASH_SEED_DEBUG=1 /opt/v5.32.0-d/bin/perl -e 1
> HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x252cfa4768d343caaa2ea7a313eab33b4f8bab52e7b4993d44fd0bf1 PERTURB_KEYS = 1 (RANDOM)

This is what I see with an augmented version of your script:

$ PERL_HASH_SEED_DEBUG=1 ./perl t.pl
HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0xbb6f06afee7a49c0fbbd6ab110b92e3fa9231333cf0f356c1921c8ba
PERTURB_KEYS = 1 (RANDOM)
this is /git_tree/perl/perl version 5.033001
1: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x10000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkgjlzixevmfpdcubwqastrony
1: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x10000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkgjlzixevmfpdcubwqastrony
1: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x10000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkgjlzixevmfpdcubwqastrony
2: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x20000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:ghkexizljvdpfmcqwbusaynrot
2: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x20000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:ghkexizljvdpfmcqwbusaynrot
2: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x20000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:ghkexizljvdpfmcqwbusaynrot
3: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x30000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:ghkxiezljvpdfmcqwbusaynrot
3: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x30000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:ghkxiezljvpdfmcqwbusaynrot
3: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x30000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:ghkxiezljvpdfmcqwbusaynrot
4: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x40000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkglzexijvdpmfcbqwusayntro
4: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x40000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkglzexijvdpmfcbqwusayntro
4: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x40000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkglzexijvdpmfcbqwusayntro
5: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x50000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkglzxiejvpdmfcbqwusayntro
5: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x50000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkglzxiejvpdmfcbqwusayntro
5: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x50000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkglzxiejvpdmfcbqwusayntro
6: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x60000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:fmdpvjexizlghkrotnyasuqwbc
6: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x60000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:fmdpvjexizlghkrotnyasuqwbc
6: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x60000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:fmdpvjexizlghkrotnyasuqwbc
7: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x70000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:fmpdvjxiezlghkrotnyasuqwbc
7: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x70000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:fmpdvjxiezlghkrotnyasuqwbc
7: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x70000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:fmpdvjxiezlghkrotnyasuqwbc
8: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x80000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:mfdpvjlzexihkgtronyasubqwc
8: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x80000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:mfdpvjlzexihkgtronyasubqwc
8: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x80000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:mfdpvjlzexihkgtronyasubqwc
9: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x90000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:mfpdvjlzxiehkgtronyasubqwc
9: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x90000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:mfpdvjlzxiehkgtronyasubqwc
9: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x90000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:mfpdvjlzxiehkgtronyasubqwc
10: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x10000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkgjlzixevmfpdcubwqastrony
10: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x10000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkgjlzixevmfpdcubwqastrony
10: HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED =
0x10000000000000000000000000000000000000000000000000000000
PERTURB_KEYS = 2 (DETERMINISTIC)
5.033001:hkgjlzixevmfpdcubwqastrony

For each of these seeds the SBOX is being initialized from a seed with
barely any bits set, and the seeds differ from each other in very few bits.
This produces SBOXes whose first rows are relatively similar to each other,
and as the keys are only one byte their hash values are constructed from
only one row of the SBOX table.
Additionally the hash table only uses 5 bits of the hash, which
magnifies the chance that there will be a similar order between the
different seeds.

You can add keys(%hash)=10000 and you will see the full hash for each
value is different for each seed. It is only the low bits that are
similar for these similar seeds.
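
The debug output above also shows why seeds 1 and 10 give identical results: perlrun documents PERL_HASH_SEED as a hexadecimal string that is treated as if suffixed with zeros, so both are reported as 0x1000.... A small sketch of that padding follows; padded_seed is a hypothetical helper, and the width of 56 hex digits is simply read off the HASH_SEED values printed in this thread for SBOX32_WITH_STADTX builds.

```perl
use strict;
use warnings;

# Hypothetical helper mimicking how a short PERL_HASH_SEED value is
# extended: zeros are appended on the right up to the full seed width.
sub padded_seed {
    my ($seed, $hex_digits) = @_;
    $seed =~ s/^0x//i;
    my $pad = $hex_digits - length $seed;
    $pad = 0 if $pad < 0;
    return "0x" . $seed . ("0" x $pad);
}

print "$_ -> ", padded_seed($_, 56), "\n" for 1, 10, 2;
```

Seeds 1 and 10 come out as the same string, while 2 differs from them only in its first hex digit.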

> System perl is 5.26.1 with 67 registered patches (Ubuntu 18.04.4), and
> that gives 10 triples similar to your output:
>
> % PERL_HASH_SEED_DEBUG=1 perl -e 1
> HASH_FUNCTION = HYBRID_OAATHU_SIPHASH_1_3 HASH_SEED = 0x645353b08bcb9b82a48ecf87816cffb48a58faeaa1c191c8 PERTURB_KEYS = 1 (RANDOM)
>
> This all feels like a distraction though: it all focuses on some
> unexpectedly deterministic results, when the bit I was asking about
> related to the unexpectedly _non-deterministic_ results.

Did you post the code that was unexpectedly non-deterministic? Maybe I
got confused.

Yves


--
perl -Mre=debug -e "/just|another|perl|hacker/"
Re: PERL_PERTURB_KEYS=2
demerphq <demerphq@gmail.com> wrote:
:On Wed, 29 Jul 2020 at 12:30, <hv@crypt.org> wrote:
:>
:> demerphq <demerphq@gmail.com> wrote:
:> :On Wed, 29 Jul 2020 at 00:00, <hv@crypt.org> wrote:
:> :>
:> :> demerphq <demerphq@gmail.com> wrote:
:> :> :Can you tell me what PERL_HASH_SEED_DEBUG=1 reports for you?
:> :> :
:> :> :I have a feeling your perl is built with hash seed randomization disabled.
:> :>
:> :> % PERL_HASH_SEED_DEBUG=1 /opt/v5.32.0-d/bin/perl -e 1
:> :> HASH_FUNCTION = SBOX32_WITH_STADTX HASH_SEED = 0x986401ef56ab149158c74ee4062846bd8b8504d92918d2beaf137d90 PERTURB_KEYS = 1 (RANDOM)
:> :
:> :Ok, I get it now. Your perl is /opt/v5.32.0-d/bin/perl. But your
:> :script references the system perl in your $PATH.
:>
:> Ah no, the script I used referenced the same perl explicitly; I reduced
:> that to 'perl' to share it in the email.
:>
:> Just to confirm I ran it using $^X instead, and again got 30 copies of
:> 'fmpdvjxiezlghkrotnyasuqwbc'. I get the same result (30 copies of some
:> string, not necessarily the same one) with several other locally installed
:> perls:
[...]
:The SBOX is being initialized from a seed with barely any bits set,
:and whose bits are very similar. This produces an SBOX whose first row
:is relatively similar to each other, and as the keys are only one byte
:their hash values are constructed from only one row of the SBOX table.
:Additionally the hash table only uses 5 bits of the hash, which
:magnifies the chance that there will be a similar order between the
:different seeds.
:
:You can add keys(%hash)=10000 and you will see the full hash for each
:value is different for each seed. It is only the low bits that are
:similar for these similar seeds.

Ok.

:> This all feels like a distraction though: it all focuses on some
:> unexpectedly deterministic results, when the bit I was asking about
:> related to the unexpectedly _non-deterministic_ results.
:
:Did you post the code that was unexpectedly non-deterministic? Maybe i
:got confused.

Not yet, I've been working on cutting down the testcase to the point I
can do so. If you want to see it anyway, the branch is at:
https://github.com/hvds/axiom/tree/perl-perturb
Last commit (e8d8363) gets it down to 763 lines of code, with the test case
in the commit message; the preceding commit 93c48cd shows (a revert of the
fix for) the bug that led me down this path in the first place.

Hugo
Re: PERL_PERTURB_KEYS=2
Earlier I wrote:
:demerphq <demerphq@gmail.com> wrote:
::Did you post the code that was unexpectedly non-deterministic? Maybe I
::got confused.
:
:Not yet, I've been working on cutting down the testcase to the point I
:can do so.

Below is as short as I've got it so far; very minor changes (eg replacing
the last Axiom::Dict->new call with C< bless [ {}, [] ], 'Axiom::Dict' >,
or removing C<use strict>) are enough to make the indeterminacy disappear.
I suspect something to do with class/object management in perl may be
triggering the indeterminacy.

I'm saving the below as 'axiom', and testing it for a given hash seed like so:

% PERL_PERTURB_KEYS=2 perl -wle 'local $ENV{PERL_HASH_SEED} = shift; ++$s{`./axiom`} for 1 .. 100; print "$_ $s{$_}" for sort keys %s' 1
aa 13
ab 87
% PERL_PERTURB_KEYS=2 perl -wle 'local $ENV{PERL_HASH_SEED} = shift; ++$s{`./axiom`} for 1 .. 100; print "$_ $s{$_}" for sort keys %s' 2
aa 10
ab 90
% PERL_PERTURB_KEYS=2 perl -wle 'local $ENV{PERL_HASH_SEED} = shift; ++$s{`./axiom`} for 1 .. 100; print "$_ $s{$_}" for sort keys %s' 3
bb 100
%

Here the first two hash seeds show indeterminacy: some runs of the program
result in the two copy() calls seeing C< keys %$dict > in the order (a, b)
both times, other runs see (a, b) the first time and (b, a) the second.

The third hash seed sees (b, a) both times consistently, so it appears to
be acting deterministically.

My understanding of PERL_PERTURB_KEYS=2 is that those first two cases
should not happen - we should get the same results on each run of the
program for a given hash seed.
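
The one-liners above can also be written out as a small harness. This
is only a sketch of the same idea: tally_outputs and is_deterministic
are hypothetical names, and the command to run is passed in as a list.

```perl
use strict;
use warnings;

# Run @cmd $runs times with the given hash seed under PERL_PERTURB_KEYS=2
# and count how often each distinct output appears.
sub tally_outputs {
    my ($seed, $runs, @cmd) = @_;
    local $ENV{PERL_HASH_SEED}    = $seed;
    local $ENV{PERL_PERTURB_KEYS} = 2;
    my %seen;
    for (1 .. $runs) {
        open(my $fh, '-|', @cmd) or die "cannot run @cmd: $!";
        local $/;                      # slurp the whole output
        my $out = <$fh> // '';
        close $fh;
        ++$seen{$out};
    }
    return \%seen;
}

# A seed behaves deterministically if every run produced the same output.
sub is_deterministic {
    my ($tally) = @_;
    return keys(%$tally) == 1;
}
```

For example, tally_outputs(1, 100, './axiom') would correspond to the
first one-liner, and is_deterministic() on its result would be false
for a tally such as "aa 13, ab 87".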

Hugo
---
#!/opt/v5.32.0-d/bin/perl
use strict;
use warnings;

my $dict = bless [ {}, [] ], 'Axiom::Dict';
{
    my $sdict = bless [ {}, [] ], 'Axiom::Dict';
    $sdict->[1] = $dict->[1];
    $sdict->[0]{'a'} = $sdict->[1][0] = [];
    $sdict->[0]{'b'} = $sdict->[1][1] = [];
    my $dsdict = $sdict->copy;
    $dsdict->clone;
}

$dict->[0]{'a'} = $dict->[1][2] = [];
$dict->[0]{'b'} = $dict->[1][3] = [];
$dict->copy;
exit 0;

package Axiom::Dict {
    sub new {
        my($class) = @_;
        return bless [ {}, [] ], 'Axiom::Dict';
    }
    sub dict { shift->[0] }
    sub bind { shift->[1] }

    sub clone {
        my($other) = @_;
        my $self = bless [ {}, [] ], 'Axiom::Dict';
        my($sd, $sb) = @$self;
        my($od, $ob) = @$other;
        my %tr;
        @$sb = map { $tr{$_} = [] } @$ob;
        for my $name (keys %$od) {
            $sd->{$name} = $tr{$od->{$name}} // $od->{$name};
        }
        return $self;
    }

    sub copy {
        my($self) = @_;
        my($dict, $bind) = @$self;
        my $copy = ref($self)->new;
        my $first;
        for my $name (keys %$dict) {
            print $name unless $first++;
            my $bound = [];
            push @{ $copy->[1] }, $bound;
            $copy->[0]->{$name} = $bound;
        }
        return $copy;
    }
};
__END__
Re: PERL_PERTURB_KEYS=2 (solved) [ In reply to ]
Earlier I wrote:
:Below is as short as I've got it so far.

It's shorter now:

#!/opt/v5.32.0-d/bin/perl
use strict ();
my $dict = [ {} ];
{
    my %tr = map +($_ => 1), ([], []);
}
$dict->[0]{'a'} = 1;
$dict->[0]{'b'} = 1;
print keys %{ $dict->[0] };
__END__

To test, expect "ab 100" or "ba 100" for a hash seed if it is
deterministic:

% PERL_PERTURB_KEYS=2 perl -wle '
for my $h (1 .. 10) {
local $ENV{PERL_HASH_SEED} = $h;
%s = ();
++$s{`./axiom`} for 1 .. 100;
print "$h: ", join(" ", map "$_ $s{$_}", sort keys %s);
}
'
1: ab 100
2: ab 100
3: ba 100
4: ab 9 ba 91
5: ab 6 ba 94
6: ba 100
7: ab 10 ba 90
8: ab 100
9: ab 5 ba 95
10: ab 100
%

In my original post I asked:
:Before I dig further to try and cut down the larger application to a
:test case, does anyone have suggestions as to:
:- what other (non-obvious) non-deterministic behaviour I might have?
:[...]

So I now have the answer: I'm constructing hash keys as stringified
references, and somehow even an invocation as simple as 'use strict ()'
introduces enough non-determinism into _memory addresses_ that those
hash keys can vary. And that seems to be enough to change the course
for later key ordering under DETERMINISTIC.

I don't know whether that's a flaw in DETERMINISTIC or simply something
we have to caveat; certainly it makes it a lot less useful to me, since
I use stringified references as hash keys very regularly.
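
To make the dependency on memory addresses concrete, here is a minimal
sketch of what the keys of %tr in the reduced script look like. ref_key
is a hypothetical helper that rebuilds the string an unblessed reference
stringifies to; why the addresses differ from run to run is not
established here.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper: rebuild the string an unblessed reference turns
# into when used as a hash key, i.e. TYPE(0xADDRESS).
sub ref_key {
    my ($ref) = @_;
    return sprintf '%s(0x%x)', reftype($ref), refaddr($ref);
}

# Same construction as %tr in the reduced test case: the keys embed the
# addresses of the two anonymous arrays.
my @refs = ([], []);
my %tr = map +($_ => 1), @refs;
print "$_\n" for sort keys %tr;
```

If the printed addresses change between runs, then so do the keys being
hashed, whatever PERL_HASH_SEED is set to.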

Hugo