Mailing List Archive

Underscore prefix detection fix
Hi there

I'm working on a native build of wireshark on OS X, which uses
Gtk+/Quartz and doesn't require X11 emulation.
Since wireshark depends on libgcrypt (indirectly), I wondered why it
wasn't possible to build working i586 assembly optimisations on
OS X/Intel.
make check fails with error messages about unknown __mpih_* symbols
in the test programs (I'm using libgcrypt-1.2.4 here).

As far as I could see, this problem is known
(<http://lists.gnupg.org/pipermail/gnupg-devel/2007-January/023506.html>),
but no one had fixed it yet. So I tried my luck.

After some backtracing, I found a small bug in the autoconf script
that prevents it from detecting whether the operating system's ABI
requires underscores (_) in front of symbol names. This is indeed the
case on OS X/Intel, so those underscores were "forgotten" while
compiling the assembly modules, which led to the aforementioned
unknown symbols.
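
To illustrate the check described above, here is a minimal sketch, not the
actual configure code. The helper name symbol_has_underscore is hypothetical,
and the sample nm lines are illustrative only (real output differs between
platforms and nm versions). It assumes BSD-style nm output where the symbol
name is the last field on each line.

```shell
#!/bin/sh
# Hypothetical helper (not part of libgcrypt): reads `nm` output on stdin
# and prints "yes" if the test symbol carries a leading underscore.
symbol_has_underscore() {
    # The symbol name is assumed to be the last field of each nm line.
    if awk '{ print $NF }' | grep -q '^_nm_test_func$'; then
        echo yes
    else
        echo no
    fi
}

# Illustrative nm lines, one with and one without the underscore prefix.
printf '%s\n' '00000000 T _main' '00000010 T _nm_test_func' | symbol_has_underscore
printf '%s\n' '00000000 T main' '00000010 T nm_test_func' | symbol_has_underscore
```

On a real system the input would come from something like
"cc -c conftest.c && nm conftest.o | symbol_has_underscore".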

The hardest part was to fix the problem in a clean way, which is
somewhat difficult with the way autoconf et al. work. It's also quite
possible that different autoconf versions produce different configure
scripts, which no longer work correctly. It might be that
$global_symbol_pipe was renamed to $lt_cv_sys_global_symbol_pipe
at some point, and that its output format changed at the same time
(so that it now has to be trimmed with cut, for example).
One of the two things I fixed is independent of the autoconf version,
though: a closing brace was missing in the conftest.c code.
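
A small sketch of the trimming step mentioned above. It assumes the pipe
emits lines of the form "type raw-symbol C-name", which is how the libtool
manual describes the output; whether every libtool version does so is an
assumption here. The helper name raw_symbols and the sample line are made up
for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only the second space-separated field,
# i.e. the raw symbol name as it appears in the object file.
raw_symbols() {
    cut -d ' ' -f 2
}

# Illustrative line in the assumed "type raw-symbol C-name" format.
printf '%s\n' 'T _nm_test_func nm_test_func' | raw_symbols
```

Without the cut, a test anchored at the start of the line (such as
'^_nm_test_func') would see the type letter first and never match.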

I hope it works for other people and can be integrated into the next
version.

Regards
Gregor

So, long story short, here's my patch:
Re: Underscore prefix detection fix
On Wed, 4 Jul 2007 18:31, seto-kun@freesurf.ch said:

> The hardest part was to fix the problem in a clean way, which is
> somehow difficult with the way autoconf et al. works. It's also very

The actual problem is that some years ago we replaced our own parser for
symbols with the one provided by libtool. Either we did this wrong (which
is entirely possible) or libtool changed the name and semantics at some
point. Because of the syntax error in the test (see below) I assume we
never really tested the test code on platforms which require an
underscore.

> One of the two things I fixed is independent of the autoconf version,
> though: a closing brace was forgotten in the conftest.c code.

Yes, this is a major fault and has been lurking there since 2003 :-(.

> I hope it works for other people and can be integrated into the next
> version.

I applied this to the libgcrypt trunk (soon to be 1.3.1) for testing.
I'd appreciate it if you could test it as well. Here is the actual patch,
which might also work for 1.2.4:

Index: acinclude.m4
===================================================================
--- acinclude.m4 (revision 1258)
+++ acinclude.m4 (working copy)
@@ -93,12 +93,12 @@
 [ac_cv_sys_symbol_underscore=no
 cat > conftest.$ac_ext <<EOF
 void nm_test_func(){}
- int main(){nm_test_func;return 0;
+ int main(){nm_test_func;return 0;}
 EOF
 if AC_TRY_EVAL(ac_compile); then
 # Now try to grab the symbols.
 ac_nlist=conftest.nm
- if AC_TRY_EVAL(NM conftest.$ac_objext \| $global_symbol_pipe \> $ac_nlist) && test -s "$ac_nlist"; then
+ if AC_TRY_EVAL(NM conftest.$ac_objext \| $lt_cv_sys_global_symbol_pipe \| cut -d \' \' -f 2 \> $ac_nlist) && test -s "$ac_nlist"; then
 # See whether the symbols have a leading underscore.
 if egrep '^_nm_test_func' "$ac_nlist" >/dev/null; then
 ac_cv_sys_symbol_underscore=yes
@@ -110,7 +110,7 @@
 fi
 fi
 else
- echo "configure: cannot run $global_symbol_pipe" >&AC_FD_CC
+ echo "configure: cannot run $lt_cv_sys_global_symbol_pipe" >&AC_FD_CC
 fi
 else
 echo "configure: failed program was:" >&AC_FD_CC


It is probably not a coincidence that this patch looks identical to
yours. Note that patching configure is not recommended, as it is built
by autoconf. Run ./autogen.sh to re-create it.


Thanks,

Werner


_______________________________________________
Gcrypt-devel mailing list
Gcrypt-devel@gnupg.org
http://lists.gnupg.org/mailman/listinfo/gcrypt-devel
Re: Underscore prefix detection fix
Thanks for checking the patch in.
I tested trunk, and it seems to be working fine.
Two things strike me as odd though:

make check displays (while benchmarking):
ECDSA 192 bit    300000ms    7500000ms    13800000ms
ECDSA 224 bit    400000ms    9400000ms    17900000ms
ECDSA 256 bit    500000ms   11500000ms    22300000ms
ECDSA 384 bit   1000000ms   25700000ms    49600000ms
ECDSA 521 bit   2600000ms   64800000ms   126000000ms

Shouldn't the last line read ECDSA 512 bit?

The other oddity is that when I build with i586 assembly, the checks
run _slower_ than in i386 mode.
I get 1min 19sec (i386) vs. 2min 14sec (i586) on a MacBook CoreDuo
1.83GHz with 1GB RAM.
Even when using aggressive optimisation
(CFLAGS="-arch i586 -march=yonah -O3 -ffast-math -mfpmath=sse -msse -msse2"),
I still only get 1min 47secs. For i386, I didn't use any special
compiler flags.

What are me and my Mac messing up here?

Regards, G.
Re: Underscore prefix detection fix
> The other oddity is, when I build with i586 assembly, the checks
> run _slower_ than in i386 mode.
> I get 1min 19sec vs. 2min 14sec on a MacBook CoreDuo 1.83GHz with
> 1GB RAM.
> Even when using aggressive optimisation
> (CFLAGS="-arch i586 -march=yonah -O3 -ffast-math -mfpmath=sse -msse -msse2"),
> I still only get 1min 47secs. For i386, I didn't use any special
> compiler flags.
>
> What are me and my Mac messing up here?

I think I've found the problem.
In mpi/config.links, there's a rule for i586-* that sets the macro
ELF_SYNTAX in asm-syntax.h. This in turn causes the assembler to see
the line
.align (1<<3)
in front of the Loop: label in mpih-sub1-asm.S and mpih-add1-asm.S.
At least with the Apple assembler, this will be interpreted as "align
the next instruction on a 2^(1<<3) boundary" - which is BSD syntax.
I'm not quite sure, but I thought I read somewhere that this 2^(align
size) type syntax is even used in recent gas versions? In any case,
the 1<<(1<<3) = 0x100 = 256 byte alignment produces 200+ nops, which
slow the routine down considerably.
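
The arithmetic behind the two readings of the directive can be checked with a
few lines of shell. This is only a sketch of the calculation; the helper name
align_as_power_of_two is hypothetical and nothing here invokes an assembler.

```shell
#!/bin/sh
# Hypothetical helper: byte alignment when `.align N` means "2^N bytes".
align_as_power_of_two() {
    echo $(( 1 << $1 ))
}

n=$(( 1 << 3 ))   # the operand from the source line: (1<<3) = 8
echo "byte-count reading:   $n bytes"
echo "power-of-two reading: $(align_as_power_of_two "$n") bytes"
```

Under the power-of-two reading the assembler may have to insert up to 255
bytes of padding before the label, which fits the 200+ nops observed.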
I fixed this by adding the darwin triplets to the djgpp triplets in
config.links:
    i[3467]86*-msdosdjgpp* | \
    i[34]86*-apple-darwin*)
        echo '#define BSD_SYNTAX' >>./mpi/asm-syntax.h
        cat $srcdir/mpi/i386/syntax.h >>./mpi/asm-syntax.h
        path="i386"
        ;;
    i586*-msdosdjgpp* | \
    i[567]86*-apple-darwin*)
        echo '#define BSD_SYNTAX' >>./mpi/asm-syntax.h
        cat $srcdir/mpi/i386/syntax.h >>./mpi/asm-syntax.h
        path="i586 i386"
        ;;
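
To see which host triplets those patterns catch, the two branches can be
wrapped in a stand-alone function. This is a sketch only: the function name
mpi_asm_path, the fallback branch and the sample triplets are assumptions for
illustration, and the real config.links also writes asm-syntax.h.

```shell
#!/bin/sh
# Hypothetical wrapper around the two case branches shown above; it only
# reports the assembly search path a triplet would select.
mpi_asm_path() {
    case "$1" in
        i[3467]86*-msdosdjgpp* | \
        i[34]86*-apple-darwin*)
            echo "i386"
            ;;
        i586*-msdosdjgpp* | \
        i[567]86*-apple-darwin*)
            echo "i586 i386"
            ;;
        *)
            # Fallback added for this sketch only.
            echo "unmatched"
            ;;
    esac
}

# Illustrative triplets, not taken from the thread.
mpi_asm_path i386-apple-darwin8.10.1
mpi_asm_path i686-apple-darwin8.10.1
mpi_asm_path x86_64-unknown-linux-gnu
```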

This takes out the nops, but it's still slower.
Using the aggressive optimisation flags mentioned earlier, the benchmark
runs in 49secs with i386 assembly and in 68secs with i586 assembly.
Disabling assembly yields 65secs, by the way.

I think I'll give up on this for now. It's fast enough, and I'm happy
that gcrypt builds with a little bit of speed improvement on OS X. :)

Thanks for all your work,
Gregor
Re: Underscore prefix detection fix
On Sat, 28 Jul 2007 01:54, seto-kun@freesurf.ch said:

> ECDSA 521 bit 2600000ms 64800000ms 126000000ms
>
> Shouldn't the last line read ECDSA 512 bit?

No, this curve is 521 bit.


Shalom-Salam,

Werner

