Mailing List Archive

[PATCH 3/3] mpi/longlong.h: x86-64: use tzcnt instruction for trailing zeros
* mpi/longlong.h [__x86_64__] (count_trailing_zeros): Add 'rep' prefix
for 'bsfq'.
--

"rep;bsf", also known as "tzcnt", is a new instruction with well-defined
behaviour on zero input and, as a result, is faster on new CPUs. On old CPUs,
"tzcnt" executes as the old "bsf", with undefined behaviour on zero input.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
---
mpi/longlong.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mpi/longlong.h b/mpi/longlong.h
index 2921e9bd..706ac723 100644
--- a/mpi/longlong.h
+++ b/mpi/longlong.h
@@ -624,7 +624,7 @@ extern USItype __udiv_qrnnd ();
 # define count_trailing_zeros(count, x) \
   do { \
     UDItype __cbtmp; \
-    __asm__ ("bsfq %1,%0" \
+    __asm__ ("rep;bsfq %1,%0" \
              : "=r" (__cbtmp) : "rm" ((UDItype)(x)) \
              __CLOBBER_CC); \
     (count) = __cbtmp; \
--
2.34.1
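
For readers following the thread, the idea in the patch can be sketched as a
standalone function. This is only an illustration under stated assumptions:
the name ctz64_sketch, the use of uint64_t instead of UDItype, the plain "cc"
clobber instead of __CLOBBER_CC, and the portable fallback loop are all
hypothetical and not part of libgcrypt. Only the asm template and operand
constraints are taken from the diff above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not libgcrypt code): count trailing zero bits of a
 * non-zero 64-bit value. */
static inline unsigned int
ctz64_sketch (uint64_t x)
{
  /* Zero is rejected because a CPU that runs this as plain BSF leaves the
   * result undefined for a zero source operand. */
  assert (x != 0);

#if defined(__GNUC__) && defined(__x86_64__)
  uint64_t count;

  /* The 'rep' prefix in front of 'bsfq' yields the TZCNT encoding.  CPUs
   * without TZCNT support ignore the prefix and execute BSF. */
  __asm__ ("rep;bsfq %1,%0"
           : "=r" (count) : "rm" (x)
           : "cc");
  return (unsigned int) count;
#else
  /* Portable fallback, added only so the sketch builds elsewhere. */
  unsigned int count = 0;

  while ((x & 1) == 0)
    {
      x >>= 1;
      count++;
    }
  return count;
#endif
}
```

Callers are still expected to pass a non-zero value, since the result on old
CPUs is undefined for zero input either way.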


_______________________________________________
Gcrypt-devel mailing list
Gcrypt-devel@gnupg.org
https://lists.gnupg.org/mailman/listinfo/gcrypt-devel

Re: [PATCH 3/3] mpi/longlong.h: x86-64: use tzcnt instruction for trailing zeros
Hello,

Jussi Kivilinna wrote:
> * mpi/longlong.h [__x86_64__] (count_trailing_zeros): Add 'rep' prefix
> for 'bsfq'.

Is it also applicable to 80x86 (IA-32) (adding 'rep')?


Besides, I have another issue/concern here. IIUC, longlong.h upstream
is GCC. It would be good to import some other changes from the
upstream. For example, in our version for PPC/POWER, we still have old
two-syntax asm code, that's quite outdated. ( https://dev.gnupg.org/T5980 )
--

Re: [PATCH 3/3] mpi/longlong.h: x86-64: use tzcnt instruction for trailing zeros
On 6.10.2022 10.09, NIIBE Yutaka wrote:
> Hello,
>
> Jussi Kivilinna wrote:
>> * mpi/longlong.h [__x86_64__] (count_trailing_zeros): Add 'rep' prefix
>> for 'bsfq'.
>
> Is it also applicable to 80x86 (IA-32) (adding 'rep')?
>

Yes it is, I'll add 'rep' for i386 too.

>
> Besides, I have another issue/concern here. IIUC, longlong.h upstream
> is GCC. It would be good to import some other changes from the
> upstream. For example, in our version for PPC/POWER, we still have old
> two-syntax asm code, that's quite outdated. ( https://dev.gnupg.org/T5980 )

I can take a look into it.

-Jussi
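
As an illustration of the i386 question discussed above, a 32-bit variant
might look like the sketch below. This is an assumption about what such a
change could look like, not the follow-up patch itself: the name ctz32_sketch,
the uint32_t type, the "cc" clobber, and the fallback loop are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not libgcrypt code): count trailing zero bits of a
 * non-zero 32-bit value. */
static inline unsigned int
ctz32_sketch (uint32_t x)
{
  /* As with the 64-bit form, zero input is undefined when the CPU executes
   * the instruction as plain BSF, so it is rejected here. */
  assert (x != 0);

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  uint32_t count;

  /* 'rep' + 'bsfl' is the 32-bit TZCNT encoding; older CPUs ignore the
   * prefix and execute BSF. */
  __asm__ ("rep;bsfl %1,%0"
           : "=r" (count) : "rm" (x)
           : "cc");
  return (unsigned int) count;
#else
  /* Portable fallback, added only so the sketch builds elsewhere. */
  unsigned int count = 0;

  while ((x & 1) == 0)
    {
      x >>= 1;
      count++;
    }
  return count;
#endif
}
```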
