All CPU threads
Hi!

Does anyone know if there is a way to use all CPU threads with
*gnupg-desktop-2.4.3.0-x86_64.AppImage*?

Best,

JK
Re: All CPU threads
Please do not send HTML to this list. Many of the people you very much
hope to read your questions will not read HTML email.

> Does anyone know if there is a way to use all CPU threads with
> *gnupg-desktop-2.4.3.0-x86_64.AppImage*?

What exactly are you hoping to speed up? The classic mode of encryption
used in RFC2440 and RFC4880 is a hacked-up cipher feedback mode, which
is not parallelizable and doesn't benefit from using multiple threads.
You can of course use multiple threads, but you won't get any benefit.

So my question is, what exactly is it that you need to speed up? Once
we know that, we'll be able to give suggestions for how you might proceed.

Re: All CPU threads
On 9/10/23 01:21, Robert J. Hansen via Gnupg-users wrote:
> Please do not send HTML to this list.  Many of the people you very
> much hope to read your questions will not read HTML email.
>
>> Does anyone know if there is a way to use all CPU threads with
>> *gnupg-desktop-2.4.3.0-x86_64.AppImage*?
>
> What exactly are you hoping to speed up?  The classic mode of
> encryption used in RFC2440 and RFC4880 is a hacked-up cipher feedback
> mode, which is not parallelizable and doesn't benefit from using
> multiple threads. You can of course use multiple threads, but you
> won't get any benefit.
>
> So my question is, what exactly is it that you need to speed up? Once
> we know that, we'll be able to give suggestions for how you might
> proceed.
>

Thank you for your reply. I was thinking about speeding up the
encryption process. But if that's not possible, then that's how it is.

Is this message now plain text only?


Best,

Jozsef K.



Re: All CPU threads
> Thank you for your reply. I was thinking about speeding up the
> encryption process. But if that's not possible, then that's how it is.

Thank you for sending a plain-text email to the list! :)

The answer is a little complicated, but this should be an
accurate-enough explanation.

Encryption speed is dominated by disk speed first and foremost. If
you're encrypting a 1 MB file, you have to read in the file and write it
out again when you're done: your absolute minimum time is given by
however long it takes to read and write a 1 MB file.

This is unfortunate, because disk I/O is *slow*. Even SSDs, which are
about ten to twenty times as fast as older spinning metal platter hard
drives, can't completely bridge this gap. So at the end of the day,
your bottleneck for encryption is going to be disk I/O.

There are various games people play, like keeping an in-memory
filesystem. If you're doing that, then we can look at other places for
speed improvement. Remember, as you read what follows: we're doing all
of these weird things to improve things by a very tiny bit -- the
bottleneck is in disk I/O!
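
To put a number on that floor for your own machine, here is a minimal
Python sketch (the file names and the 1 MiB chunk size are placeholders):
it just reads a file and writes a copy back out, which is the least
amount of I/O any encryption of that file has to do.

    import time

    src, dst = "bigfile.bin", "bigfile.copy"     # placeholder test files

    t0 = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(1024 * 1024):    # copy 1 MiB at a time
            fout.write(chunk)
    t1 = time.perf_counter()
    print(f"read+write floor: {t1 - t0:.3f} s")

However long that takes is a hard lower bound on the encryption time,
no matter how many threads you throw at the crypto itself.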

=====

Encryption generates a random session key and encrypts that with your
recipient's public key. Here's your next problem: there are *so many*
algorithms GnuPG supports, and there isn't a single effective
parallelization strategy for all of them. Take RSA as an example: the
expensive part of the encryption operation is C = M^e (mod n), or as
normal humans call it, "modular exponentiation".
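
As a toy illustration of that step (textbook-sized numbers, nowhere near
a real key), encrypting a session key m under a recipient's RSA public
key (n, e) boils down to a single call to Python's built-in pow():

    p, q = 61, 53            # toy primes; a real key uses 1024-bit+ primes
    n = p * q                # 3233, the public modulus
    e = 17                   # public exponent (65537 in practice)
    m = 65                   # pretend session key, must be < n
    c = pow(m, e, n)         # C = M^e (mod n), the modular exponentiation
    print(c)                 # 2790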

I've got an IEEE paper on my desk (by Budikafa and Pulungan) dating from
2017 that says you can parallelize modular exponentiation to get up to a
28% speed improvement. That's really nice! The problem is the phrase
"up to" a 28% speed improvement, and the fact that only RSA uses modular
exponentiation, so if your correspondent is using ECC you're kind of out
of luck.

So, when it comes to the asymmetric part of the encryption: a sequential
version takes a couple of milliseconds, and in the best-case scenario,
throwing multiple threads at it saves you 28% of two milliseconds. This
is not a big enough win to justify the multithreading.

Once you've encrypted the random session key for each recipient, now you
have to process the file 16 bytes at a time. For each block after the
first, the result of the last block's encryption is an input to the
current block's encryption. Block 0 (which is the first -- remember,
computer scientists are weird, we start counting at zero) doesn't depend
on anything; block 1 depends on having the output of block 0; block 2
depends on having the output of block 1; and so on. Even if you were to
spin up one thread per block you'd still get no speed improvement.
You'd be encrypting sequentially, one block at a time until you were
complete. Multi-threading is thus theoretically possible, but offers no
advantages.
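
Here is a minimal sketch of that chaining, written against the
pyca/cryptography package purely to get a raw AES block function. It is
not GnuPG's code, and OpenPGP's CFB variant differs in detail, but the
data dependency is the same:

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def cfb_encrypt(key, iv, plaintext):
        # ECB here is only a way to invoke single-block AES (E_K), not a
        # mode recommendation.
        aes = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
        out, prev = bytearray(), iv              # "block -1" is the IV
        for i in range(0, len(plaintext), 16):
            keystream = aes.update(prev)         # needs the PREVIOUS ciphertext
            block = plaintext[i:i + 16]
            ct = bytes(a ^ b for a, b in zip(block, keystream))
            out += ct
            prev = ct                            # the next block must wait for this
        return bytes(out)

    key, iv = os.urandom(32), os.urandom(16)
    print(len(cfb_encrypt(key, iv, os.urandom(64 * 16))))   # 64 blocks

The prev = ct line is the whole problem: block N's keystream cannot be
computed until block N-1's ciphertext exists, so extra threads have
nothing to do.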

(Note that Phil Rogaway kind of disagrees with me: he characterizes
parallelizing cipher feedback modes as possible "but awkward". When
Phil Rogaway, one of the sharpest cryptographers in the world, describes
an optimization as "awkward", I very quietly turn around and start
moving in the opposite direction. Clearly I am in over my head and I
need to escape.)

https://web.cs.ucdavis.edu/~rogaway/papers/modes.pdf -- search for the
words "but awkward".

Etcetera, etcetera. Speeding up encryption operations with multiple
threads is a *deeply* challenging cryptographic engineering problem, and
for the vast majority of users isn't worth it. The easy wins (28% cost
savings on RSA encryption! Whee, almost half a millisecond!) are too
trivial, and the big wins are somewhere between "Rogaway says it's
awkward" and "Rogaway says it's impossible".

That said, the next RFC draft -- when it comes out -- will be offering
new encryption modes that may offer better parallelization performance.
I'm sure that if and when the next RFC is officially released, there
will be interest in getting parallelization support for them.

Re: All CPU threads
Hi!

Thanks Rob for your comments. Here are some additional points:

On Sat, 9 Sep 2023 22:07, Robert J. Hansen said:
> and for the vast majority of users isn't worth it. The easy wins (28%
> cost savings on RSA encryption! Whee, almost half a millisecond!) are

The blinding we use for RSA (to mitigate side-channel attacks) should be
in the same range as these wins. I bet that by adding threads to the
computation you will open another can of side-channel attacks.
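
For readers who have not seen it, a toy sketch of what RSA base blinding
looks like (textbook numbers, not GnuPG's or Libgcrypt's implementation):
the private-key operation is performed on a randomized value, so its
timing no longer depends directly on the attacker-visible input, at the
cost of two extra modular operations per call.

    import math, secrets

    p, q, e = 61, 53, 17                  # textbook toy key
    n = p * q                             # 3233
    d = pow(e, -1, (p - 1) * (q - 1))     # private exponent, 2753

    c = pow(65, e, n)                     # ciphertext of toy message 65

    r = secrets.randbelow(n - 2) + 2      # random blinding factor
    while math.gcd(r, n) != 1:            # must be invertible mod n
        r = secrets.randbelow(n - 2) + 2

    c_blind = (c * pow(r, e, n)) % n      # blind: c * r^e
    m_blind = pow(c_blind, d, n)          # private-key op sees only m * r
    m = (m_blind * pow(r, -1, n)) % n     # unblind
    print(m)                              # 65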

> performance. I'm sure that if and when the next RFC is officially
> released, there will be interest in getting parallelization support

OCB mode has already been in use and deployed for years. With a decent
Libgcrypt (1.10) I get these figures for the old mode (CFB) and the new
one (OCB):

 AES256  | nanosecs/byte   mebibytes/sec   cycles/byte   auto Mhz
 CFB enc |   0.691 ns/B      1379 MiB/s      5.14 c/B     7440±1
 CFB dec |   0.064 ns/B     14959 MiB/s     0.470 c/B     7372±2

 OCB enc |   0.070 ns/B     13547 MiB/s     0.522 c/B     7415±2
 OCB dec |   0.071 ns/B     13451 MiB/s     0.520 c/B     7336±3


These values are for the low-level crypto routines. In reality we also
do SHA-1 hashing in addition to CFB, which makes it even slower. OTOH,
the protocol requires buffering, and the way gpg implements things has a
large impact on the performance. Fortunately, Jussi Kivilinna also
worked on gpg's buffering and gained a lot of extra speed:

* gpg: Threefold decryption speedup for large files.
https://dev.gnupg.org/rGab177eed51 (For the old CFB mode)

* gpg: Nearly double the AES256.OCB encryption speed.
https://dev.gnupg.org/rG99e2c178c7

Thus in 2.4 we get this for symmetric encryption of a 4 GiB file from
RAM to /dev/null on a Ryzen 5800X:

AES256.CFB encryption 1.3 GiB/s
AES256.OCB encryption 4.2 GiB/s
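
For a rough, do-it-yourself feel of the CFB-versus-OCB gap in the table
above, here is a small Python benchmark using a recent pyca/cryptography
release (one that ships AESOCB3). It goes through OpenSSL rather than
Libgcrypt, so the absolute numbers will not match Werner's, but the
shape of the result should be similar.

    import os, time
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.primitives.ciphers.aead import AESOCB3

    key = os.urandom(32)
    data = os.urandom(64 * 1024 * 1024)              # 64 MiB of random data

    t0 = time.perf_counter()
    enc = Cipher(algorithms.AES(key), modes.CFB(os.urandom(16))).encryptor()
    enc.update(data)
    enc.finalize()
    t1 = time.perf_counter()
    print(f"AES-256-CFB enc: {len(data) / (t1 - t0) / 2**20:.0f} MiB/s")

    t0 = time.perf_counter()
    AESOCB3(key).encrypt(os.urandom(12), data, None)
    t1 = time.perf_counter()
    print(f"AES-256-OCB enc: {len(data) / (t1 - t0) / 2**20:.0f} MiB/s")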

FWIW there are also improvements in signature verification:

* gpg: Up to five times faster verification of detached signatures.
Doubled detached signing speed.
https://dev.gnupg.org/rG4e27b9defc
https://dev.gnupg.org/rGf8943ce098

YMMV depending on what kind of data you encrypt and whether signing and
compression come into play. Compression is a major performance hog:
feeding gpg from a (threaded) bzip2 and using -z0 will in general give
better performance than using the internal compressor code.
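
A hedged sketch of that trick using Python's subprocess to build the
pipeline. It assumes a threaded bzip2 implementation such as pbzip2 is
installed, and the file name and recipient are placeholders; the
equivalent shell pipeline is simply pbzip2 -c FILE | gpg -z 0 --encrypt.

    import subprocess

    src = "bigfile.tar"                               # placeholder input file

    bz = subprocess.Popen(["pbzip2", "-c", src], stdout=subprocess.PIPE)
    gpg = subprocess.Popen(
        ["gpg", "-z", "0", "--encrypt",               # -z 0: don't compress again
         "--recipient", "someone@example.org",        # placeholder recipient
         "--output", src + ".bz2.gpg"],
        stdin=bz.stdout,
    )
    bz.stdout.close()       # so pbzip2 gets SIGPIPE if gpg exits early
    gpg.wait()
    bz.wait()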


Shalom-Salam,

Werner


--
The pioneers of a warless world are the youth that
refuse military service. - A. Einstein
Re: All CPU threads
Werner Koch via Gnupg-users wrote:
> [...]
>
> On Sat, 9 Sep 2023 22:07, Robert J. Hansen said:
>
>> and for the vast majority of users isn't worth it. The easy wins (28%
>> cost savings on RSA encryption! Whee, almost half a millisecond!) are
>>
>
> The blinding we use for RSA (to mitigate side-channel attacks) should be
> in the same range as these wins. I bet that by adding threads to the
> computation you will open another can of side-channel attacks.
>

So using threads to compute a blinded RSA operation would just about
recover the computational cost of blinding the calculation? How would
hypothetical thread-related side channels matter if we are using
blinding around the parallel calculation?


-- Jacob

Re: All CPU threads
On Mon, 11 Sep 2023 22:29, Jacob Bachmeyer said:

> So using threads to compute a blinded RSA operation would just about
> recover the computational cost of blinding the calculation? How would

No. I gave this as an example of where else you could look to speed
things up. For example, if you do not need to mitigate local
side-channel attacks.


Shalom-Salam,

Werner

--
The pioneers of a warless world are the youth that
refuse military service. - A. Einstein
Re: All CPU threads
Werner Koch wrote:
> On Mon, 11 Sep 2023 22:29, Jacob Bachmeyer said:
>
>
>> So using threads to compute a blinded RSA operation would just about
>> recover the computational cost of blinding the calculation? How would
>>
>
> No. I gave this as an example of where else you could look to speed
> things up. For example, if you do not need to mitigate local
> side-channel attacks.

OK, I get it now: you were suggesting that there are easier trade-offs
for similar performance gains. Thanks.


-- Jacob

