Mailing List Archive

[patch] allow ctr mode to handle 'unaligned' plaintext blocks and improve ctr benchmarks
Hi there,

I needed to use AES in CTR mode but found that libgcrypt's current
implementation does not allow for 'unaligned' blocks of plaintext, i.e.
where the plaintext is not a multiple of the context's blocksize.

I adapted an implementation found at c. line 289 of:

https://svn.torproject.org/svn/tor/trunk/src/common/aes.c

to add this functionality to libgcrypt and have supplied the patch below.
The code there is licensed under 3-clause BSD which is GPL-compatible. I
also ran it by Nick Mathewson of Tor (at #tor@irc.oftc.net) and he was
fine with it too. (Though any errors remaining in the patch are very much
my own.)

I haven't tested the change to destruction but it does pass the existing
unit tests and libgcrypt now decrypts correctly the unaligned plaintext
blocks it was previously choking on in my application.

The results from the unit tests I've added to basic.c are the output from
patched do_ctr_encrypt() on my machine so shouldn't be taken as validating
the new code in itself! I'm sure there are numerous other tests that
could/should be added to validate the patch and I'm also sure it will be
subject to careful review!

Finally, you can see the patch results in a 33% performance improvement:

Old do_ctr_encrypt():
$ ./benchmark --cipher-repetition 10 --verbose cipher AES
Running each test 10 times.
CTR
---------------
AES 370ms 360ms

Patched do_ctr_encrypt():
$ ./benchmark --cipher-repetition 10 --verbose cipher AES
Running each test 10 times.
CTR
---------------
AES 240ms 240ms

The Tor code also finds some optimization while incrementing the counter. I
will test this out later and see if the gains are appreciable.
Alternatively, you're welcome to try them out for yourself.

The patch isn't very readable so here is the new do_ctr_encrypt() function
by itself:

static void
do_ctr_encrypt( gcry_cipher_hd_t c, byte *outbuf, const byte *inbuf,
unsigned int nbytes )
{
unsigned int n;
int i;
n = c->ctrpos;

while (1) {
if (n == 0)
c->cipher->encrypt (&c->context.c, c->ctrbuf, c->ctr);
do {
if (nbytes-- == 0) { c->ctrpos = n; return; }
*(outbuf++) = *(inbuf++) ^ c->ctrbuf[n];
} while (++n != c->cipher->blocksize);
c->ctrpos = n = 0;
for (i = c->cipher->blocksize; i > 0; i--)
{
c->ctr[i-1]++;
if (c->ctr[i-1] != 0)
break;
}
if (nbytes == 0) { c->ctrpos = n; return; }
}

}


Index: tests/basic.c
===================================================================
--- tests/basic.c (revision 1375)
+++ tests/basic.c (working copy)
@@ -363,6 +363,16 @@
{ "\xf6\x9f\x24\x45\xdf\x4f\x9b\x17\xad\x2b\x41\x7b\xe6\x6c\x37\x10",
16,
"\x1e\x03\x1d\xda\x2f\xbe\x03\xd1\x79\x21\x70\xa0\xf3\x00\x9c\xee" },
+ { "\xf6\x9f\x24\x45\xdf\x4f\x9b\x17\xad\x2b\x41\x7b\xe6\x6c\x37\x10"
+ "\x6c\x37\x10",
+ 19,
+ "\x46\x92\x63\xbd\xcb\xc5\x0a\x19\x5d\x43\x71\xec\x76\x27\x92\x12"
+ "\x34\xae\x54"},
+ { "\xf1\x92\x24\x35\xaf\x4f\x9b\x17\xad\x2b\x41\x7b\xe6\x6c\x37\x10"
+ "\x6c\x37\x11",
+ 19,
+ "\xab\xdf\xc5\x34\x5a\x5c\x51\xc6\x35\x56\xc8\x92\xfd\x57\xee\xbc"
+ "\x15\x7e\xcf"},
}
},
{ GCRY_CIPHER_AES192,
Index: cipher/cipher.c
===================================================================
--- cipher/cipher.c (revision 1375)
+++ cipher/cipher.c (working copy)
@@ -205,7 +205,19 @@

unsigned char ctr[MAX_BLOCKSIZE]; /* For Counter (CTR) mode. */

+ /*This is the context's current position in ctrbuf. Where the number
+ of bytes encrypted are not a multiple of c->cipher_block_size this
+ stores the position of the next byte in the keystream to be xor'd
+ by do_encrypt_ctr. ctr gets incremented every time ctrpos reaches
+ c->cipher_block_size. */
+ unsigned int ctrpos; /* For Counter (CTR) mode. */

+ /* This contains the context's current keystream. ctrpos stores the
+ next position to be xor'd in ctrbuf when do_ctr_encrypt is called.
+ The contents of ctr are encrypted every time ctrpos reaches
+ zero in do_ctr_encrypt, and the result is stored in ctrbuf. */
+ byte ctrbuf[MAX_BLOCKSIZE]; /* For Counter (CTR) mode. */
+
/* What follows are two contexts of the cipher in use. The first
one needs to be aligned well enough for the cipher operation
whereas the second one is a copy created by cipher_setkey and
@@ -921,6 +933,8 @@
memset (c->u_iv.iv, 0, c->cipher->blocksize);
memset (c->lastiv, 0, c->cipher->blocksize);
memset (c->ctr, 0, c->cipher->blocksize);
+ memset (c->ctrbuf, 0, c->cipher->blocksize);
+ c->ctrpos=0;
}


@@ -1355,32 +1369,31 @@
}
}

-
static void
do_ctr_encrypt( gcry_cipher_hd_t c, byte *outbuf, const byte *inbuf,
unsigned int nbytes )
{
unsigned int n;
- byte tmp[MAX_BLOCKSIZE];
int i;
+ n = c->ctrpos;

- for(n=0; n < nbytes; n++)
- {
- if ((n % c->cipher->blocksize) == 0)
- {
- c->cipher->encrypt (&c->context.c, tmp, c->ctr);
+ while (1) {
+ if (n == 0)
+ c->cipher->encrypt (&c->context.c, c->ctrbuf, c->ctr);
+ do {
+ if (nbytes-- == 0) { c->ctrpos = n; return; }
+ *(outbuf++) = *(inbuf++) ^ c->ctrbuf[n];
+ } while (++n != c->cipher->blocksize);
+ c->ctrpos = n = 0;
+ for (i = c->cipher->blocksize; i > 0; i--)
+ {
+ c->ctr[i-1]++;
+ if (c->ctr[i-1] != 0)
+ break;
+ }
+ if (nbytes == 0) { c->ctrpos = n; return; }
+ }

- for (i = c->cipher->blocksize; i > 0; i--)
- {
- c->ctr[i-1]++;
- if (c->ctr[i-1] != 0)
- break;
- }
- }
-
- /* XOR input with encrypted counter and store in output. */
- outbuf[n] = inbuf[n] ^ tmp[n % c->cipher->blocksize];
- }
}

static void
Re: [patch] allow ctr mode to handle 'unaligned' plaintext blocks and improve ctr benchmarks [ In reply to ]
On Mon, 29 Dec 2008 22:07, robert@roberthogan.net said:

> I needed to use AES in CTR mode but found that libgcrypt's current
> implementation does not allow for 'unaligned' blocks of plaintext, i.e.
> where the plaintext is not a multiple of the context's blocksize.

That is clearly a bug and needs to be fixed.

> to add this functionality to libgcrypt and have supplied the patch below.
> The code there is licensed under 3-clause BSD which is GPL-compatible. I

As per the GNU coding standards we would need to exchange legal papers
with the orginal author and you to include this code - this would be a
hassle for such a bug. Thus, I am going to implement it of my own,
probably as the first task next year. See
https://bugs.g10code.com/gnupg/issue983 .

> The results from the unit tests I've added to basic.c are the output from
> patched do_ctr_encrypt() on my machine so shouldn't be taken as validating

I'll add this test to the regression test of course. Thanks.

> The Tor code also finds some optimization while incrementing the counter. I
> will test this out later and see if the gains are appreciable.

Would it be helpful for you or TOR to have the code further optimized?
We already have CFB and CBS optimizations for AES and adding CTR should
not be a big problem. However, I can do further optimizations only
after the release of 1.4.4.


Salam-Shalom,

Werner


p.s.
Now I need to prepare tomorrows shutdown of our TOR server
allium.gnupg.org due to the German data retention laws :-((.

Artikel 10 Grundgesetz, you served as well since May 23, 1949. Bye, bye
for now and lets hope that the Federal Constitutional Court will decide
soon.

--
Die Gedanken sind frei. Auschnahme regelt ein Bundeschgesetz.


_______________________________________________
Gcrypt-devel mailing list
Gcrypt-devel@gnupg.org
http://lists.gnupg.org/mailman/listinfo/gcrypt-devel
Re: [patch] allow ctr mode to handle 'unaligned' plaintext blocks and improve ctr benchmarks [ In reply to ]
On Tuesday 30 December 2008 13:42:31 Werner Koch wrote:
> > to add this functionality to libgcrypt and have supplied the patch
> > below. The code there is licensed under 3-clause BSD which is
> > GPL-compatible. I
>
> As per the GNU coding standards we would need to exchange legal papers
> with the orginal author and you to include this code - this would be a
> hassle for such a bug. Thus, I am going to implement it of my own,
> probably as the first task next year. See
> https://bugs.g10code.com/gnupg/issue983 .
>
Sounds complicated..

> > The Tor code also finds some optimization while incrementing the
> > counter. I will test this out later and see if the gains are
> > appreciable.
>
> Would it be helpful for you or TOR to have the code further optimized?
> We already have CFB and CBS optimizations for AES and adding CTR should
> not be a big problem. However, I can do further optimizations only
> after the release of 1.4.4.
>

Tor ships a number of AES-CTR implementations and tries to choose the
optimal one at compile time. All of them are optimized to the nth degree
because Tor spends a lot of time there. I'm not part of the Tor project
but as far as I can see they've helped themselves in this regard.

>
> p.s.
> Now I need to prepare tomorrows shutdown of our TOR server
> allium.gnupg.org due to the German data retention laws :-((.
>

You might be interested to read:

http://archives.seul.org/or/talk/Nov-2008/msg00262.html

> Artikel 10 Grundgesetz, you served as well since May 23, 1949. Bye, bye
> for now and lets hope that the Federal Constitutional Court will decide
> soon.