Message ID | 1497950940-24243-1-git-send-email-ard.biesheuvel@linaro.org |
---|---|
Series | crypto: aes - allow generic AES to be omitted |
On Tue, Jun 20, 2017 at 11:28:53AM +0200, Ard Biesheuvel wrote:
> The generic AES driver uses 16 lookup tables of 1 KB each, and has
> encryption and decryption routines that are fully unrolled. Given how
> the dependencies between this code and other drivers are declared in
> Kconfig files, this code is always pulled into the core kernel, even
> if it is usually superseded at runtime by accelerated drivers that
> exist for many architectures.

Why can't we simply replace aes-generic with aes-ti?

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

On 18 July 2017 at 06:25, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Tue, Jun 20, 2017 at 11:28:53AM +0200, Ard Biesheuvel wrote:
>> The generic AES driver uses 16 lookup tables of 1 KB each, and has
>> encryption and decryption routines that are fully unrolled. Given how
>> the dependencies between this code and other drivers are declared in
>> Kconfig files, this code is always pulled into the core kernel, even
>> if it is usually superseded at runtime by accelerated drivers that
>> exist for many architectures.
>
> Why can't we simply replace aes-generic with aes-ti?
>

Because it is slower, and how much slower is architecture dependent
(if your arch has slow multiplication, aes-ti decryption will be dog
slow compared to aes-generic).

Also, quite a few architectures have table based implementations that
reuse crypto_ft_tab/crypto_fl_tab etc so we'd need to factor out those
into a separate module if we were to remove aes-generic.

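To illustrate why multiplier speed can matter for a table-free AES: four packed GF(2^8) bytes can be multiplied by x with a single integer multiply that turns each carried-out top bit into the reduction constant 0x1b. The block below is a minimal sketch of that technique in plain C. The function names are illustrative, and it is not claimed to be the exact aes-ti code. Inverse MixColumns needs more of these doublings per column than the forward transform, which is one plausible reason decryption suffers most on a slow multiplier.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: multiply one byte by x in GF(2^8), AES polynomial 0x11b. */
static uint8_t xtime(uint8_t b)
{
    return (uint8_t)((b << 1) ^ ((b & 0x80) ? 0x1b : 0x00));
}

/* Packed variant: multiply all four bytes of w by x at once. */
static uint32_t mul_by_x(uint32_t w)
{
    uint32_t lo = w & 0x7f7f7f7f; /* bits that shift without leaving their byte */
    uint32_t hi = w & 0x80808080; /* top bit of each byte */

    /* (hi >> 7) holds 0 or 1 per byte; times 0x1b gives 0x00 or 0x1b per byte */
    return (lo << 1) ^ ((hi >> 7) * 0x1b);
}

/* Self-check: the packed result must match the bytewise reference. */
static void mul_by_x_selfcheck(void)
{
    for (uint32_t b = 0; b < 256; b++) {
        uint32_t w = b * 0x01010101u;                     /* same byte in every lane */
        uint32_t want = xtime((uint8_t)b) * 0x01010101u;  /* expected per-lane result */

        assert(mul_by_x(w) == want);
    }
}
```

Calling `mul_by_x_selfcheck()` once compares the packed routine against the bytewise reference for every byte value. On a CPU without a fast multiplier, the multiply by 0x1b could be replaced with shifts and XORs at the cost of more instructions.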
On Tue, Jul 18, 2017 at 07:32:41AM +0100, Ard Biesheuvel wrote:
>
> Because it is slower, and how much slower is architecture dependent
> (if your arch has slow multiplication, aes-ti decryption will be dog
> slow compared to aes-generic)

Right, but does anybody actually care? My guess is that on such a
platform aes-generic is going to be dog-slow anyway.

> Also, quite a few architectures have table based implementations that
> reuse crypto_ft_tab/crypto_fl_tab etc so we'd need to factor out those
> into a separate module if we were to remove aes-generic.

You mean x86 and arm, right? Isn't the idea of your patch-set to allow
aes-generic to be disabled on x86/arm, no?

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

On 18 July 2017 at 08:18, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Tue, Jul 18, 2017 at 07:32:41AM +0100, Ard Biesheuvel wrote:
>>
>> Because it is slower, and how much slower is architecture dependent
>> (if your arch has slow multiplication, aes-ti decryption will be dog
>> slow compared to aes-generic)
>
> Right, but does anybody actually care? My guess is that on such a
> platform aes-generic is going to be dog-slow anyway.
>

Could be, yes, but it is not trivial to find out.

>> Also, quite a few architectures have table based implementations that
>> reuse crypto_ft_tab/crypto_fl_tab etc so we'd need to factor out those
>> into a separate module if we were to remove aes-generic.
>
> You mean x86 and arm, right? Isn't the idea of your patch-set to allow
> aes-generic to be disabled on x86/arm, no?
>

Yes.

The original aes-ti implemented encryption only, to provide AES-CBCMAC,
CMAC etc, i.e., algorithms that rely on encryption only, and cannot be
parallelized, the use case being 32-bit ARM, which has a fast,
NEON-based, time invariant implementation for parallel modes, but had
to fall back to the table based AES for the MAC part of CCM etc. As
suggested by Eric, AES-TI was converted into a full cipher, allowing it
to replace AES generic entirely. (I have documented the rationale more
elaborately here: http://www.workofard.com/2017/02/time-invariant-aes/)

So if you care about security and/or the cache/memory footprint more
than about speed, you can disable the table based implementations that
exist for i586, x86, ARM and arm64 (all of which have faster and time
invariant implementations based on SIMD or special instructions
anyway, so for 95% of the cases, it does not really matter).

But I'd be happy to rework the code so that the small time invariant
routines implement the core AES code, fulfilling the existing
dependencies on CRYPTO_AES. Then, we could still provide AES generic on
top of that, or only expose the tables (I don't think we should remove
the x86/arm table based drivers altogether).

This fits well with my non-SIMD fallback changes for arm64, which
invoke crypto_aes_encrypt() directly (rather than going via
CRYPTO_ALG_NEED_FALLBACK resolution, which itself may produce another
SIMD based algo which requires a non-SIMD fallback itself).

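The direct-call fallback described above can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not kernel code: `simd_usable`, `simd_encrypt()`, `scalar_encrypt()` and `encrypt_block()` are hypothetical names, and the "cipher" is a placeholder XOR rather than AES. The point is only the control flow: when SIMD cannot be used, the scalar core routine is called directly, so there is no lookup step that could hand back another SIMD-dependent implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Hypothetical flag modelling "SIMD registers may be used in this context". */
static bool simd_usable = true;

/* Counts how often the scalar path ran, so callers can observe the dispatch. */
static int scalar_calls;

/* Placeholder transform, NOT real AES: stands in for the scalar core cipher. */
static void scalar_encrypt(uint8_t out[BLOCK_SIZE], const uint8_t in[BLOCK_SIZE])
{
    for (int i = 0; i < BLOCK_SIZE; i++)
        out[i] = in[i] ^ 0xa5;
    scalar_calls++;
}

/* Placeholder for the accelerated path; it must yield the same output. */
static void simd_encrypt(uint8_t out[BLOCK_SIZE], const uint8_t in[BLOCK_SIZE])
{
    for (int i = 0; i < BLOCK_SIZE; i++)
        out[i] = in[i] ^ 0xa5;
}

/* Dispatch: the fallback is a plain function call, not an algorithm lookup. */
static void encrypt_block(uint8_t out[BLOCK_SIZE], const uint8_t in[BLOCK_SIZE])
{
    assert(out != 0 && in != 0);

    if (simd_usable)
        simd_encrypt(out, in);
    else
        scalar_encrypt(out, in);
}
```

Because the fallback is resolved at build time, it also becomes a hard dependency of the accelerated driver, which ties in with the CRYPTO_AES dependency discussion in this thread.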
On Tue, Jul 18, 2017 at 08:57:28AM +0100, Ard Biesheuvel wrote:
>
> So if you care about security and/or the cache/memory footprint more
> than about speed, you can disable the table based implementations that
> exist for i586, x86, ARM and arm64 (all of which have faster and time
> invariant implementations based on SIMD or special instructions
> anyway, so for 95% of the cases, it does not really matter).

The thing is that anybody who cares about speed won't be using
aes-generic anyway. We have way too many AES implementations
as it is, and having two C implementations is really getting
silly.

So would it be possible for you to proceed with your work in
such a way that we end up with just aes-ti as the generic C
implementation?

As for the table-based asm implementations yes they can stay and
work out some way of sharing that table at the source-code level.
At run-time the table can just go into the asm module directly
since you'd only have one on each platform, right?

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

On 18 July 2017 at 09:30, Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Tue, Jul 18, 2017 at 08:57:28AM +0100, Ard Biesheuvel wrote:
>>
>> So if you care about security and/or the cache/memory footprint more
>> than about speed, you can disable the table based implementations that
>> exist for i586, x86, ARM and arm64 (all of which have faster and time
>> invariant implementations based on SIMD or special instructions
>> anyway, so for 95% of the cases, it does not really matter).
>
> The thing is that anybody who cares about speed won't be using
> aes-generic anyway. We have way too many AES implementations
> as it is, and having two C implementations is really getting
> silly.
>
> So would it be possible for you to proceed with your work in
> such a way that we end up with just aes-ti as the generic C
> implementation?
>

Sure.

> As for the table-based asm implementations yes they can stay and
> work out some way of sharing that table at the source-code level.
> At run-time the table can just go into the asm module directly
> since you'd only have one on each platform, right?
>

Indeed. And ARM only uses 4 of those 16 tables anyway (and really only
needs two of them, so I will fix that as well).
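One plausible reading of "really only needs two" is one table per direction: in the usual T-table construction, the four round tables for a direction are byte rotations of each other, so a single table plus rotate instructions can stand in for all four. The sketch below checks that property for the forward tables. The byte layout (low byte first: 2·S, S, S, 3·S) and all function names are assumptions made for illustration. The S-box is computed from its definition rather than copied from any source file.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two elements of GF(2^8) modulo the AES polynomial 0x11b. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;

    while (b) {
        if (b & 1)
            r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0x00));
        b >>= 1;
    }
    return r;
}

static uint8_t rotl8(uint8_t x, int n)
{
    return (uint8_t)((x << n) | (x >> (8 - n)));
}

/* Rotate left; the "& 31" keeps the n == 0 case well defined. */
static uint32_t rol32(uint32_t w, int n)
{
    return (w << n) | (w >> ((32 - n) & 31));
}

/* AES S-box from its definition: field inverse, then the affine map. */
static uint8_t aes_sbox(uint8_t x)
{
    uint8_t inv = 0; /* the inverse of 0 is defined as 0 */

    for (int c = 1; c < 256 && x; c++) {
        if (gf_mul(x, (uint8_t)c) == 1) {
            inv = (uint8_t)c;
            break;
        }
    }
    return (uint8_t)(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^
                     rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
}

/* Entry x of forward table k (0..3), built from its own byte layout. */
static uint32_t ft_entry(int k, uint8_t x)
{
    uint8_t s = aes_sbox(x);
    uint8_t col[4] = { gf_mul(s, 2), s, s, gf_mul(s, 3) }; /* low byte first */
    uint32_t w = 0;

    for (int i = 0; i < 4; i++)
        w |= (uint32_t)col[i] << (8 * ((i + k) & 3));
    return w;
}

/* Returns 1 if every table k equals table 0 rotated left by 8*k bits. */
static int ft_tables_are_rotations(void)
{
    for (int k = 0; k < 4; k++)
        for (int x = 0; x < 256; x++)
            if (ft_entry(k, (uint8_t)x) != rol32(ft_entry(0, (uint8_t)x), 8 * k))
                return 0;
    return 1;
}
```

If `ft_tables_are_rotations()` returns 1, an implementation on a CPU with cheap rotates can keep one 1 KB forward table and derive the other three lookups on the fly. Under the same assumption, a second table covers the inverse direction.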