Message ID | 20210107124128.19791-1-ardb@kernel.org |
---|---|
State | Superseded |
Headers | show |
Series | crypto - shash: reduce minimum alignment of shash_desc structure | expand |
On Thu, 7 Jan 2021 at 20:02, Eric Biggers <ebiggers@kernel.org> wrote: > > On Thu, Jan 07, 2021 at 01:41:28PM +0100, Ard Biesheuvel wrote: > > Unlike many other structure types defined in the crypto API, the > > 'shash_desc' structure is permitted to live on the stack, which > > implies its contents may not be accessed by DMA masters. (This is > > due to the fact that the stack may be located in the vmalloc area, > > which requires a different virtual-to-physical translation than the > > one implemented by the DMA subsystem) > > > > Our definition of CRYPTO_MINALIGN_ATTR is based on ARCH_KMALLOC_MINALIGN, > > which may take DMA constraints into account on architectures that support > > non-cache coherent DMA such as ARM and arm64. In this case, the value is > > chosen to reflect the largest cacheline size in the system, in order to > > ensure that explicit cache maintenance as required by non-coherent DMA > > masters does not affect adjacent, unrelated slab allocations. On arm64, > > this value is currently set at 128 bytes. > > > > This means that applying CRYPTO_MINALIGN_ATTR to struct shash_desc is both > > unnecessary (as it is never used for DMA), and undesirable, given that it > > wastes stack space (on arm64, performing the alignment costs 112 bytes in > > the worst case, and the hole between the 'tfm' and '__ctx' members takes > > up another 120 bytes, resulting in an increased stack footprint of up to > > 232 bytes.) So instead, let's switch to the minimum SLAB alignment, which > > does not take DMA constraints into account. > > > > Note that this is a no-op for x86. > > > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > > --- > > include/crypto/hash.h | 8 ++++---- > > 1 file changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/include/crypto/hash.h b/include/crypto/hash.h > > index af2ff31ff619..13f8a6a54ca8 100644 > > --- a/include/crypto/hash.h > > +++ b/include/crypto/hash.h > > @@ -149,7 +149,7 @@ struct ahash_alg { > > > > struct shash_desc { > > struct crypto_shash *tfm; > > - void *__ctx[] CRYPTO_MINALIGN_ATTR; > > + void *__ctx[] __aligned(ARCH_SLAB_MINALIGN); > > }; > > > > #define HASH_MAX_DIGESTSIZE 64 > > @@ -162,9 +162,9 @@ struct shash_desc { > > > > #define HASH_MAX_STATESIZE 512 > > > > -#define SHASH_DESC_ON_STACK(shash, ctx) \ > > - char __##shash##_desc[sizeof(struct shash_desc) + \ > > - HASH_MAX_DESCSIZE] CRYPTO_MINALIGN_ATTR; \ > > +#define SHASH_DESC_ON_STACK(shash, ctx) \ > > + char __##shash##_desc[sizeof(struct shash_desc) + HASH_MAX_DESCSIZE] \ > > + __aligned(__alignof__(struct shash_desc)); \ > > struct shash_desc *shash = (struct shash_desc *)__##shash##_desc > > Looks good to me, but it would be helpful if the comment above the definition of > CRYPTO_MINALIGN in include/linux/crypto.h was updated. > I'd be inclined to update CRYPTO_MINALIGN altogether, given that there should be very few cases where this actually matter (we've had to fix some non-coherent DMA issues in the past, but in the general case, all buffers that are passed to devices for DMA should be described via scatterlists, and I don't think we permit pointing the scatterlist into request structures)
On Fri, Jan 08, 2021 at 09:36:23AM +0100, Ard Biesheuvel wrote: > > scatterlists, and I don't think we permit pointing the scatterlist > into request structures) Not only do we allow that, we do that in lots of places. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Fri, 8 Jan 2021 at 10:23, Herbert Xu <herbert@gondor.apana.org.au> wrote: > > On Fri, Jan 08, 2021 at 09:36:23AM +0100, Ard Biesheuvel wrote: > > > > scatterlists, and I don't think we permit pointing the scatterlist > > into request structures) > > Not only do we allow that, we do that in lots of places. > Fair enough. So changing CRYPTO_MINALIGN in general may trigger DMA related bugs in non-coherent devices.
On Fri, Jan 8, 2021 at 10:22 AM Herbert Xu <herbert@gondor.apana.org.au> wrote: > > On Fri, Jan 08, 2021 at 09:36:23AM +0100, Ard Biesheuvel wrote: > > > > scatterlists, and I don't think we permit pointing the scatterlist > > into request structures) > > Not only do we allow that, we do that in lots of places. How does this work for kernels with CONFIG_VMAP_STACK? I remember some other subsystems (usb, hid) adding workarounds for that, but I don't see those in drivers/crypto Arnd
On Fri, Jan 08, 2021 at 11:42:53AM +0100, Arnd Bergmann wrote: > > How does this work for kernels with CONFIG_VMAP_STACK? > I remember some other subsystems (usb, hid) adding workarounds > for that, but I don't see those in drivers/crypto I'm referring to the situation in general and not the subject of this thread. For shash_desc of course it doesn't happen since it sync only. Cheers, -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Fri, Jan 8, 2021 at 11:44 AM Herbert Xu <herbert@gondor.apana.org.au> wrote: > > On Fri, Jan 08, 2021 at 11:42:53AM +0100, Arnd Bergmann wrote: > > > > How does this work for kernels with CONFIG_VMAP_STACK? > > I remember some other subsystems (usb, hid) adding workarounds > > for that, but I don't see those in drivers/crypto > > I'm referring to the situation in general and not the subject of > this thread. > > For shash_desc of course it doesn't happen since it sync only. Ah, got it. Arnd
On Fri, 8 Jan 2021 at 11:59, Arnd Bergmann <arnd@kernel.org> wrote: > > On Fri, Jan 8, 2021 at 11:44 AM Herbert Xu <herbert@gondor.apana.org.au> wrote: > > > > On Fri, Jan 08, 2021 at 11:42:53AM +0100, Arnd Bergmann wrote: > > > > > > How does this work for kernels with CONFIG_VMAP_STACK? > > > I remember some other subsystems (usb, hid) adding workarounds > > > for that, but I don't see those in drivers/crypto > > > > I'm referring to the situation in general and not the subject of > > this thread. > > > > For shash_desc of course it doesn't happen since it sync only. > > Ah, got it. > I can fold in the following --- a/include/linux/crypto.h +++ b/include/linux/crypto.h @@ -151,9 +151,12 @@ * The macro CRYPTO_MINALIGN_ATTR (along with the void * type in the actual * declaration) is used to ensure that the crypto_tfm context structure is * aligned correctly for the given architecture so that there are no alignment - * faults for C data types. In particular, this is required on platforms such - * as arm where pointers are 32-bit aligned but there are data types such as - * u64 which require 64-bit alignment. + * faults for C data types. On architectures that support non-cache coherent + * DMA, such as ARM or arm64, it also takes into account the minimal alignment + * that is required to ensure that the context struct member does not share any + * cachelines with the rest of the struct. This is needed to ensure that cache + * maintenance for non-coherent DMA does not affect data that may be accessed + * by the CPU concurrently. */ #define CRYPTO_MINALIGN ARCH_KMALLOC_MINALIGN (or send it as a separate patch) Note that the comment about 64-bit alignment for 64-bit types on ARM is incorrect.
diff --git a/include/crypto/hash.h b/include/crypto/hash.h index af2ff31ff619..13f8a6a54ca8 100644 --- a/include/crypto/hash.h +++ b/include/crypto/hash.h @@ -149,7 +149,7 @@ struct ahash_alg { struct shash_desc { struct crypto_shash *tfm; - void *__ctx[] CRYPTO_MINALIGN_ATTR; + void *__ctx[] __aligned(ARCH_SLAB_MINALIGN); }; #define HASH_MAX_DIGESTSIZE 64 @@ -162,9 +162,9 @@ struct shash_desc { #define HASH_MAX_STATESIZE 512 -#define SHASH_DESC_ON_STACK(shash, ctx) \ - char __##shash##_desc[sizeof(struct shash_desc) + \ - HASH_MAX_DESCSIZE] CRYPTO_MINALIGN_ATTR; \ +#define SHASH_DESC_ON_STACK(shash, ctx) \ + char __##shash##_desc[sizeof(struct shash_desc) + HASH_MAX_DESCSIZE] \ + __aligned(__alignof__(struct shash_desc)); \ struct shash_desc *shash = (struct shash_desc *)__##shash##_desc /**
Unlike many other structure types defined in the crypto API, the 'shash_desc' structure is permitted to live on the stack, which implies its contents may not be accessed by DMA masters. (This is due to the fact that the stack may be located in the vmalloc area, which requires a different virtual-to-physical translation than the one implemented by the DMA subsystem) Our definition of CRYPTO_MINALIGN_ATTR is based on ARCH_KMALLOC_MINALIGN, which may take DMA constraints into account on architectures that support non-cache coherent DMA such as ARM and arm64. In this case, the value is chosen to reflect the largest cacheline size in the system, in order to ensure that explicit cache maintenance as required by non-coherent DMA masters does not affect adjacent, unrelated slab allocations. On arm64, this value is currently set at 128 bytes. This means that applying CRYPTO_MINALIGN_ATTR to struct shash_desc is both unnecessary (as it is never used for DMA), and undesirable, given that it wastes stack space (on arm64, performing the alignment costs 112 bytes in the worst case, and the hole between the 'tfm' and '__ctx' members takes up another 120 bytes, resulting in an increased stack footprint of up to 232 bytes.) So instead, let's switch to the minimum SLAB alignment, which does not take DMA constraints into account. Note that this is a no-op for x86. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> --- include/crypto/hash.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)