Message ID | 20210203113626.220151-1-ardb@kernel.org |
---|---|
Headers | show |
Series | arm64: rework NEON yielding to avoid scheduling from asm code | expand |
On Wed, Feb 03, 2021 at 09:31:31PM +0000, Will Deacon wrote: > On Wed, 3 Feb 2021 12:36:17 +0100, Ard Biesheuvel wrote: > > Given how kernel mode NEON code disables preemption (to ensure that the > > FP/SIMD register state is protected without having to context switch it), > > we need to take care not to let those algorithms operate on unbounded > > input data, or we may end up with excessive scheduling blackouts on > > CONFIG_PREEMPT kernels. > > > > This is currently handled by the cond_yield_neon macros, which check the > > preempt count and the TIF_NEED_RESCHED flag from assembler code, and call > > into kernel_neon_end()+kernel_neon_begin(), triggering a reschedule. > > This works as expected, but is a bit messy, given how much of the state > > preserve/restore code in the algorithm needs to be duplicated, as well as > > causing the need to manage the stack frame explicitly. All of this is better > > handled by the compiler, especially now that we have enabled features such > > as the shadow call stack and BTI, and are working to improve call stack > > validation. > > > > [...] > > Applied first patch only to arm64 (for-next/crypto), thanks! Oops, looks like I typo'd the external branch (for-next/crypo). No offense intended! I'll rename it now; SHAs will stay the same. Will
On Thu, Feb 04, 2021 at 10:10:26PM +1100, Herbert Xu wrote: > On Thu, Feb 04, 2021 at 09:29:16AM +0100, Ard Biesheuvel wrote: > > > > Half of the patches in this series conflict with > > > > 0df07d8117c3 crypto: arm64/sha - add missing module aliases > > > > in the cryptodev tree, so that won't work. > > Fair enough. I'll take the patches. Cheers, Herbert. Please just leave the final patch ("arm64: assembler: remove conditional NEON yield macro"), as we can come back to that for 5.13. Will
On Wed, Feb 03, 2021 at 12:36:17PM +0100, Ard Biesheuvel wrote: > Given how kernel mode NEON code disables preemption (to ensure that the > FP/SIMD register state is protected without having to context switch it), > we need to take care not to let those algorithms operate on unbounded > input data, or we may end up with excessive scheduling blackouts on > CONFIG_PREEMPT kernels. > > This is currently handled by the cond_yield_neon macros, which check the > preempt count and the TIF_NEED_RESCHED flag from assembler code, and call > into kernel_neon_end()+kernel_neon_begin(), triggering a reschedule. > This works as expected, but is a bit messy, given how much of the state > preserve/restore code in the algorithm needs to be duplicated, as well as > causing the need to manage the stack frame explicitly. All of this is better > handled by the compiler, especially now that we have enabled features such > as the shadow call stack and BTI, and are working to improve call stack > validation. > > In some cases, yielding is not necessary at all: algoritms that implement > skciphers and use the skcipher walk API will be invoked at page granularity, > which is granular enough for our purpose. > > In other cases, it is better to simply return early from the assembler > routine if a reschedule is pending, and let the C code handle with this, by > retrying the call until it completes. This removes any voluntary schedule() > calls from the call stack, making the code much easier to reason about in > the context of stack validation, rcu_tasks synchronization, etc. > > Practical note: assuming there are no objections to these changes, it may > be the most convenient to take patch #1 into the arm64 tree for v5.12, > and postpone the rest for merging via the crypto tree. (Note that this > series was created against the cryptodev tree, and so the arm64 maintainers > are also welcome to take the whole set if it applies cleanly to the arm64 > tree) > > Will: if you stick #1 on a separate branch, please base it on v5.11-rc1 > > Changes since v1: > - use sub+cbz instead of cmp+b.eq to avoid clobbering the flags in cond_yield > (patch #1) > > Cc: Dave Martin <dave.martin@arm.com> > Cc: Eric Biggers <ebiggers@google.com> > > Ard Biesheuvel (9): > arm64: assembler: add cond_yield macro > crypto: arm64/sha1-ce - simplify NEON yield > crypto: arm64/sha2-ce - simplify NEON yield > crypto: arm64/sha3-ce - simplify NEON yield > crypto: arm64/sha512-ce - simplify NEON yield > crypto: arm64/aes-neonbs - remove NEON yield calls > crypto: arm64/aes-ce-mac - simplify NEON yield > crypto: arm64/crc-t10dif - move NEON yield to C code > arm64: assembler: remove conditional NEON yield macros > > arch/arm64/crypto/aes-glue.c | 21 +++-- > arch/arm64/crypto/aes-modes.S | 52 +++++-------- > arch/arm64/crypto/aes-neonbs-core.S | 8 +- > arch/arm64/crypto/crct10dif-ce-core.S | 43 +++-------- > arch/arm64/crypto/crct10dif-ce-glue.c | 30 ++++++-- > arch/arm64/crypto/sha1-ce-core.S | 47 ++++-------- > arch/arm64/crypto/sha1-ce-glue.c | 22 +++--- > arch/arm64/crypto/sha2-ce-core.S | 38 ++++----- > arch/arm64/crypto/sha2-ce-glue.c | 22 +++--- > arch/arm64/crypto/sha3-ce-core.S | 81 ++++++++------------ > arch/arm64/crypto/sha3-ce-glue.c | 14 ++-- > arch/arm64/crypto/sha512-ce-core.S | 29 ++----- > arch/arm64/crypto/sha512-ce-glue.c | 53 +++++++------ > arch/arm64/include/asm/assembler.h | 78 +++---------------- > 14 files changed, 209 insertions(+), 329 deletions(-) Patches 2-8 applied. Thanks. -- Email: Herbert Xu <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt