Message ID | 20241017000051.228294-1-ebiggers@kernel.org |
---|---|
Series | AEGIS x86 assembly tuning |

Eric Biggers <ebiggers@kernel.org> wrote:
> This series cleans up the AES-NI optimized implementation of AEGIS-128.
>
> Performance is improved by 1-5% depending on the input lengths. Binary
> code size is reduced by about 20% (measuring glue + assembly combined),
> and source code length is reduced by about 150 lines.
>
> The first patch also fixes a bug which could theoretically cause
> incorrect behavior but was seemingly not being encountered in practice.
>
> Note: future optimizations for AEGIS-128 could involve adding AVX512 /
> AVX10 optimized assembly code. However, unfortunately due to the way
> that AEGIS-128 is specified, its level of parallelism is limited, and it
> can't really take advantage of vector lengths greater than 128 bits.
> So, probably this would provide only another modest improvement, mostly
> coming from being able to use the ternary logic instructions.
>
> Changed in v2:
>   - Put assoclen and cryptlen in the correct order in the prototype of
>     aegis128_aesni_final().
>   - Expanded commit message of "eliminate some indirect calls"
>   - Added Ondrej's Reviewed-by.
>
> Eric Biggers (10):
>   crypto: x86/aegis128 - access 32-bit arguments as 32-bit
>   crypto: x86/aegis128 - remove no-op init and exit functions
>   crypto: x86/aegis128 - eliminate some indirect calls
>   crypto: x86/aegis128 - don't bother with special code for aligned data
>   crypto: x86/aegis128 - optimize length block preparation using SSE4.1
>   crypto: x86/aegis128 - improve assembly function prototypes
>   crypto: x86/aegis128 - optimize partial block handling using SSE4.1
>   crypto: x86/aegis128 - take advantage of block-aligned len
>   crypto: x86/aegis128 - remove unneeded FRAME_BEGIN and FRAME_END
>   crypto: x86/aegis128 - remove unneeded RETs
>
>  arch/x86/crypto/Kconfig               |   4 +-
>  arch/x86/crypto/aegis128-aesni-asm.S  | 532 ++++++++++----------------
>  arch/x86/crypto/aegis128-aesni-glue.c | 145 ++++---
>  3 files changed, 261 insertions(+), 420 deletions(-)
>
> base-commit: 5c20772738e1d1d7bec41664eb9d61497e53c10e

All applied. Thanks.
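
For context on the "ternary logic instructions" mentioned in the cover letter: AVX512 can evaluate any three-input bitwise function in a single instruction, selected by an 8-bit truth table. The sketch below is a portable C model of that idea, not code from this series. The names `ternlog32`, `TERN_XOR3`, `TERN_A_XOR_B_AND_C` and `aegis_out_lane` are hypothetical, and applying it to the AEGIS-128 output expression S1 ^ S4 ^ (S2 & S3) ^ message is only an assumption about where such an instruction might help.

```c
#include <stdint.h>

/*
 * Software model of a three-input bitwise "ternary logic" operation.
 * At each bit position the bits of a, b and c form a 3-bit index
 * (a is the most significant bit), and the result bit is taken from
 * that position of the 8-bit truth table imm8.
 */
uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8)
{
	uint32_t out = 0;

	for (int bit = 0; bit < 32; bit++) {
		unsigned int idx = (((a >> bit) & 1) << 2) |
				   (((b >> bit) & 1) << 1) |
				   ((c >> bit) & 1);

		out |= (uint32_t)((imm8 >> idx) & 1) << bit;
	}
	return out;
}

/* Truth tables for the two functions used below (hypothetical names). */
enum {
	TERN_XOR3          = 0x96, /* a ^ b ^ c   */
	TERN_A_XOR_B_AND_C = 0x78, /* a ^ (b & c) */
};

/*
 * One 32-bit lane of s1 ^ s4 ^ (s2 & s3) ^ msg, computed with two
 * three-input operations instead of one AND plus three XORs.
 */
uint32_t aegis_out_lane(uint32_t s1, uint32_t s2, uint32_t s3,
			uint32_t s4, uint32_t msg)
{
	uint32_t t = ternlog32(s1, s2, s3, TERN_A_XOR_B_AND_C);

	return ternlog32(t, s4, msg, TERN_XOR3);
}
```

The saving is in instruction count per block, not in wider vectors, which fits the cover letter's expectation of only a modest improvement.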