Message ID | 20240406002610.37202-3-ebiggers@kernel.org |
---|---|
State | New |
Series | crypto: x86 - add missing vzeroupper instructions |

On Fri, 2024-04-05 at 20:26 -0400, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> Since sha256_transform_rorx() uses ymm registers, execute vzeroupper
> before returning from it. This is necessary to avoid reducing the
> performance of SSE code.
>
> Fixes: d34a460092d8 ("crypto: sha256 - Optimized sha256 x86_64 routine using AVX2's RORX instructions")
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
>  arch/x86/crypto/sha256-avx2-asm.S | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S
> index 9918212faf91..0ffb072be956 100644
> --- a/arch/x86/crypto/sha256-avx2-asm.S
> +++ b/arch/x86/crypto/sha256-avx2-asm.S
> @@ -714,10 +714,11 @@ SYM_TYPED_FUNC_START(sha256_transform_rorx)
>  	popq %r15
>  	popq %r14
>  	popq %r13
>  	popq %r12
>  	popq %rbx
> +	vzeroupper
>  	RET
>  SYM_FUNC_END(sha256_transform_rorx)
>
>  .section .rodata.cst512.K256, "aM", @progbits, 512
>  .align 64

Acked-by: Tim Chen <tim.c.chen@linux.intel.com>

diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S
index 9918212faf91..0ffb072be956 100644
--- a/arch/x86/crypto/sha256-avx2-asm.S
+++ b/arch/x86/crypto/sha256-avx2-asm.S
@@ -714,10 +714,11 @@ SYM_TYPED_FUNC_START(sha256_transform_rorx)
 	popq %r15
 	popq %r14
 	popq %r13
 	popq %r12
 	popq %rbx
+	vzeroupper
 	RET
 SYM_FUNC_END(sha256_transform_rorx)
 
 .section .rodata.cst512.K256, "aM", @progbits, 512
 .align 64
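
For readers less familiar with the pattern the commit message describes, here is a rough userspace sketch in C of "use ymm registers, then execute vzeroupper before returning". It is not taken from the kernel or from this patch: `sum8_avx2()` and `sum8()` are hypothetical helpers invented for illustration, and the sketch assumes GCC or Clang on x86-64 (for the `target` attribute and `__builtin_cpu_supports()`), with a scalar fallback elsewhere. `_mm256_zeroupper()` is the intrinsic that maps to the `vzeroupper` instruction.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* Hypothetical helper (not from the patch): sums eight 32-bit values
 * using a 256-bit ymm register. */
__attribute__((target("avx2")))
static uint32_t sum8_avx2(const uint32_t *v)
{
	__m256i x = _mm256_loadu_si256((const __m256i *)v);

	/* Fold the upper 128 bits onto the lower 128 bits. */
	__m128i s = _mm_add_epi32(_mm256_castsi256_si128(x),
				  _mm256_extracti128_si256(x, 1));
	/* Swap the 64-bit halves and add, then swap adjacent lanes and add. */
	s = _mm_add_epi32(s, _mm_shuffle_epi32(s, 0x4E));
	s = _mm_add_epi32(s, _mm_shuffle_epi32(s, 0xB1));

	uint32_t r = (uint32_t)_mm_cvtsi128_si32(s);

	/* Clear the upper halves of the ymm registers before returning,
	 * so that SSE code running afterwards is not slowed down. */
	_mm256_zeroupper();
	return r;
}
#endif

/* Hypothetical dispatcher: uses the AVX2 path when the CPU supports it,
 * otherwise a plain scalar loop. */
uint32_t sum8(const uint32_t *v)
{
#if defined(__x86_64__) && defined(__GNUC__)
	if (__builtin_cpu_supports("avx2"))
		return sum8_avx2(v);
#endif
	uint32_t r = 0;
	for (size_t i = 0; i < 8; i++)
		r += v[i];
	return r;
}
```

As far as I know, compilers usually emit vzeroupper on their own at the end of AVX functions, so the explicit intrinsic above is mostly illustrative. Hand-written assembly such as sha256_transform_rorx() gets no such help, which is why the patch adds the instruction right before RET.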