From patchwork Mon Jan 23 14:05:17 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92214
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 01/10] crypto: arm64/aes-neon-bs - honour iv_out requirement in CTR mode
Date: Mon, 23 Jan 2017 14:05:17 +0000
Message-Id: <1485180326-25612-2-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

Update the new bitsliced NEON AES implementation in CTR mode to return
the next IV back to the skcipher API client. This is necessary for
chaining to work correctly. Note that this is only done if the request
is a round multiple of the block size, since otherwise, chaining is
impossible anyway.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-neonbs-core.S | 25 +++++++++++++-------
 1 file changed, 16 insertions(+), 9 deletions(-)

--
2.7.4

diff --git a/arch/arm64/crypto/aes-neonbs-core.S b/arch/arm64/crypto/aes-neonbs-core.S
index 8d0cdaa2768d..2ada12dd768e 100644
--- a/arch/arm64/crypto/aes-neonbs-core.S
+++ b/arch/arm64/crypto/aes-neonbs-core.S
@@ -874,12 +874,19 @@ CPU_LE(	rev		x8, x8		)
 	csel		x4, x4, xzr, pl
 	csel		x9, x9, xzr, le

+	tbnz		x9, #1, 0f
 	next_ctr	v1
+	tbnz		x9, #2, 0f
 	next_ctr	v2
+	tbnz		x9, #3, 0f
 	next_ctr	v3
+	tbnz		x9, #4, 0f
 	next_ctr	v4
+	tbnz		x9, #5, 0f
 	next_ctr	v5
+	tbnz		x9, #6, 0f
 	next_ctr	v6
+	tbnz		x9, #7, 0f
 	next_ctr	v7

 0:	mov		bskey, x2
@@ -928,11 +935,11 @@ CPU_LE(	rev		x8, x8		)
 	eor		v5.16b, v5.16b, v15.16b
 	st1		{v5.16b}, [x0], #16

-	next_ctr	v0
+8:	next_ctr	v0
 	cbnz		x4, 99b

 0:	st1		{v0.16b}, [x5]
-8:	ldp		x29, x30, [sp], #16
+9:	ldp		x29, x30, [sp], #16
 	ret

 	/*
@@ -941,23 +948,23 @@
 	 */
 1:	cbz		x6, 8b
 	st1		{v1.16b}, [x5]
-	b		8b
+	b		9b
 2:	cbz		x6, 8b
 	st1		{v4.16b}, [x5]
-	b		8b
+	b		9b
 3:	cbz		x6, 8b
 	st1		{v6.16b}, [x5]
-	b		8b
+	b		9b
 4:	cbz		x6, 8b
 	st1		{v3.16b}, [x5]
-	b		8b
+	b		9b
 5:	cbz		x6, 8b
 	st1		{v7.16b}, [x5]
-	b		8b
+	b		9b
 6:	cbz		x6, 8b
 	st1		{v2.16b}, [x5]
-	b		8b
+	b		9b
 7:	cbz		x6, 8b
 	st1		{v5.16b}, [x5]
-	b		8b
+	b		9b
 ENDPROC(aesbs_ctr_encrypt)
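The chaining requirement is easiest to see outside the kernel. The
following self-contained C sketch (illustration only: a toy block
function stands in for AES, and nothing here is kernel code) shows why
splitting a CTR stream across two calls only works when each call
leaves the incremented counter behind in iv, which is exactly what the
iv_out convention asks of the assembly above:

    #include <stdint.h>
    #include <stddef.h>

    /* Toy single-block "cipher" standing in for AES (not secure). */
    static void toy_block(uint8_t out[16], const uint8_t in[16],
                          const uint8_t key[16])
    {
            for (int i = 0; i < 16; i++)
                    out[i] = in[i] ^ key[i] ^ (uint8_t)(i * 37);
    }

    /* Increment a 128-bit big-endian counter, as CTR mode requires. */
    static void ctr_inc(uint8_t ctr[16])
    {
            for (int i = 15; i >= 0; i--)
                    if (++ctr[i])
                            break;
    }

    /*
     * CTR over whole blocks. Like the patched code, it leaves the
     * *next* counter value in iv, so a second call reusing the same
     * iv buffer continues the keystream instead of repeating it.
     */
    static void ctr_crypt(uint8_t *dst, const uint8_t *src,
                          size_t nblocks, const uint8_t key[16],
                          uint8_t iv[16])
    {
            uint8_t ks[16];

            while (nblocks--) {
                    toy_block(ks, iv, key);
                    for (int i = 0; i < 16; i++)
                            dst[i] = src[i] ^ ks[i];
                    ctr_inc(iv);    /* the "iv_out" step */
                    dst += 16;
                    src += 16;
            }
    }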
From patchwork Mon Jan 23 14:05:18 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92215
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 02/10] crypto: arm/aes-ce - remove cra_alignmask
Date: Mon, 23 Jan 2017 14:05:18 +0000
Message-Id: <1485180326-25612-3-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

Remove the unnecessary alignmask: it is much more efficient to deal
with the misalignment in the core algorithm than to rely on the crypto
API to copy the data to a suitably aligned buffer.
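For context on what is being removed: a non-zero .cra_alignmask makes
the crypto API guarantee that alignment to the driver, copying requests
through aligned bounce buffers when callers pass misaligned data.
Clearing the mask shifts that burden into the cipher itself: every load
and store must tolerate arbitrary alignment. NEON's vld1.8/vst1.8
already do once the ":64" alignment hints are dropped (the bulk of this
patch), and in C the portable idiom is a small memcpy(), as in this
generic sketch (illustrative only, not part of the patch; the kernel's
get_unaligned helpers boil down to the same thing):

    #include <stdint.h>
    #include <string.h>

    /* Load a 32-bit word from a pointer of unknown alignment. */
    static inline uint32_t load_u32_unaligned(const void *p)
    {
            uint32_t v;

            memcpy(&v, p, sizeof(v));   /* no alignment assumed */
            return v;
    }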
Signed-off-by: Ard Biesheuvel
---
 arch/arm/crypto/aes-ce-core.S | 84 ++++++++++----------
 arch/arm/crypto/aes-ce-glue.c | 15 ++--
 2 files changed, 47 insertions(+), 52 deletions(-)

--
2.7.4

diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S
index 987aa632c9f0..ba8e6a32fdc9 100644
--- a/arch/arm/crypto/aes-ce-core.S
+++ b/arch/arm/crypto/aes-ce-core.S
@@ -169,19 +169,19 @@ ENTRY(ce_aes_ecb_encrypt)
 .Lecbencloop3x:
 	subs	r4, r4, #3
 	bmi	.Lecbenc1x
-	vld1.8	{q0-q1}, [r1, :64]!
-	vld1.8	{q2}, [r1, :64]!
+	vld1.8	{q0-q1}, [r1]!
+	vld1.8	{q2}, [r1]!
 	bl	aes_encrypt_3x
-	vst1.8	{q0-q1}, [r0, :64]!
-	vst1.8	{q2}, [r0, :64]!
+	vst1.8	{q0-q1}, [r0]!
+	vst1.8	{q2}, [r0]!
 	b	.Lecbencloop3x
 .Lecbenc1x:
 	adds	r4, r4, #3
 	beq	.Lecbencout
 .Lecbencloop:
-	vld1.8	{q0}, [r1, :64]!
+	vld1.8	{q0}, [r1]!
 	bl	aes_encrypt
-	vst1.8	{q0}, [r0, :64]!
+	vst1.8	{q0}, [r0]!
 	subs	r4, r4, #1
 	bne	.Lecbencloop
 .Lecbencout:
@@ -195,19 +195,19 @@ ENTRY(ce_aes_ecb_decrypt)
 .Lecbdecloop3x:
 	subs	r4, r4, #3
 	bmi	.Lecbdec1x
-	vld1.8	{q0-q1}, [r1, :64]!
-	vld1.8	{q2}, [r1, :64]!
+	vld1.8	{q0-q1}, [r1]!
+	vld1.8	{q2}, [r1]!
 	bl	aes_decrypt_3x
-	vst1.8	{q0-q1}, [r0, :64]!
-	vst1.8	{q2}, [r0, :64]!
+	vst1.8	{q0-q1}, [r0]!
+	vst1.8	{q2}, [r0]!
 	b	.Lecbdecloop3x
 .Lecbdec1x:
 	adds	r4, r4, #3
 	beq	.Lecbdecout
 .Lecbdecloop:
-	vld1.8	{q0}, [r1, :64]!
+	vld1.8	{q0}, [r1]!
 	bl	aes_decrypt
-	vst1.8	{q0}, [r0, :64]!
+	vst1.8	{q0}, [r0]!
 	subs	r4, r4, #1
 	bne	.Lecbdecloop
 .Lecbdecout:
@@ -226,10 +226,10 @@ ENTRY(ce_aes_cbc_encrypt)
 	vld1.8	{q0}, [r5]
 	prepare_key	r2, r3
 .Lcbcencloop:
-	vld1.8	{q1}, [r1, :64]!	@ get next pt block
+	vld1.8	{q1}, [r1]!		@ get next pt block
 	veor	q0, q0, q1		@ ..and xor with iv
 	bl	aes_encrypt
-	vst1.8	{q0}, [r0, :64]!
+	vst1.8	{q0}, [r0]!
 	subs	r4, r4, #1
 	bne	.Lcbcencloop
 	vst1.8	{q0}, [r5]
@@ -244,8 +244,8 @@ ENTRY(ce_aes_cbc_decrypt)
 .Lcbcdecloop3x:
 	subs	r4, r4, #3
 	bmi	.Lcbcdec1x
-	vld1.8	{q0-q1}, [r1, :64]!
-	vld1.8	{q2}, [r1, :64]!
+	vld1.8	{q0-q1}, [r1]!
+	vld1.8	{q2}, [r1]!
 	vmov	q3, q0
 	vmov	q4, q1
 	vmov	q5, q2
@@ -254,19 +254,19 @@ ENTRY(ce_aes_cbc_decrypt)
 	veor	q1, q1, q3
 	veor	q2, q2, q4
 	vmov	q6, q5
-	vst1.8	{q0-q1}, [r0, :64]!
-	vst1.8	{q2}, [r0, :64]!
+	vst1.8	{q0-q1}, [r0]!
+	vst1.8	{q2}, [r0]!
 	b	.Lcbcdecloop3x
 .Lcbcdec1x:
 	adds	r4, r4, #3
 	beq	.Lcbcdecout
 	vmov	q15, q14		@ preserve last round key
.Lcbcdecloop:
-	vld1.8	{q0}, [r1, :64]!	@ get next ct block
+	vld1.8	{q0}, [r1]!		@ get next ct block
 	veor	q14, q15, q6		@ combine prev ct with last key
 	vmov	q6, q0
 	bl	aes_decrypt
-	vst1.8	{q0}, [r0, :64]!
+	vst1.8	{q0}, [r0]!
 	subs	r4, r4, #1
 	bne	.Lcbcdecloop
 .Lcbcdecout:
@@ -300,15 +300,15 @@ ENTRY(ce_aes_ctr_encrypt)
 	rev	ip, r6
 	add	r6, r6, #1
 	vmov	s11, ip
-	vld1.8	{q3-q4}, [r1, :64]!
-	vld1.8	{q5}, [r1, :64]!
+	vld1.8	{q3-q4}, [r1]!
+	vld1.8	{q5}, [r1]!
 	bl	aes_encrypt_3x
 	veor	q0, q0, q3
 	veor	q1, q1, q4
 	veor	q2, q2, q5
 	rev	ip, r6
-	vst1.8	{q0-q1}, [r0, :64]!
-	vst1.8	{q2}, [r0, :64]!
+	vst1.8	{q0-q1}, [r0]!
+	vst1.8	{q2}, [r0]!
 	vmov	s27, ip
 	b	.Lctrloop3x
 .Lctr1x:
@@ -318,10 +318,10 @@ ENTRY(ce_aes_ctr_encrypt)
 	vmov	q0, q6
 	bl	aes_encrypt
 	subs	r4, r4, #1
-	bmi	.Lctrhalfblock		@ blocks < 0 means 1/2 block
-	vld1.8	{q3}, [r1, :64]!
+	bmi	.Lctrtailblock		@ blocks < 0 means tail block
+	vld1.8	{q3}, [r1]!
 	veor	q3, q0, q3
-	vst1.8	{q3}, [r0, :64]!
+	vst1.8	{q3}, [r0]!
 	adds	r6, r6, #1		@ increment BE ctr
 	rev	ip, r6
@@ -333,10 +333,8 @@ ENTRY(ce_aes_ctr_encrypt)
 	vst1.8	{q6}, [r5]
 	pop	{r4-r6, pc}

-.Lctrhalfblock:
-	vld1.8	{d1}, [r1, :64]
-	veor	d0, d0, d1
-	vst1.8	{d0}, [r0, :64]
+.Lctrtailblock:
+	vst1.8	{q0}, [r0, :64]		@ return just the key stream
 	pop	{r4-r6, pc}

 .Lctrcarry:
@@ -405,8 +403,8 @@ ENTRY(ce_aes_xts_encrypt)
 .Lxtsenc3x:
 	subs	r4, r4, #3
 	bmi	.Lxtsenc1x
-	vld1.8	{q0-q1}, [r1, :64]!	@ get 3 pt blocks
-	vld1.8	{q2}, [r1, :64]!
+	vld1.8	{q0-q1}, [r1]!		@ get 3 pt blocks
+	vld1.8	{q2}, [r1]!
 	next_tweak	q4, q3, q7, q6
 	veor	q0, q0, q3
 	next_tweak	q5, q4, q7, q6
@@ -416,8 +414,8 @@ ENTRY(ce_aes_xts_encrypt)
 	veor	q0, q0, q3
 	veor	q1, q1, q4
 	veor	q2, q2, q5
-	vst1.8	{q0-q1}, [r0, :64]!	@ write 3 ct blocks
-	vst1.8	{q2}, [r0, :64]!
+	vst1.8	{q0-q1}, [r0]!		@ write 3 ct blocks
+	vst1.8	{q2}, [r0]!
 	vmov	q3, q5
 	teq	r4, #0
 	beq	.Lxtsencout
@@ -426,11 +424,11 @@ ENTRY(ce_aes_xts_encrypt)
 	adds	r4, r4, #3
 	beq	.Lxtsencout
 .Lxtsencloop:
-	vld1.8	{q0}, [r1, :64]!
+	vld1.8	{q0}, [r1]!
 	veor	q0, q0, q3
 	bl	aes_encrypt
 	veor	q0, q0, q3
-	vst1.8	{q0}, [r0, :64]!
+	vst1.8	{q0}, [r0]!
 	subs	r4, r4, #1
 	beq	.Lxtsencout
 	next_tweak	q3, q3, q7, q6
@@ -456,8 +454,8 @@ ENTRY(ce_aes_xts_decrypt)
 .Lxtsdec3x:
 	subs	r4, r4, #3
 	bmi	.Lxtsdec1x
-	vld1.8	{q0-q1}, [r1, :64]!	@ get 3 ct blocks
-	vld1.8	{q2}, [r1, :64]!
+	vld1.8	{q0-q1}, [r1]!		@ get 3 ct blocks
+	vld1.8	{q2}, [r1]!
 	next_tweak	q4, q3, q7, q6
 	veor	q0, q0, q3
 	next_tweak	q5, q4, q7, q6
@@ -467,8 +465,8 @@ ENTRY(ce_aes_xts_decrypt)
 	veor	q0, q0, q3
 	veor	q1, q1, q4
 	veor	q2, q2, q5
-	vst1.8	{q0-q1}, [r0, :64]!	@ write 3 pt blocks
-	vst1.8	{q2}, [r0, :64]!
+	vst1.8	{q0-q1}, [r0]!		@ write 3 pt blocks
+	vst1.8	{q2}, [r0]!
 	vmov	q3, q5
 	teq	r4, #0
 	beq	.Lxtsdecout
@@ -477,12 +475,12 @@ ENTRY(ce_aes_xts_decrypt)
 	adds	r4, r4, #3
 	beq	.Lxtsdecout
 .Lxtsdecloop:
-	vld1.8	{q0}, [r1, :64]!
+	vld1.8	{q0}, [r1]!
 	veor	q0, q0, q3
 	add	ip, r2, #32		@ 3rd round key
 	bl	aes_decrypt
 	veor	q0, q0, q3
-	vst1.8	{q0}, [r0, :64]!
+	vst1.8	{q0}, [r0]!
 	subs	r4, r4, #1
 	beq	.Lxtsdecout
 	next_tweak	q3, q3, q7, q6
diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index 8857531915bf..883b84d828c5 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -278,14 +278,15 @@ static int ctr_encrypt(struct skcipher_request *req)
 		u8 *tsrc = walk.src.virt.addr;

 		/*
-		 * Minimum alignment is 8 bytes, so if nbytes is <= 8, we need
-		 * to tell aes_ctr_encrypt() to only read half a block.
+		 * Tell aes_ctr_encrypt() to process a tail block.
 		 */
-		blocks = (nbytes <= 8) ? -1 : 1;
+		blocks = -1;

-		ce_aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc,
+		ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc,
 				   num_rounds(ctx), blocks, walk.iv);
-		memcpy(tdst, tail, nbytes);
+		if (tdst != tsrc)
+			memcpy(tdst, tsrc, nbytes);
+		crypto_xor(tdst, tail, nbytes);
 		err = skcipher_walk_done(&walk, 0);
 	}
 	kernel_neon_end();
@@ -345,7 +346,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= AES_MIN_KEY_SIZE,
@@ -361,7 +361,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= AES_MIN_KEY_SIZE,
@@ -378,7 +377,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= 1,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= AES_MIN_KEY_SIZE,
@@ -396,7 +394,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_xts_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= 2 * AES_MIN_KEY_SIZE,
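The ctr_encrypt() change above reduces the tail handling to "produce
one keystream block, then XOR it over the data". A hedged C sketch of
that pattern (the keystream primitive is assumed here, standing in for
ce_aes_ctr_encrypt() called with blocks == -1, and the kernel helper
crypto_xor() is open-coded as a byte loop):

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Assumed primitive: writes the keystream block for the current
     * counter into ks[], as the patched asm does for blocks == -1. */
    void ctr_keystream_block(uint8_t ks[16], const uint8_t *key,
                             int rounds, uint8_t ctr[16]);

    /* Encrypt a final partial block of nbytes < 16, possibly in
     * place (dst == src is allowed). */
    static void ctr_tail(uint8_t *dst, const uint8_t *src,
                         size_t nbytes, const uint8_t *key,
                         int rounds, uint8_t ctr[16])
    {
            uint8_t ks[16];

            ctr_keystream_block(ks, key, rounds, ctr);
            if (dst != src)             /* not in place: copy first */
                    memcpy(dst, src, nbytes);
            for (size_t i = 0; i < nbytes; i++)
                    dst[i] ^= ks[i];    /* crypto_xor() equivalent */
    }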
From patchwork Mon Jan 23 14:05:19 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92216
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 03/10] crypto: arm/chacha20 - remove cra_alignmask
Date: Mon, 23 Jan 2017 14:05:19 +0000
Message-Id: <1485180326-25612-4-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

Remove the unnecessary alignmask: it is much more efficient to deal
with the misalignment in the core algorithm than to rely on the crypto
API to copy the data to a suitably aligned buffer.
Signed-off-by: Ard Biesheuvel
---
 arch/arm/crypto/chacha20-neon-glue.c | 1 -
 1 file changed, 1 deletion(-)

--
2.7.4

diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c
index 592f75ae4fa1..59a7be08e80c 100644
--- a/arch/arm/crypto/chacha20-neon-glue.c
+++ b/arch/arm/crypto/chacha20-neon-glue.c
@@ -94,7 +94,6 @@ static struct skcipher_alg alg = {
 	.base.cra_priority	= 300,
 	.base.cra_blocksize	= 1,
 	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
-	.base.cra_alignmask	= 1,
 	.base.cra_module	= THIS_MODULE,

 	.min_keysize		= CHACHA20_KEY_SIZE,
From patchwork Mon Jan 23 14:05:20 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92217
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 04/10] crypto: arm64/aes-ce-ccm - remove cra_alignmask
Date: Mon, 23 Jan 2017 14:05:20 +0000
Message-Id: <1485180326-25612-5-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

Remove the unnecessary alignmask: it is much more efficient to deal
with the misalignment in the core algorithm than to rely on the crypto
API to copy the data to a suitably aligned buffer.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-glue.c | 1 -
 1 file changed, 1 deletion(-)

--
2.7.4

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index cc5515dac74a..6a7dbc7c83a6 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -258,7 +258,6 @@ static struct aead_alg ccm_aes_alg = {
 		.cra_priority		= 300,
 		.cra_blocksize		= 1,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.ivsize		= AES_BLOCK_SIZE,
From patchwork Mon Jan 23 14:05:21 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92218
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 05/10] crypto: arm64/aes-blk - remove cra_alignmask
Date: Mon, 23 Jan 2017 14:05:21 +0000
Message-Id: <1485180326-25612-6-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

Remove the unnecessary alignmask: it is much more efficient to deal
with the misalignment in the core algorithm than to rely on the crypto
API to copy the data to a suitably aligned buffer.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-glue.c  | 16 ++++++----------
 arch/arm64/crypto/aes-modes.S |  8 +++-----
 2 files changed, 9 insertions(+), 15 deletions(-)

--
2.7.4

diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 5164aaf82c6a..8ee1fb7aaa4f 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -215,14 +215,15 @@ static int ctr_encrypt(struct skcipher_request *req)
 		u8 *tsrc = walk.src.virt.addr;

 		/*
-		 * Minimum alignment is 8 bytes, so if nbytes is <= 8, we need
-		 * to tell aes_ctr_encrypt() to only read half a block.
+		 * Tell aes_ctr_encrypt() to process a tail block.
 		 */
-		blocks = (nbytes <= 8) ? -1 : 1;
+		blocks = -1;

-		aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc, rounds,
+		aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, rounds,
 				blocks, walk.iv, first);
-		memcpy(tdst, tail, nbytes);
+		if (tdst != tsrc)
+			memcpy(tdst, tsrc, nbytes);
+		crypto_xor(tdst, tail, nbytes);
 		err = skcipher_walk_done(&walk, 0);
 	}
 	kernel_neon_end();
@@ -282,7 +283,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= AES_MIN_KEY_SIZE,
@@ -298,7 +298,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= AES_MIN_KEY_SIZE,
@@ -315,7 +314,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= 1,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= AES_MIN_KEY_SIZE,
@@ -332,7 +330,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_priority		= PRIO - 1,
 		.cra_blocksize		= 1,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= AES_MIN_KEY_SIZE,
@@ -350,7 +347,6 @@ static struct skcipher_alg aes_algs[] = { {
 		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_xts_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.min_keysize	= 2 * AES_MIN_KEY_SIZE,
diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index 838dad5c209f..92b982a8b112 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -337,7 +337,7 @@ AES_ENTRY(aes_ctr_encrypt)

 .Lctrcarrydone:
 	subs	w4, w4, #1
-	bmi	.Lctrhalfblock		/* blocks < 0 means 1/2 block */
+	bmi	.Lctrtailblock		/* blocks <0 means tail block */
 	ld1	{v3.16b}, [x1], #16
 	eor	v3.16b, v0.16b, v3.16b
 	st1	{v3.16b}, [x0], #16
@@ -348,10 +348,8 @@ AES_ENTRY(aes_ctr_encrypt)
 	FRAME_POP
 	ret

-.Lctrhalfblock:
-	ld1	{v3.8b}, [x1]
-	eor	v3.8b, v0.8b, v3.8b
-	st1	{v3.8b}, [x0]
+.Lctrtailblock:
+	st1	{v0.16b}, [x0]
 	FRAME_POP
 	ret
From patchwork Mon Jan 23 14:05:22 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92219
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 06/10] crypto: arm64/chacha20 - remove cra_alignmask
Date: Mon, 23 Jan 2017 14:05:22 +0000
Message-Id: <1485180326-25612-7-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

Remove the unnecessary alignmask: it is much more efficient to deal
with the misalignment in the core algorithm than to rely on the crypto
API to copy the data to a suitably aligned buffer.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/chacha20-neon-glue.c | 1 -
 1 file changed, 1 deletion(-)

--
2.7.4

diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c
index a7f2337d46cf..a7cd575ea223 100644
--- a/arch/arm64/crypto/chacha20-neon-glue.c
+++ b/arch/arm64/crypto/chacha20-neon-glue.c
@@ -93,7 +93,6 @@ static struct skcipher_alg alg = {
 	.base.cra_priority	= 300,
 	.base.cra_blocksize	= 1,
 	.base.cra_ctxsize	= sizeof(struct chacha20_ctx),
-	.base.cra_alignmask	= 1,
 	.base.cra_module	= THIS_MODULE,

 	.min_keysize		= CHACHA20_KEY_SIZE,
From patchwork Mon Jan 23 14:05:23 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92220
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 07/10] crypto: arm64/aes - avoid literals for cross-module symbol references
Date: Mon, 23 Jan 2017 14:05:23 +0000
Message-Id: <1485180326-25612-8-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

Using simple adrp/add pairs to refer to the AES lookup tables exposed
by the generic AES driver (which could be loaded far away from this
driver when KASLR is in effect) was unreliable at module load time
before commit 41c066f2c4d4 ("arm64: assembler: make adr_l work in
modules under KASLR"), which is why the AES code used literals instead.

So we can now get rid of the literals and switch to the adr_l macro.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-cipher-core.S | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

--
2.7.4

diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S
index 37590ab8121a..cd58c61e6677 100644
--- a/arch/arm64/crypto/aes-cipher-core.S
+++ b/arch/arm64/crypto/aes-cipher-core.S
@@ -89,8 +89,8 @@ CPU_BE(	rev	w8, w8		)
 	eor	w7, w7, w11
 	eor	w8, w8, w12

-	ldr	tt, =\ttab
-	ldr	lt, =\ltab
+	adr_l	tt, \ttab
+	adr_l	lt, \ltab

 	tbnz	rounds, #1, 1f

@@ -111,9 +111,6 @@ CPU_BE(	rev	w8, w8		)
 	stp	w5, w6, [out]
 	stp	w7, w8, [out, #8]
 	ret
-
-	.align	4
-	.ltorg
 	.endm

 	.align	5
From patchwork Mon Jan 23 14:05:24 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92221
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 08/10] crypto: arm64/aes - performance tweak
Date: Mon, 23 Jan 2017 14:05:24 +0000
Message-Id: <1485180326-25612-9-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

Shuffle some instructions around in the __hround macro to shave off
0.1 cycles per byte on Cortex-A57.
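What __hround computes is one AES round in the classic T-table
formulation: for each 32-bit output column, four independent byte
extractions and table lookups, folded together with rotations at the
end. In C (a generic sketch of the textbook formulation, not code
lifted from the kernel), one encryption-direction column looks like
this, which shows why the extract/lookup pairs can be regrouped for
better dual issue without changing the result:

    #include <stdint.h>

    static inline uint32_t ror32(uint32_t v, unsigned int n)
    {
            return (v >> n) | (v << (32 - n));
    }

    /* One output column of an AES encryption round, using a single
     * 256-entry 32-bit table tt[] as the asm does. The four lookups
     * are independent of one another; only the final XOR chain
     * orders them. */
    static uint32_t enc_round_column(uint32_t rk, uint32_t in0,
                                     uint32_t in1, uint32_t in2,
                                     uint32_t in3,
                                     const uint32_t tt[256])
    {
            return rk ^ tt[in0 & 0xff]
                      ^ ror32(tt[(in1 >>  8) & 0xff], 24)
                      ^ ror32(tt[(in2 >> 16) & 0xff], 16)
                      ^ ror32(tt[(in3 >> 24) & 0xff],  8);
    }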
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-cipher-core.S | 52 +++++++-------------
 1 file changed, 19 insertions(+), 33 deletions(-)

--
2.7.4

diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S
index cd58c61e6677..f2f9cc519309 100644
--- a/arch/arm64/crypto/aes-cipher-core.S
+++ b/arch/arm64/crypto/aes-cipher-core.S
@@ -20,46 +20,32 @@
 	tt	.req	x4
 	lt	.req	x2

-	.macro	__hround, out0, out1, in0, in1, in2, in3, t0, t1, enc
-	ldp	\out0, \out1, [rk], #8
-
-	ubfx	w13, \in0, #0, #8
-	ubfx	w14, \in1, #8, #8
-	ldr	w13, [tt, w13, uxtw #2]
-	ldr	w14, [tt, w14, uxtw #2]
-
+	.macro	__pair, enc, reg0, reg1, in0, in1e, in1d, shift
+	ubfx	\reg0, \in0, #\shift, #8
 	.if	\enc
-	ubfx	w17, \in1, #0, #8
-	ubfx	w18, \in2, #8, #8
+	ubfx	\reg1, \in1e, #\shift, #8
 	.else
-	ubfx	w17, \in3, #0, #8
-	ubfx	w18, \in0, #8, #8
+	ubfx	\reg1, \in1d, #\shift, #8
 	.endif
-	ldr	w17, [tt, w17, uxtw #2]
-	ldr	w18, [tt, w18, uxtw #2]
+	ldr	\reg0, [tt, \reg0, uxtw #2]
+	ldr	\reg1, [tt, \reg1, uxtw #2]
+	.endm

-	ubfx	w15, \in2, #16, #8
-	ubfx	w16, \in3, #24, #8
-	ldr	w15, [tt, w15, uxtw #2]
-	ldr	w16, [tt, w16, uxtw #2]
+	.macro	__hround, out0, out1, in0, in1, in2, in3, t0, t1, enc
+	ldp	\out0, \out1, [rk], #8

-	.if	\enc
-	ubfx	\t0, \in3, #16, #8
-	ubfx	\t1, \in0, #24, #8
-	.else
-	ubfx	\t0, \in1, #16, #8
-	ubfx	\t1, \in2, #24, #8
-	.endif
-	ldr	\t0, [tt, \t0, uxtw #2]
-	ldr	\t1, [tt, \t1, uxtw #2]
+	__pair	\enc, w13, w14, \in0, \in1, \in3, 0
+	__pair	\enc, w15, w16, \in1, \in2, \in0, 8
+	__pair	\enc, w17, w18, \in2, \in3, \in1, 16
+	__pair	\enc, \t0, \t1, \in3, \in0, \in2, 24

 	eor	\out0, \out0, w13
-	eor	\out1, \out1, w17
-	eor	\out0, \out0, w14, ror #24
-	eor	\out1, \out1, w18, ror #24
-	eor	\out0, \out0, w15, ror #16
-	eor	\out1, \out1, \t0, ror #16
-	eor	\out0, \out0, w16, ror #8
+	eor	\out1, \out1, w14
+	eor	\out0, \out0, w15, ror #24
+	eor	\out1, \out1, w16, ror #24
+	eor	\out0, \out0, w17, ror #16
+	eor	\out1, \out1, w18, ror #16
+	eor	\out0, \out0, \t0, ror #8
 	eor	\out1, \out1, \t1, ror #8
 	.endm
From patchwork Mon Jan 23 14:05:25 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92222
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 09/10] crypto: arm64/aes-neon-blk - tweak performance for low end cores
Date: Mon, 23 Jan 2017 14:05:25 +0000
Message-Id: <1485180326-25612-10-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

The non-bitsliced AES implementation using the NEON is highly
sensitive to micro-architectural details, and, as it turns out, the
Cortex-A53 on the Raspberry Pi 3 is a core that can benefit from this
code, given that its scalar AES performance is abysmal (32.9 cycles
per byte). The new bitsliced AES code manages 19.8 cycles per byte on
this core, but can only operate on 8 blocks at a time, which is not
supported by all chaining modes.
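Converting those figures to throughput makes the gap concrete (rough
arithmetic, assuming the Pi 3's Cortex-A53 runs at its nominal
1.2 GHz): 1.2e9 cycles/s / 32.9 cycles/byte is roughly 36 MB/s for the
scalar code, versus 1.2e9 / 19.8, roughly 61 MB/s, for the bitsliced
NEON code; the 22.0 cycles-per-byte figure quoted below works out to
roughly 55 MB/s.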
With a bit of tweaking, we can get the plain NEON code to run at 22.0
cycles per byte, making it useful for sequential modes like CBC
encryption. (Like the bitsliced NEON code, the plain NEON
implementation does not use any lookup tables, which makes it easy on
the D-cache and invulnerable to cache timing attacks.)

So tweak the plain NEON AES code to use tbl instructions rather than
shl/sri pairs, and to avoid the need to reload permutation vectors or
other constants from memory in every round.

To allow the ECB and CBC encrypt routines to be reused by the
bitsliced NEON code in a subsequent patch, export them from the module.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-glue.c |   2 +
 arch/arm64/crypto/aes-neon.S | 210 ++++++++++++----------
 2 files changed, 81 insertions(+), 131 deletions(-)

--
2.7.4

diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 8ee1fb7aaa4f..055bc3f61138 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -409,5 +409,7 @@ static int __init aes_init(void)
 module_cpu_feature_match(AES, aes_init);
 #else
 module_init(aes_init);
+EXPORT_SYMBOL(neon_aes_ecb_encrypt);
+EXPORT_SYMBOL(neon_aes_cbc_encrypt);
 #endif
 module_exit(aes_exit);
diff --git a/arch/arm64/crypto/aes-neon.S b/arch/arm64/crypto/aes-neon.S
index 85f07ead7c5c..0d111a6fc229 100644
--- a/arch/arm64/crypto/aes-neon.S
+++ b/arch/arm64/crypto/aes-neon.S
@@ -1,7 +1,7 @@
 /*
  * linux/arch/arm64/crypto/aes-neon.S - AES cipher for ARMv8 NEON
  *
- * Copyright (C) 2013 Linaro Ltd
+ * Copyright (C) 2013 - 2017 Linaro Ltd.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -25,9 +25,9 @@
 	/* preload the entire Sbox */
 	.macro	prepare, sbox, shiftrows, temp
 	adr	\temp, \sbox
-	movi	v12.16b, #0x40
+	movi	v12.16b, #0x1b
 	ldr	q13, \shiftrows
-	movi	v14.16b, #0x1b
+	ldr	q14, .Lror32by8
 	ld1	{v16.16b-v19.16b}, [\temp], #64
 	ld1	{v20.16b-v23.16b}, [\temp], #64
 	ld1	{v24.16b-v27.16b}, [\temp], #64
@@ -50,37 +50,32 @@

 	/* apply SubBytes transformation using the the preloaded Sbox */
 	.macro	sub_bytes, in
-	sub	v9.16b, \in\().16b, v12.16b
+	sub	v9.16b, \in\().16b, v15.16b
 	tbl	\in\().16b, {v16.16b-v19.16b}, \in\().16b
-	sub	v10.16b, v9.16b, v12.16b
+	sub	v10.16b, v9.16b, v15.16b
 	tbx	\in\().16b, {v20.16b-v23.16b}, v9.16b
-	sub	v11.16b, v10.16b, v12.16b
+	sub	v11.16b, v10.16b, v15.16b
 	tbx	\in\().16b, {v24.16b-v27.16b}, v10.16b
 	tbx	\in\().16b, {v28.16b-v31.16b}, v11.16b
 	.endm

 	/* apply MixColumns transformation */
-	.macro	mix_columns, in
-	mul_by_x	v10.16b, \in\().16b, v9.16b, v14.16b
-	rev32	v8.8h, \in\().8h
-	eor	\in\().16b, v10.16b, \in\().16b
-	shl	v9.4s, v8.4s, #24
-	shl	v11.4s, \in\().4s, #24
-	sri	v9.4s, v8.4s, #8
-	sri	v11.4s, \in\().4s, #8
-	eor	v9.16b, v9.16b, v8.16b
-	eor	v10.16b, v10.16b, v9.16b
-	eor	\in\().16b, v10.16b, v11.16b
-	.endm
-
+	.macro	mix_columns, in, enc
+	.if	\enc == 0
 	/* Inverse MixColumns: pre-multiply by { 5, 0, 4, 0 } */
-	.macro	inv_mix_columns, in
-	mul_by_x	v11.16b, \in\().16b, v10.16b, v14.16b
-	mul_by_x	v11.16b, v11.16b, v10.16b, v14.16b
-	eor	\in\().16b, \in\().16b, v11.16b
-	rev32	v11.8h, v11.8h
-	eor	\in\().16b, \in\().16b, v11.16b
-	mix_columns	\in
+	mul_by_x	v8.16b, \in\().16b, v9.16b, v12.16b
+	mul_by_x	v8.16b, v8.16b, v9.16b, v12.16b
+	eor	\in\().16b, \in\().16b, v8.16b
+	rev32	v8.8h, v8.8h
+	eor	\in\().16b, \in\().16b, v8.16b
+	.endif
+
+	mul_by_x	v9.16b, \in\().16b, v8.16b, v12.16b
+	rev32	v8.8h, \in\().8h
+	eor	v8.16b, v8.16b, v9.16b
+	eor	\in\().16b, \in\().16b, v8.16b
+	tbl	\in\().16b, {\in\().16b}, v14.16b
+	eor	\in\().16b, \in\().16b, v8.16b
 	.endm

 	.macro	do_block, enc, in, rounds, rk, rkp, i
@@ -88,16 +83,13 @@
 	add	\rkp, \rk, #16
 	mov	\i, \rounds
1111:	eor	\in\().16b, \in\().16b, v15.16b	/* ^round key */
+	movi	v15.16b, #0x40
 	tbl	\in\().16b, {\in\().16b}, v13.16b	/* ShiftRows */
 	sub_bytes	\in
-	ld1	{v15.4s}, [\rkp], #16
 	subs	\i, \i, #1
+	ld1	{v15.4s}, [\rkp], #16
 	beq	2222f
-	.if	\enc == 1
-	mix_columns	\in
-	.else
-	inv_mix_columns	\in
-	.endif
+	mix_columns	\in, \enc
 	b	1111b
2222:	eor	\in\().16b, \in\().16b, v15.16b	/* ^round key */
 	.endm
@@ -116,48 +108,48 @@
 	 */

 	.macro	sub_bytes_2x, in0, in1
-	sub	v8.16b, \in0\().16b, v12.16b
-	sub	v9.16b, \in1\().16b, v12.16b
+	sub	v8.16b, \in0\().16b, v15.16b
 	tbl	\in0\().16b, {v16.16b-v19.16b}, \in0\().16b
+	sub	v9.16b, \in1\().16b, v15.16b
 	tbl	\in1\().16b, {v16.16b-v19.16b}, \in1\().16b
-	sub	v10.16b, v8.16b, v12.16b
-	sub	v11.16b, v9.16b, v12.16b
+	sub	v10.16b, v8.16b, v15.16b
 	tbx	\in0\().16b, {v20.16b-v23.16b}, v8.16b
+	sub	v11.16b, v9.16b, v15.16b
 	tbx	\in1\().16b, {v20.16b-v23.16b}, v9.16b
-	sub	v8.16b, v10.16b, v12.16b
-	sub	v9.16b, v11.16b, v12.16b
+	sub	v8.16b, v10.16b, v15.16b
 	tbx	\in0\().16b, {v24.16b-v27.16b}, v10.16b
+	sub	v9.16b, v11.16b, v15.16b
 	tbx	\in1\().16b, {v24.16b-v27.16b}, v11.16b
 	tbx	\in0\().16b, {v28.16b-v31.16b}, v8.16b
 	tbx	\in1\().16b, {v28.16b-v31.16b}, v9.16b
 	.endm

 	.macro	sub_bytes_4x, in0, in1, in2, in3
-	sub	v8.16b, \in0\().16b, v12.16b
+	sub	v8.16b, \in0\().16b, v15.16b
 	tbl	\in0\().16b, {v16.16b-v19.16b}, \in0\().16b
-	sub	v9.16b, \in1\().16b, v12.16b
+	sub	v9.16b, \in1\().16b, v15.16b
 	tbl	\in1\().16b, {v16.16b-v19.16b}, \in1\().16b
-	sub	v10.16b, \in2\().16b, v12.16b
+	sub	v10.16b, \in2\().16b, v15.16b
 	tbl	\in2\().16b, {v16.16b-v19.16b}, \in2\().16b
-	sub	v11.16b, \in3\().16b, v12.16b
+	sub	v11.16b, \in3\().16b, v15.16b
 	tbl	\in3\().16b, {v16.16b-v19.16b}, \in3\().16b
 	tbx	\in0\().16b, {v20.16b-v23.16b}, v8.16b
 	tbx	\in1\().16b, {v20.16b-v23.16b}, v9.16b
-	sub	v8.16b, v8.16b, v12.16b
+	sub	v8.16b, v8.16b, v15.16b
 	tbx	\in2\().16b, {v20.16b-v23.16b}, v10.16b
-	sub	v9.16b, v9.16b, v12.16b
+	sub	v9.16b, v9.16b, v15.16b
 	tbx	\in3\().16b, {v20.16b-v23.16b}, v11.16b
-	sub	v10.16b, v10.16b, v12.16b
+	sub	v10.16b, v10.16b, v15.16b
 	tbx	\in0\().16b, {v24.16b-v27.16b}, v8.16b
-	sub	v11.16b, v11.16b, v12.16b
+	sub	v11.16b, v11.16b, v15.16b
 	tbx	\in1\().16b, {v24.16b-v27.16b}, v9.16b
-	sub	v8.16b, v8.16b, v12.16b
+	sub	v8.16b, v8.16b, v15.16b
 	tbx	\in2\().16b, {v24.16b-v27.16b}, v10.16b
-	sub	v9.16b, v9.16b, v12.16b
+	sub	v9.16b, v9.16b, v15.16b
 	tbx	\in3\().16b, {v24.16b-v27.16b}, v11.16b
-	sub	v10.16b, v10.16b, v12.16b
+	sub	v10.16b, v10.16b, v15.16b
 	tbx	\in0\().16b, {v28.16b-v31.16b}, v8.16b
-	sub	v11.16b, v11.16b, v12.16b
+	sub	v11.16b, v11.16b, v15.16b
 	tbx	\in1\().16b, {v28.16b-v31.16b}, v9.16b
 	tbx	\in2\().16b, {v28.16b-v31.16b}, v10.16b
 	tbx	\in3\().16b, {v28.16b-v31.16b}, v11.16b
 	.endm
@@ -174,59 +166,30 @@
 	eor	\out1\().16b, \out1\().16b, \tmp1\().16b
 	.endm

-	.macro	mix_columns_2x, in0, in1
-	mul_by_x_2x	v8, v9, \in0, \in1, v10, v11, v14
-	rev32	v10.8h, \in0\().8h
-	rev32	v11.8h, \in1\().8h
-	eor	\in0\().16b, v8.16b, \in0\().16b
-	eor	\in1\().16b, v9.16b, \in1\().16b
-	shl	v12.4s, v10.4s, #24
-	shl	v13.4s, v11.4s, #24
-	eor	v8.16b, v8.16b, v10.16b
-	sri	v12.4s, v10.4s, #8
-	shl	v10.4s, \in0\().4s, #24
-	eor	v9.16b, v9.16b, v11.16b
-	sri	v13.4s, v11.4s, #8
-	shl	v11.4s, \in1\().4s, #24
-	sri	v10.4s, \in0\().4s, #8
-	eor	\in0\().16b, v8.16b, v12.16b
-	sri	v11.4s, \in1\().4s, #8
-	eor	\in1\().16b, v9.16b, v13.16b
-	eor	\in0\().16b, v10.16b, \in0\().16b
-	eor	\in1\().16b, v11.16b, \in1\().16b
-	.endm
-
-	.macro	inv_mix_cols_2x, in0, in1
-	mul_by_x_2x	v8, v9, \in0, \in1, v10, v11, v14
-	mul_by_x_2x	v8, v9, v8, v9, v10, v11, v14
+	.macro	mix_columns_2x, in0, in1, enc
+	.if	\enc == 0
+	/* Inverse MixColumns: pre-multiply by { 5, 0, 4, 0 } */
+	mul_by_x_2x	v8, v9, \in0, \in1, v10, v11, v12
+	mul_by_x_2x	v8, v9, v8, v9, v10, v11, v12
 	eor	\in0\().16b, \in0\().16b, v8.16b
-	eor	\in1\().16b, \in1\().16b, v9.16b
 	rev32	v8.8h, v8.8h
-	rev32	v9.8h, v9.8h
-	eor	\in0\().16b, \in0\().16b, v8.16b
 	eor	\in1\().16b, \in1\().16b, v9.16b
-	mix_columns_2x	\in0, \in1
-	.endm
-
-	.macro	inv_mix_cols_4x, in0, in1, in2, in3
-	mul_by_x_2x	v8, v9, \in0, \in1, v10, v11, v14
-	mul_by_x_2x	v10, v11, \in2, \in3, v12, v13, v14
-	mul_by_x_2x	v8, v9, v8, v9, v12, v13, v14
-	mul_by_x_2x	v10, v11, v10, v11, v12, v13, v14
-	eor	\in0\().16b, \in0\().16b, v8.16b
-	eor	\in1\().16b, \in1\().16b, v9.16b
-	eor	\in2\().16b, \in2\().16b, v10.16b
-	eor	\in3\().16b, \in3\().16b, v11.16b
-	rev32	v8.8h, v8.8h
 	rev32	v9.8h, v9.8h
-	rev32	v10.8h, v10.8h
-	rev32	v11.8h, v11.8h
 	eor	\in0\().16b, \in0\().16b, v8.16b
 	eor	\in1\().16b, \in1\().16b, v9.16b
-	eor	\in2\().16b, \in2\().16b, v10.16b
-	eor	\in3\().16b, \in3\().16b, v11.16b
-	mix_columns_2x	\in0, \in1
-	mix_columns_2x	\in2, \in3
+	.endif
+
+	mul_by_x_2x	v8, v9, \in0, \in1, v10, v11, v12
+	rev32	v10.8h, \in0\().8h
+	rev32	v11.8h, \in1\().8h
+	eor	v10.16b, v10.16b, v8.16b
+	eor	v11.16b, v11.16b, v9.16b
+	eor	\in0\().16b, \in0\().16b, v10.16b
+	eor	\in1\().16b, \in1\().16b, v11.16b
+	tbl	\in0\().16b, {\in0\().16b}, v14.16b
+	tbl	\in1\().16b, {\in1\().16b}, v14.16b
+	eor	\in0\().16b, \in0\().16b, v10.16b
+	eor	\in1\().16b, \in1\().16b, v11.16b
 	.endm

 	.macro	do_block_2x, enc, in0, in1 rounds, rk, rkp, i
@@ -235,20 +198,14 @@
 	mov	\i, \rounds
1111:	eor	\in0\().16b, \in0\().16b, v15.16b	/* ^round key */
 	eor	\in1\().16b, \in1\().16b, v15.16b	/* ^round key */
-	sub_bytes_2x	\in0, \in1
+	movi	v15.16b, #0x40
 	tbl	\in0\().16b, {\in0\().16b}, v13.16b	/* ShiftRows */
 	tbl	\in1\().16b, {\in1\().16b}, v13.16b	/* ShiftRows */
-	ld1	{v15.4s}, [\rkp], #16
+	sub_bytes_2x	\in0, \in1
 	subs	\i, \i, #1
+	ld1	{v15.4s}, [\rkp], #16
 	beq	2222f
-	.if	\enc == 1
-	mix_columns_2x	\in0, \in1
-	ldr	q13, .LForward_ShiftRows
-	.else
-	inv_mix_cols_2x	\in0, \in1
-	ldr	q13, .LReverse_ShiftRows
-	.endif
-	movi	v12.16b, #0x40
+	mix_columns_2x	\in0, \in1, \enc
 	b	1111b
2222:	eor	\in0\().16b, \in0\().16b, v15.16b	/* ^round key */
 	eor	\in1\().16b, \in1\().16b, v15.16b	/* ^round key */
@@ -262,23 +219,17 @@
 	eor	\in1\().16b, \in1\().16b, v15.16b	/* ^round key */
 	eor	\in2\().16b, \in2\().16b, v15.16b	/* ^round key */
 	eor	\in3\().16b, \in3\().16b, v15.16b	/* ^round key */
-	sub_bytes_4x	\in0, \in1, \in2, \in3
+	movi	v15.16b, #0x40
 	tbl	\in0\().16b, {\in0\().16b}, v13.16b	/* ShiftRows */
 	tbl	\in1\().16b, {\in1\().16b}, v13.16b	/* ShiftRows */
 	tbl	\in2\().16b, {\in2\().16b}, v13.16b	/* ShiftRows */
 	tbl	\in3\().16b, {\in3\().16b}, v13.16b	/* ShiftRows */
-	ld1	{v15.4s}, [\rkp], #16
+	sub_bytes_4x	\in0, \in1, \in2, \in3
 	subs	\i, \i, #1
+	ld1	{v15.4s}, [\rkp], #16
 	beq	2222f
-	.if	\enc == 1
-	mix_columns_2x	\in0, \in1
-	mix_columns_2x	\in2, \in3
-	ldr	q13, .LForward_ShiftRows
-	.else
-	inv_mix_cols_4x	\in0, \in1, \in2, \in3
-	ldr	q13, .LReverse_ShiftRows
-	.endif
-	movi	v12.16b, #0x40
+	mix_columns_2x	\in0, \in1, \enc
+	mix_columns_2x	\in2, \in3, \enc
 	b	1111b
2222:	eor	\in0\().16b, \in0\().16b, v15.16b	/* ^round key */
 	eor	\in1\().16b, \in1\().16b, v15.16b	/* ^round key */
@@ -305,19 +256,7 @@
 #include "aes-modes.S"

 	.text
-	.align	4
-.LForward_ShiftRows:
-CPU_LE(	.byte	0x0, 0x5, 0xa, 0xf, 0x4, 0x9, 0xe, 0x3	)
-CPU_LE(	.byte	0x8, 0xd, 0x2, 0x7, 0xc, 0x1, 0x6, 0xb	)
-CPU_BE(	.byte	0xb, 0x6, 0x1, 0xc, 0x7, 0x2, 0xd, 0x8	)
-CPU_BE(	.byte	0x3, 0xe, 0x9, 0x4, 0xf, 0xa, 0x5, 0x0	)
-
-.LReverse_ShiftRows:
-CPU_LE(	.byte	0x0, 0xd, 0xa, 0x7, 0x4, 0x1, 0xe, 0xb	)
-CPU_LE(	.byte	0x8, 0x5, 0x2, 0xf, 0xc, 0x9, 0x6, 0x3	)
-CPU_BE(	.byte	0x3, 0x6, 0x9, 0xc, 0xf, 0x2, 0x5, 0x8	)
-CPU_BE(	.byte	0xb, 0xe, 0x1, 0x4, 0x7, 0xa, 0xd, 0x0	)
-
+	.align	6
 .LForward_Sbox:
 	.byte	0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5
 	.byte	0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76
@@ -385,3 +324,12 @@
 	.byte	0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
 	.byte	0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
 	.byte	0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
+
+.LForward_ShiftRows:
+	.octa	0x0b06010c07020d08030e09040f0a0500
+
+.LReverse_ShiftRows:
+	.octa	0x0306090c0f0205080b0e0104070a0d00
+
+.Lror32by8:
+	.octa	0x0c0f0e0d080b0a090407060500030201

From patchwork Mon Jan 23 14:05:26 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92223
[209.132.180.67]) by mx.google.com with ESMTP id l36si15767527plg.145.2017.01.23.06.05.52; Mon, 23 Jan 2017 06:05:52 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au Cc: Ard Biesheuvel Subject: [PATCH v2 10/10] crypto: arm64/aes - replace scalar fallback with plain NEON fallback Date: Mon, 23 Jan 2017 14:05:26 +0000 Message-Id: <1485180326-25612-11-git-send-email-ard.biesheuvel@linaro.org> In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org> References: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org> X-Mailing-List: linux-crypto@vger.kernel.org The new bitsliced NEON implementation of AES uses a fallback in two places: CBC encryption (which is strictly sequential, whereas this driver can only operate efficiently on 8 blocks at a time), and the XTS tweak generation, which involves encrypting a single AES block with a different key schedule.
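Since CBC encryption feeds each ciphertext block back in as the next block's IV, no amount of multi-block parallelism helps: the dependency chain has to be walked one block at a time. The following is a minimal C sketch of that dependency, not code from this series; the one-block helper aes_encrypt_block() and its signature are hypothetical stand-ins for the real cipher:

#include <string.h>

typedef unsigned char u8;
typedef unsigned int u32;

#define AES_BLOCK_SIZE 16

/* hypothetical single-block AES primitive, standing in for the real cipher */
void aes_encrypt_block(u32 const rk[], int rounds, u8 *dst, u8 const *src);

static void cbc_encrypt_serial(u32 const rk[], int rounds, u8 *out,
			       u8 const *in, int blocks,
			       u8 iv[AES_BLOCK_SIZE])
{
	u8 buf[AES_BLOCK_SIZE];
	int i, j;

	for (i = 0; i < blocks; i++) {
		/* block i's cipher input is plaintext i XOR ciphertext i - 1 */
		for (j = 0; j < AES_BLOCK_SIZE; j++)
			buf[j] = in[j] ^ iv[j];
		aes_encrypt_block(rk, rounds, out, buf);
		/* the result becomes the chaining value for block i + 1 */
		memcpy(iv, out, AES_BLOCK_SIZE);
		in += AES_BLOCK_SIZE;
		out += AES_BLOCK_SIZE;
	}
}

The eight-way bitsliced core wants eight independent blocks per invocation, which this loop can never supply; CBC decryption, where every ciphertext input is known up front, can stay on the bitsliced path.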
The plain (i.e., non-bitsliced) NEON code is more suitable as a fallback, given that it is faster than scalar on low end cores (which is what the NEON implementations target, since high end cores have dedicated instructions for AES), and shows similar behavior in terms of D-cache footprint and sensitivity to cache timing attacks. So switch the fallback handling to the plain NEON driver. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/Kconfig | 2 +- arch/arm64/crypto/aes-neonbs-glue.c | 38 ++++++++++++++------ 2 files changed, 29 insertions(+), 11 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index 5de75c3dcbd4..bed7feddfeed 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -86,7 +86,7 @@ config CRYPTO_AES_ARM64_BS tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm" depends on KERNEL_MODE_NEON select CRYPTO_BLKCIPHER - select CRYPTO_AES_ARM64 + select CRYPTO_AES_ARM64_NEON_BLK select CRYPTO_SIMD endif diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c index 323dd76ae5f0..863e436ecf89 100644 --- a/arch/arm64/crypto/aes-neonbs-glue.c +++ b/arch/arm64/crypto/aes-neonbs-glue.c @@ -10,7 +10,6 @@ #include #include -#include #include #include #include @@ -42,7 +41,12 @@ asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, int blocks, u8 iv[]); -asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds); +/* borrowed from aes-neon-blk.ko */ +asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], + int rounds, int blocks, int first); +asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], + int rounds, int blocks, u8 iv[], + int first); struct aesbs_ctx { u8 rk[13 * (8 * AES_BLOCK_SIZE) + 32]; @@ -140,16 +144,28 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key, return 0; } -static void cbc_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst) +static int cbc_encrypt(struct skcipher_request *req) { + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + int err, first = 1; - __aes_arm64_encrypt(ctx->enc, dst, src, ctx->key.rounds); -} + err = skcipher_walk_virt(&walk, req, true); -static int cbc_encrypt(struct skcipher_request *req) -{ - return crypto_cbc_encrypt_walk(req, cbc_encrypt_one); + kernel_neon_begin(); + while (walk.nbytes >= AES_BLOCK_SIZE) { + unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; + + /* fall back to the non-bitsliced NEON implementation */ + neon_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, + ctx->enc, ctx->key.rounds, blocks, walk.iv, + first); + err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); + first = 0; + } + kernel_neon_end(); + return err; } static int cbc_decrypt(struct skcipher_request *req) @@ -254,9 +270,11 @@ static int __xts_crypt(struct skcipher_request *req, err = skcipher_walk_virt(&walk, req, true); - __aes_arm64_encrypt(ctx->twkey, walk.iv, walk.iv, ctx->key.rounds); - kernel_neon_begin(); + + neon_aes_ecb_encrypt(walk.iv, walk.iv, ctx->twkey, + ctx->key.rounds, 1, 1); + while (walk.nbytes >= AES_BLOCK_SIZE) { 
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
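		/*
		 * Illustration, not part of this patch: the
		 * neon_aes_ecb_encrypt() call above computes the initial
		 * XTS tweak by encrypting walk.iv under the tweak-key
		 * schedule; inside the aesbs_xts_* routines, each
		 * subsequent block's tweak is derived, per the XTS
		 * specification, by multiplying the previous one by x in
		 * GF(2^128). A sketch of that update, under the
		 * hypothetical name next_tweak(), operating on the
		 * 16-byte little-endian tweak:
		 *
		 *	static void next_tweak(u8 t[16])
		 *	{
		 *		unsigned int carry = t[15] >> 7;
		 *		int i;
		 *
		 *		for (i = 15; i > 0; i--)
		 *			t[i] = (t[i] << 1) | (t[i - 1] >> 7);
		 *		// reduce mod x^128 + x^7 + x^2 + x + 1
		 *		t[0] = (t[0] << 1) ^ (carry ? 0x87 : 0);
		 *	}
		 */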