From patchwork Mon Jan 23 14:05:18 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92215 Delivered-To: patch@linaro.org Received: by 10.182.3.34 with SMTP id 2csp1233343obz; Mon, 23 Jan 2017 06:05:39 -0800 (PST) X-Received: by 10.99.178.21 with SMTP id x21mr32390887pge.48.1485180338604; Mon, 23 Jan 2017 06:05:38 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l36si15767527plg.145.2017.01.23.06.05.38; Mon, 23 Jan 2017 06:05:38 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@linaro.org; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751230AbdAWOFh (ORCPT + 1 other); Mon, 23 Jan 2017 09:05:37 -0500 Received: from mail-wm0-f44.google.com ([74.125.82.44]:37502 "EHLO mail-wm0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751113AbdAWOFh (ORCPT ); Mon, 23 Jan 2017 09:05:37 -0500 Received: by mail-wm0-f44.google.com with SMTP id c206so157129865wme.0 for ; Mon, 23 Jan 2017 06:05:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=uMxznzyzjyL6KPmIcSlEFgXHzKWi8SJ3MJxOaj5XmxA=; b=CGos+sBofTDyAJuyz2KZt8Y+XtFhAMAw28miaEg4olBmCnnddCyZn7fHvmwoOceNxm LZYHnN1aIziifntwZT1zS8WqFEtOvHxKM0wtLEQWc8Yz2YseX/ybSPaIhkpG2Rk1OpRI ac2TJPxsMwgRwBcE6bfzOgxWiEd726GbjAMko= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=uMxznzyzjyL6KPmIcSlEFgXHzKWi8SJ3MJxOaj5XmxA=; b=IT/qlahneUIsFtP/EyrTxFvw3QIswuDB7NVWLM7SP2NhwYIbtdLjc+uOguGY9VoaD5 GrwfIwD2PhUkyI6mrWeqK0rO+rZ0u/WD3hXmmnzXFAQpe03mGJ6rS1/Cs41tGjUFRTyi SY613dTGRiZ3tTKIZzxMTdhNzfO+jfOLgNsiaV9tIo0cZjojYNx54nXoKt4BU1tEriwy 571QynwA6JiliIS2g4v3+gVthASzaa5aRhtKMWUt3HoQwtrNqMULFvmAqEzoDZ5rOk0E o4E1+yngBTCmDWEqLDaInJRl6j4dWQGaVke84fXgXJNdUXsJL8ZyRIEOaDE+gJ6bu+nF V9aw== X-Gm-Message-State: AIkVDXIJNDVr5SAZ8+zilusrqZXJz4A0EhfTZRlEfO7aD4EB8zjlONQ297M8inim+32512kv X-Received: by 10.28.23.66 with SMTP id 63mr14434300wmx.46.1485180335673; Mon, 23 Jan 2017 06:05:35 -0800 (PST) Received: from localhost.localdomain ([160.160.111.139]) by smtp.gmail.com with ESMTPSA id y65sm21319790wmb.5.2017.01.23.06.05.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 23 Jan 2017 06:05:34 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au Cc: Ard Biesheuvel Subject: [PATCH v2 02/10] crypto: arm/aes-ce - remove cra_alignmask Date: Mon, 23 Jan 2017 14:05:18 +0000 Message-Id: <1485180326-25612-3-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org> References: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-core.S | 84 ++++++++++---------- arch/arm/crypto/aes-ce-glue.c | 15 ++-- 2 files changed, 47 insertions(+), 52 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S index 987aa632c9f0..ba8e6a32fdc9 100644 --- a/arch/arm/crypto/aes-ce-core.S +++ b/arch/arm/crypto/aes-ce-core.S @@ -169,19 +169,19 @@ ENTRY(ce_aes_ecb_encrypt) .Lecbencloop3x: subs r4, r4, #3 bmi .Lecbenc1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! bl aes_encrypt_3x - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lecbencloop3x .Lecbenc1x: adds r4, r4, #3 beq .Lecbencout .Lecbencloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! bl aes_encrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lecbencloop .Lecbencout: @@ -195,19 +195,19 @@ ENTRY(ce_aes_ecb_decrypt) .Lecbdecloop3x: subs r4, r4, #3 bmi .Lecbdec1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! bl aes_decrypt_3x - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lecbdecloop3x .Lecbdec1x: adds r4, r4, #3 beq .Lecbdecout .Lecbdecloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! bl aes_decrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lecbdecloop .Lecbdecout: @@ -226,10 +226,10 @@ ENTRY(ce_aes_cbc_encrypt) vld1.8 {q0}, [r5] prepare_key r2, r3 .Lcbcencloop: - vld1.8 {q1}, [r1, :64]! @ get next pt block + vld1.8 {q1}, [r1]! @ get next pt block veor q0, q0, q1 @ ..and xor with iv bl aes_encrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lcbcencloop vst1.8 {q0}, [r5] @@ -244,8 +244,8 @@ ENTRY(ce_aes_cbc_decrypt) .Lcbcdecloop3x: subs r4, r4, #3 bmi .Lcbcdec1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! vmov q3, q0 vmov q4, q1 vmov q5, q2 @@ -254,19 +254,19 @@ ENTRY(ce_aes_cbc_decrypt) veor q1, q1, q3 veor q2, q2, q4 vmov q6, q5 - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lcbcdecloop3x .Lcbcdec1x: adds r4, r4, #3 beq .Lcbcdecout vmov q15, q14 @ preserve last round key .Lcbcdecloop: - vld1.8 {q0}, [r1, :64]! @ get next ct block + vld1.8 {q0}, [r1]! @ get next ct block veor q14, q15, q6 @ combine prev ct with last key vmov q6, q0 bl aes_decrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lcbcdecloop .Lcbcdecout: @@ -300,15 +300,15 @@ ENTRY(ce_aes_ctr_encrypt) rev ip, r6 add r6, r6, #1 vmov s11, ip - vld1.8 {q3-q4}, [r1, :64]! - vld1.8 {q5}, [r1, :64]! + vld1.8 {q3-q4}, [r1]! + vld1.8 {q5}, [r1]! bl aes_encrypt_3x veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 rev ip, r6 - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! vmov s27, ip b .Lctrloop3x .Lctr1x: @@ -318,10 +318,10 @@ ENTRY(ce_aes_ctr_encrypt) vmov q0, q6 bl aes_encrypt subs r4, r4, #1 - bmi .Lctrhalfblock @ blocks < 0 means 1/2 block - vld1.8 {q3}, [r1, :64]! + bmi .Lctrtailblock @ blocks < 0 means tail block + vld1.8 {q3}, [r1]! veor q3, q0, q3 - vst1.8 {q3}, [r0, :64]! + vst1.8 {q3}, [r0]! adds r6, r6, #1 @ increment BE ctr rev ip, r6 @@ -333,10 +333,8 @@ ENTRY(ce_aes_ctr_encrypt) vst1.8 {q6}, [r5] pop {r4-r6, pc} -.Lctrhalfblock: - vld1.8 {d1}, [r1, :64] - veor d0, d0, d1 - vst1.8 {d0}, [r0, :64] +.Lctrtailblock: + vst1.8 {q0}, [r0, :64] @ return just the key stream pop {r4-r6, pc} .Lctrcarry: @@ -405,8 +403,8 @@ ENTRY(ce_aes_xts_encrypt) .Lxtsenc3x: subs r4, r4, #3 bmi .Lxtsenc1x - vld1.8 {q0-q1}, [r1, :64]! @ get 3 pt blocks - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! @ get 3 pt blocks + vld1.8 {q2}, [r1]! next_tweak q4, q3, q7, q6 veor q0, q0, q3 next_tweak q5, q4, q7, q6 @@ -416,8 +414,8 @@ ENTRY(ce_aes_xts_encrypt) veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 - vst1.8 {q0-q1}, [r0, :64]! @ write 3 ct blocks - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! @ write 3 ct blocks + vst1.8 {q2}, [r0]! vmov q3, q5 teq r4, #0 beq .Lxtsencout @@ -426,11 +424,11 @@ ENTRY(ce_aes_xts_encrypt) adds r4, r4, #3 beq .Lxtsencout .Lxtsencloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! veor q0, q0, q3 bl aes_encrypt veor q0, q0, q3 - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 beq .Lxtsencout next_tweak q3, q3, q7, q6 @@ -456,8 +454,8 @@ ENTRY(ce_aes_xts_decrypt) .Lxtsdec3x: subs r4, r4, #3 bmi .Lxtsdec1x - vld1.8 {q0-q1}, [r1, :64]! @ get 3 ct blocks - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! @ get 3 ct blocks + vld1.8 {q2}, [r1]! next_tweak q4, q3, q7, q6 veor q0, q0, q3 next_tweak q5, q4, q7, q6 @@ -467,8 +465,8 @@ ENTRY(ce_aes_xts_decrypt) veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 - vst1.8 {q0-q1}, [r0, :64]! @ write 3 pt blocks - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! @ write 3 pt blocks + vst1.8 {q2}, [r0]! vmov q3, q5 teq r4, #0 beq .Lxtsdecout @@ -477,12 +475,12 @@ ENTRY(ce_aes_xts_decrypt) adds r4, r4, #3 beq .Lxtsdecout .Lxtsdecloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! veor q0, q0, q3 add ip, r2, #32 @ 3rd round key bl aes_decrypt veor q0, q0, q3 - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 beq .Lxtsdecout next_tweak q3, q3, q7, q6 diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c index 8857531915bf..883b84d828c5 100644 --- a/arch/arm/crypto/aes-ce-glue.c +++ b/arch/arm/crypto/aes-ce-glue.c @@ -278,14 +278,15 @@ static int ctr_encrypt(struct skcipher_request *req) u8 *tsrc = walk.src.virt.addr; /* - * Minimum alignment is 8 bytes, so if nbytes is <= 8, we need - * to tell aes_ctr_encrypt() to only read half a block. + * Tell aes_ctr_encrypt() to process a tail block. */ - blocks = (nbytes <= 8) ? -1 : 1; + blocks = -1; - ce_aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc, + ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, num_rounds(ctx), blocks, walk.iv); - memcpy(tdst, tail, nbytes); + if (tdst != tsrc) + memcpy(tdst, tsrc, nbytes); + crypto_xor(tdst, tail, nbytes); err = skcipher_walk_done(&walk, 0); } kernel_neon_end(); @@ -345,7 +346,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -361,7 +361,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -378,7 +377,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -396,7 +394,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_xts_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = 2 * AES_MIN_KEY_SIZE,