From patchwork Mon Jan 23 14:05:26 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92223
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 10/10] crypto: arm64/aes - replace scalar fallback with plain NEON fallback
Date: Mon, 23 Jan 2017 14:05:26 +0000
Message-Id: <1485180326-25612-11-git-send-email-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>
References: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

The new bitsliced NEON implementation of AES uses a fallback in two
places: CBC encryption (which is strictly sequential, whereas this
driver can only operate efficiently on 8 blocks at a time), and the
XTS tweak generation, which involves encrypting a single AES block
with a different key schedule.
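
To illustrate why CBC encryption cannot make use of an 8-way parallel
block function, here is a minimal sketch. It is not part of this patch
and does not use AES: toy_encrypt_block() and toy_decrypt_block() are
hypothetical stand-ins for a block cipher, and every name in it is made
up for illustration. The point is only the data dependency: each CBC
encryption step consumes the ciphertext produced by the previous step,
while CBC decryption only consumes ciphertext that is already available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Hypothetical stand-in for a block cipher. This is NOT AES. */
static void toy_encrypt_block(uint8_t out[BLOCK_SIZE],
			      const uint8_t in[BLOCK_SIZE], uint8_t key)
{
	for (int i = 0; i < BLOCK_SIZE; i++)
		out[i] = (uint8_t)((in[i] ^ key) + i);
}

/* Inverse of toy_encrypt_block(). */
static void toy_decrypt_block(uint8_t out[BLOCK_SIZE],
			      const uint8_t in[BLOCK_SIZE], uint8_t key)
{
	for (int i = 0; i < BLOCK_SIZE; i++)
		out[i] = (uint8_t)(in[i] - i) ^ key;
}

/*
 * CBC encryption: block b is XORed with the ciphertext of block b - 1
 * before it is encrypted, so block b cannot start until block b - 1
 * has finished. On return, iv holds the last ciphertext block.
 */
static void toy_cbc_encrypt(uint8_t *dst, const uint8_t *src, int blocks,
			    uint8_t iv[BLOCK_SIZE], uint8_t key)
{
	uint8_t buf[BLOCK_SIZE];

	for (int b = 0; b < blocks; b++) {
		for (int i = 0; i < BLOCK_SIZE; i++)
			buf[i] = src[i] ^ iv[i];
		toy_encrypt_block(dst, buf, key);
		memcpy(iv, dst, BLOCK_SIZE);	/* chaining value */
		src += BLOCK_SIZE;
		dst += BLOCK_SIZE;
	}
}

/*
 * CBC decryption: every block only needs ciphertext as input, so the
 * blocks are independent. They are processed back to front here just
 * to show that the order does not matter.
 */
static void toy_cbc_decrypt(uint8_t *dst, const uint8_t *src, int blocks,
			    const uint8_t iv[BLOCK_SIZE], uint8_t key)
{
	uint8_t buf[BLOCK_SIZE];

	for (int b = blocks - 1; b >= 0; b--) {
		const uint8_t *prev = b ? src + (b - 1) * BLOCK_SIZE : iv;

		toy_decrypt_block(buf, src + b * BLOCK_SIZE, key);
		for (int i = 0; i < BLOCK_SIZE; i++)
			dst[b * BLOCK_SIZE + i] = buf[i] ^ prev[i];
	}
}
```

Calling toy_cbc_encrypt() on a buffer and then toy_cbc_decrypt() on the
result with the same starting IV should return the original data. Only
the decrypt loop could hand 8 blocks at once to a parallel block
function, which matches the split between bitsliced and fallback code
described above.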

The plain (i.e., non-bitsliced) NEON code is more suitable as a fallback,
given that it is faster than scalar on low end cores (which is what
the NEON implementations target, since high end cores have dedicated
instructions for AES), and shows similar behavior in terms of D-cache
footprint and sensitivity to cache timing attacks. So switch the fallback
handling to the plain NEON driver.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/Kconfig           |  2 +-
 arch/arm64/crypto/aes-neonbs-glue.c | 38 ++++++++++++++------
 2 files changed, 29 insertions(+), 11 deletions(-)

-- 
2.7.4

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 5de75c3dcbd4..bed7feddfeed 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -86,7 +86,7 @@ config CRYPTO_AES_ARM64_BS
 	tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_BLKCIPHER
-	select CRYPTO_AES_ARM64
+	select CRYPTO_AES_ARM64_NEON_BLK
 	select CRYPTO_SIMD
 
 endif
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index 323dd76ae5f0..863e436ecf89 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -10,7 +10,6 @@
 
 #include
 #include
-#include
 #include
 #include
 #include
@@ -42,7 +41,12 @@ asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
 asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[],
 				  int rounds, int blocks, u8 iv[]);
 
-asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
+/* borrowed from aes-neon-blk.ko */
+asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[],
+				     int rounds, int blocks, int first);
+asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[],
+				     int rounds, int blocks, u8 iv[],
+				     int first);
 
 struct aesbs_ctx {
 	u8	rk[13 * (8 * AES_BLOCK_SIZE) + 32];
@@ -140,16 +144,28 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
 	return 0;
 }
 
-static void cbc_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
+static int cbc_encrypt(struct skcipher_request *req)
 {
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
+	struct skcipher_walk walk;
+	int err, first = 1;
 
-	__aes_arm64_encrypt(ctx->enc, dst, src, ctx->key.rounds);
-}
+	err = skcipher_walk_virt(&walk, req, true);
 
-static int cbc_encrypt(struct skcipher_request *req)
-{
-	return crypto_cbc_encrypt_walk(req, cbc_encrypt_one);
+	kernel_neon_begin();
+	while (walk.nbytes >= AES_BLOCK_SIZE) {
+		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
+
+		/* fall back to the non-bitsliced NEON implementation */
+		neon_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+				     ctx->enc, ctx->key.rounds, blocks, walk.iv,
+				     first);
+		err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
+		first = 0;
+	}
+	kernel_neon_end();
+	return err;
 }
 
 static int cbc_decrypt(struct skcipher_request *req)
@@ -254,9 +270,11 @@ static int __xts_crypt(struct skcipher_request *req,
 
 	err = skcipher_walk_virt(&walk, req, true);
 
-	__aes_arm64_encrypt(ctx->twkey, walk.iv, walk.iv, ctx->key.rounds);
-
 	kernel_neon_begin();
+
+	neon_aes_ecb_encrypt(walk.iv, walk.iv, ctx->twkey,
+			     ctx->key.rounds, 1, 1);
+
 	while (walk.nbytes >= AES_BLOCK_SIZE) {
 		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
 