From patchwork Sat Mar 10 15:21:47 2018
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 131292
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org,
    Ard Biesheuvel, Dave Martin, Russell King - ARM Linux,
    Sebastian Andrzej Siewior, Mark Rutland, linux-rt-users@vger.kernel.org,
    Peter Zijlstra, Catalin Marinas, Will Deacon, Steven Rostedt,
    Thomas Gleixner
Subject: [PATCH v5 02/23] crypto: arm64/aes-ce-ccm - move kernel mode neon en/disable into loop
Date: Sat, 10 Mar 2018 15:21:47 +0000
Message-Id: <20180310152208.10369-3-ard.biesheuvel@linaro.org>
In-Reply-To: <20180310152208.10369-1-ard.biesheuvel@linaro.org>
References: <20180310152208.10369-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

When kernel mode NEON was first introduced on arm64, preserving and
restoring the userland NEON state was completely unoptimized: all
registers were saved on each call to kernel_neon_begin() and restored
on each call to kernel_neon_end(). For this reason, the NEON crypto
code introduced at the time keeps NEON enabled throughout the
execution of the crypto API methods, which may include calls back into
the crypto API that could result in memory allocation or other actions
that we should avoid when running with preemption disabled.

Since then, the kernel mode NEON handling has been optimized: the
userland state is now restored lazily (upon return to userland), so
the preserve step is only costly the first time it is performed after
entering the kernel.

So let's put the kernel_neon_begin() and kernel_neon_end() calls
around the actual invocations of the NEON crypto code, and run the
remainder of the code with kernel mode NEON disabled (and preemption
enabled).
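In other words, instead of holding one long kernel_neon_begin() /
kernel_neon_end() region across the whole request, the NEON section is
scoped to each individual SIMD call. A minimal sketch of the pattern
(not the literal patched code; do_neon_block() is a hypothetical
stand-in for the ce_aes_ccm_*() routines):

	/* before: NEON claimed (and preemption disabled) for the whole walk */
	static int crypt_walk_old(struct skcipher_walk *walk)
	{
		int err = 0;

		kernel_neon_begin();
		while (walk->nbytes) {
			do_neon_block(walk);			/* SIMD work */
			err = skcipher_walk_done(walk, 0);	/* calls back into the crypto API */
		}
		kernel_neon_end();
		return err;
	}

	/* after: NEON claimed only around the actual SIMD work */
	static int crypt_walk_new(struct skcipher_walk *walk)
	{
		int err = 0;

		while (walk->nbytes) {
			kernel_neon_begin();
			do_neon_block(walk);
			kernel_neon_end();
			err = skcipher_walk_done(walk, 0);	/* runs preemptible again */
		}
		return err;
	}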
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-glue.c | 47 ++++++++++----------
 1 file changed, 23 insertions(+), 24 deletions(-)

-- 
2.15.1

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index a1254036f2b1..68b11aa690e4 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -107,11 +107,13 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen)
 }
 
 static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
-			   u32 abytes, u32 *macp, bool use_neon)
+			   u32 abytes, u32 *macp)
 {
-	if (likely(use_neon)) {
+	if (may_use_simd()) {
+		kernel_neon_begin();
 		ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc,
 				     num_rounds(key));
+		kernel_neon_end();
 	} else {
 		if (*macp > 0 && *macp < AES_BLOCK_SIZE) {
 			int added = min(abytes, AES_BLOCK_SIZE - *macp);
@@ -143,8 +145,7 @@ static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
 	}
 }
 
-static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[],
-				   bool use_neon)
+static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 {
 	struct crypto_aead *aead = crypto_aead_reqtfm(req);
 	struct crypto_aes_ctx *ctx = crypto_aead_ctx(aead);
@@ -163,7 +164,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[],
 		ltag.len = 6;
 	}
 
-	ccm_update_mac(ctx, mac, (u8 *)&ltag, ltag.len, &macp, use_neon);
+	ccm_update_mac(ctx, mac, (u8 *)&ltag, ltag.len, &macp);
 	scatterwalk_start(&walk, req->src);
 
 	do {
@@ -175,7 +176,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[],
 			n = scatterwalk_clamp(&walk, len);
 		}
 		p = scatterwalk_map(&walk);
-		ccm_update_mac(ctx, mac, p, n, &macp, use_neon);
+		ccm_update_mac(ctx, mac, p, n, &macp);
 		len -= n;
 
 		scatterwalk_unmap(p);
@@ -242,43 +243,42 @@ static int ccm_encrypt(struct aead_request *req)
 	u8 __aligned(8) mac[AES_BLOCK_SIZE];
 	u8 buf[AES_BLOCK_SIZE];
 	u32 len = req->cryptlen;
-	bool use_neon = may_use_simd();
 	int err;
 
 	err = ccm_init_mac(req, mac, len);
 	if (err)
 		return err;
 
-	if (likely(use_neon))
-		kernel_neon_begin();
-
 	if (req->assoclen)
-		ccm_calculate_auth_mac(req, mac, use_neon);
+		ccm_calculate_auth_mac(req, mac);
 
 	/* preserve the original iv for the final round */
 	memcpy(buf, req->iv, AES_BLOCK_SIZE);
 
 	err = skcipher_walk_aead_encrypt(&walk, req, true);
 
-	if (likely(use_neon)) {
+	if (may_use_simd()) {
 		while (walk.nbytes) {
 			u32 tail = walk.nbytes % AES_BLOCK_SIZE;
 
 			if (walk.nbytes == walk.total)
 				tail = 0;
 
+			kernel_neon_begin();
 			ce_aes_ccm_encrypt(walk.dst.virt.addr,
 					   walk.src.virt.addr,
 					   walk.nbytes - tail, ctx->key_enc,
 					   num_rounds(ctx), mac, walk.iv);
+			kernel_neon_end();
 
 			err = skcipher_walk_done(&walk, tail);
 		}
-		if (!err)
+		if (!err) {
+			kernel_neon_begin();
 			ce_aes_ccm_final(mac, buf, ctx->key_enc,
 					 num_rounds(ctx));
-
-		kernel_neon_end();
+			kernel_neon_end();
+		}
 	} else {
 		err = ccm_crypt_fallback(&walk, mac, buf, ctx, true);
 	}
@@ -301,43 +301,42 @@ static int ccm_decrypt(struct aead_request *req)
 	u8 __aligned(8) mac[AES_BLOCK_SIZE];
 	u8 buf[AES_BLOCK_SIZE];
 	u32 len = req->cryptlen - authsize;
-	bool use_neon = may_use_simd();
 	int err;
 
 	err = ccm_init_mac(req, mac, len);
 	if (err)
 		return err;
 
-	if (likely(use_neon))
-		kernel_neon_begin();
-
 	if (req->assoclen)
-		ccm_calculate_auth_mac(req, mac, use_neon);
+		ccm_calculate_auth_mac(req, mac);
 
 	/* preserve the original iv for the final round */
 	memcpy(buf, req->iv, AES_BLOCK_SIZE);
 
 	err = skcipher_walk_aead_decrypt(&walk, req, true);
 
-	if (likely(use_neon)) {
+	if (may_use_simd()) {
 		while (walk.nbytes) {
 			u32 tail = walk.nbytes % AES_BLOCK_SIZE;
 
 			if (walk.nbytes == walk.total)
 				tail = 0;
 
+			kernel_neon_begin();
 			ce_aes_ccm_decrypt(walk.dst.virt.addr,
 					   walk.src.virt.addr,
 					   walk.nbytes - tail, ctx->key_enc,
 					   num_rounds(ctx), mac, walk.iv);
+			kernel_neon_end();
 
 			err = skcipher_walk_done(&walk, tail);
 		}
-		if (!err)
+		if (!err) {
+			kernel_neon_begin();
 			ce_aes_ccm_final(mac, buf, ctx->key_enc,
 					 num_rounds(ctx));
-
-		kernel_neon_end();
+			kernel_neon_end();
+		}
 	} else {
 		err = ccm_crypt_fallback(&walk, mac, buf, ctx, false);
 	}
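
For reference (a sketch only, not part of the patch): with the begin/end
calls moved inward, each call site now chooses between the NEON path and
the scalar fallback itself, in the shape used by ccm_update_mac() above.
Here neon_update() and scalar_update() are hypothetical stand-ins for
ce_aes_ccm_auth_data() and the byte-wise fallback loop:

	static void mac_update(struct crypto_aes_ctx *key, u8 mac[],
			       u8 const in[], u32 abytes, u32 *macp)
	{
		if (may_use_simd()) {
			/* SIMD usable here: claim NEON just for this call */
			kernel_neon_begin();
			neon_update(key, mac, in, abytes, macp);
			kernel_neon_end();
		} else {
			/* e.g. called from a context where SIMD is unusable */
			scalar_update(key, mac, in, abytes, macp);
		}
	}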