From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: Ard Biesheuvel
Subject: [PATCH v2 00/10] crypto - AES for ARM/arm64 updates for v4.11 (round #2)
Date: Mon, 23 Jan 2017 14:05:16 +0000
Message-Id: <1485180326-25612-1-git-send-email-ard.biesheuvel@linaro.org>

Patch #1 is a fix for the CBC chaining issue that was discussed on the
mailing list. The driver itself is queued for v4.11, so this fix can go
right on top.

Patches #2 - #6 clear the cra_alignmasks of various drivers: all NEON capable
CPUs can perform unaligned accesses, and the advantage of using the slightly
faster aligned accessors (which only exist on ARM not arm64) is certainly
outweighed by the cost of copying data to suitably aligned buffers.
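To illustrate the trade-off behind removing the alignmasks, here is a rough
user-space sketch in C. It is not kernel code and does not reproduce the
crypto API's actual buffer handling; toy_cipher(), process() and BOUNCE_SIZE
are hypothetical names made up for this example. It only models the idea that
a nonzero alignment mask forces misaligned data through an aligned bounce
buffer, while a mask of zero lets the cipher routine work on the caller's
buffers directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOUNCE_SIZE 64	/* hypothetical bounce buffer size */

/* Hypothetical stand-in for a cipher routine: XOR each byte with 0x5a. */
static void toy_cipher(uint8_t *dst, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
}

/*
 * Simplified model of an alignmask-aware caller (an assumption for
 * illustration, not the kernel's implementation).
 *
 * If src and dst both satisfy the mask, the cipher runs in place on the
 * caller's buffers. Otherwise the data is copied into an aligned bounce
 * buffer, processed there, and copied back out.
 *
 * Returns the number of extra bytes moved by memcpy().
 */
static size_t process(uint8_t *dst, const uint8_t *src, size_t len,
		      uintptr_t alignmask)
{
	_Alignas(16) uint8_t bounce[BOUNCE_SIZE];
	size_t copied = 0;

	if ((((uintptr_t)src | (uintptr_t)dst) & alignmask) == 0) {
		toy_cipher(dst, src, len);
		return 0;
	}

	while (len > 0) {
		size_t n = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;

		memcpy(bounce, src, n);		/* copy in */
		toy_cipher(bounce, bounce, n);
		memcpy(dst, bounce, n);		/* copy out */
		copied += 2 * n;

		src += n;
		dst += n;
		len -= n;
	}
	return copied;
}
```

With alignmask set to 0, process() never takes the bounce path, whatever the
addresses of src and dst. This only works if the cipher routine itself
tolerates unaligned accesses, which is the property the paragraph above relies
on for NEON capable CPUs.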
NOTE: patch #5 won't apply unless 'crypto: arm64/aes-blk - honour iv_out
requirement in CBC and CTR modes' is applied first, which was sent out
separately as a bugfix for v3.16 - v4.9. If this is a problem, this patch
can wait.

Patches #7 and #8 are minor tweaks to the new scalar AES code.

Patch #9 improves the performance of the plain NEON AES code, to make it
more suitable as a fallback for the new bitsliced NEON code, which can only
operate on 8 blocks in parallel, and needs another driver to perform CBC
encryption or XTS tweak generation.

Patch #10 updates the new bitsliced AES NEON code to switch to the plain
NEON driver as a fallback.

Patches #9 and #10 improve the performance of CBC encryption by ~35% on low
end cores such as the Cortex-A53 that can be found in the Raspberry Pi3.

Changes since v1:
- shave off another few cycles from the sequential AES NEON code (patch #9)

Ard Biesheuvel (10):
  crypto: arm64/aes-neon-bs - honour iv_out requirement in CTR mode
  crypto: arm/aes-ce - remove cra_alignmask
  crypto: arm/chacha20 - remove cra_alignmask
  crypto: arm64/aes-ce-ccm - remove cra_alignmask
  crypto: arm64/aes-blk - remove cra_alignmask
  crypto: arm64/chacha20 - remove cra_alignmask
  crypto: arm64/aes - avoid literals for cross-module symbol references
  crypto: arm64/aes - performance tweak
  crypto: arm64/aes-neon-blk - tweak performance for low end cores
  crypto: arm64/aes - replace scalar fallback with plain NEON fallback

 arch/arm/crypto/aes-ce-core.S          |  84 ++++----
 arch/arm/crypto/aes-ce-glue.c          |  15 +-
 arch/arm/crypto/chacha20-neon-glue.c   |   1 -
 arch/arm64/crypto/Kconfig              |   2 +-
 arch/arm64/crypto/aes-ce-ccm-glue.c    |   1 -
 arch/arm64/crypto/aes-cipher-core.S    |  59 ++----
 arch/arm64/crypto/aes-glue.c           |  18 +-
 arch/arm64/crypto/aes-modes.S          |   8 +-
 arch/arm64/crypto/aes-neon.S           | 210 ++++++++------------
 arch/arm64/crypto/aes-neonbs-core.S    |  25 ++-
 arch/arm64/crypto/aes-neonbs-glue.c    |  38 +++-
 arch/arm64/crypto/chacha20-neon-glue.c |   1 -
 12 files changed, 203 insertions(+), 259 deletions(-)

-- 
2.7.4