From patchwork Tue Dec 26 10:29:27 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 122741
From: Ard Biesheuvel
To: linux-kernel@vger.kernel.org
Cc: Ard Biesheuvel, Dave Martin, Russell King - ARM Linux,
 Sebastian Andrzej Siewior, Mark Rutland, linux-rt-users@vger.kernel.org,
 Peter Zijlstra, Catalin Marinas, Will Deacon, Steven Rostedt,
 Thomas Gleixner
Subject: [PATCH v4 07/20] crypto: arm64/aes-blk - add 4 way interleave to CBC encrypt path
Date: Tue, 26 Dec 2017 10:29:27 +0000
Message-Id: <20171226102940.26908-8-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20171226102940.26908-1-ard.biesheuvel@linaro.org>
References: <20171226102940.26908-1-ard.biesheuvel@linaro.org>

CBC encryption is strictly sequential, and so the current AES code
simply processes the input one block at a time. However, we are
about to add yield support, which adds a bit of overhead, and which
we prefer to align with other modes in terms of granularity (i.e.,
it is better to have all routines yield every 64 bytes and not have
an exception for CBC encrypt which yields every 16 bytes).

So unroll the loop by 4. We still cannot perform the AES algorithm in
parallel, but we can at least merge the loads and stores.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-modes.S | 31 ++++++++++++++++----
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index 27a235b2ddee..e86535a1329d 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -94,17 +94,36 @@ AES_ENDPROC(aes_ecb_decrypt)
 	 */
 
 AES_ENTRY(aes_cbc_encrypt)
-	ld1		{v0.16b}, [x5]			/* get iv */
+	ld1		{v4.16b}, [x5]			/* get iv */
 	enc_prepare	w3, x2, x6
 
-.Lcbcencloop:
-	ld1		{v1.16b}, [x1], #16		/* get next pt block */
-	eor		v0.16b, v0.16b, v1.16b		/* ..and xor with iv */
+.Lcbcencloop4x:
+	subs		w4, w4, #4
+	bmi		.Lcbcenc1x
+	ld1		{v0.16b-v3.16b}, [x1], #64	/* get 4 pt blocks */
+	eor		v0.16b, v0.16b, v4.16b		/* ..and xor with iv */
 	encrypt_block	v0, w3, x2, x6, w7
-	st1		{v0.16b}, [x0], #16
+	eor		v1.16b, v1.16b, v0.16b
+	encrypt_block	v1, w3, x2, x6, w7
+	eor		v2.16b, v2.16b, v1.16b
+	encrypt_block	v2, w3, x2, x6, w7
+	eor		v3.16b, v3.16b, v2.16b
+	encrypt_block	v3, w3, x2, x6, w7
+	st1		{v0.16b-v3.16b}, [x0], #64
+	mov		v4.16b, v3.16b
+	b		.Lcbcencloop4x
+.Lcbcenc1x:
+	adds		w4, w4, #4
+	beq		.Lcbcencout
+.Lcbcencloop:
+	ld1		{v0.16b}, [x1], #16		/* get next pt block */
+	eor		v4.16b, v4.16b, v0.16b		/* ..and xor with iv */
+	encrypt_block	v4, w3, x2, x6, w7
+	st1		{v4.16b}, [x0], #16
 	subs		w4, w4, #1
 	bne		.Lcbcencloop
-	st1		{v0.16b}, [x5]			/* return iv */
+.Lcbcencout:
+	st1		{v4.16b}, [x5]			/* return iv */
 	ret
 AES_ENDPROC(aes_cbc_encrypt)
 
-- 
2.11.0
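
For readers who prefer C to arm64 assembly, the control flow of the unrolled routine can be modelled roughly as below. This is only a sketch under stated assumptions, not the kernel code: toy_encrypt_block() is a hypothetical stand-in for the encrypt_block macro (it is NOT AES), and the names cbc_encrypt_1x(), cbc_encrypt_4x() and cbc_paths_agree() are invented for illustration. It shows that the chaining stays sequential (each block is XORed with the previous ciphertext block before encryption) while the loads and stores are merged into 64-byte groups, with a one-block tail loop for the remainder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16	/* AES block size in bytes */

/* Hypothetical stand-in for encrypt_block: a keyed byte mix, NOT AES. */
static void toy_encrypt_block(uint8_t b[BLK], const uint8_t key[BLK])
{
	for (int i = 0; i < BLK; i++)
		b[i] = (uint8_t)((b[i] ^ key[i]) * 167u + 13u);
}

static void xor_block(uint8_t dst[BLK], const uint8_t src[BLK])
{
	for (int i = 0; i < BLK; i++)
		dst[i] ^= src[i];
}

/* Reference path: one block per iteration; iv carries the chaining value. */
static void cbc_encrypt_1x(uint8_t *out, const uint8_t *in,
			   const uint8_t key[BLK], int blocks, uint8_t iv[BLK])
{
	while (blocks-- > 0) {
		xor_block(iv, in);
		toy_encrypt_block(iv, key);
		memcpy(out, iv, BLK);
		in += BLK;
		out += BLK;
	}
}

/* Unrolled path: four blocks per iteration, then a one-block tail loop. */
static void cbc_encrypt_4x(uint8_t *out, const uint8_t *in,
			   const uint8_t key[BLK], int blocks, uint8_t iv[BLK])
{
	uint8_t v[4][BLK];

	while (blocks >= 4) {
		memcpy(v, in, 4 * BLK);		/* merged 64-byte load */
		for (int i = 0; i < 4; i++) {
			/* still sequential: block i depends on block i - 1 */
			xor_block(v[i], i ? v[i - 1] : iv);
			toy_encrypt_block(v[i], key);
		}
		memcpy(out, v, 4 * BLK);	/* merged 64-byte store */
		memcpy(iv, v[3], BLK);		/* last ciphertext is the next iv */
		in += 4 * BLK;
		out += 4 * BLK;
		blocks -= 4;
	}
	cbc_encrypt_1x(out, in, key, blocks, iv);	/* 0..3 leftover blocks */
}

/* Returns 1 if both paths give identical ciphertext and final iv. */
static int cbc_paths_agree(int blocks)
{
	enum { MAX_BLOCKS = 16 };
	uint8_t in[MAX_BLOCKS * BLK], out_a[MAX_BLOCKS * BLK], out_b[MAX_BLOCKS * BLK];
	uint8_t key[BLK], iv_a[BLK], iv_b[BLK];

	if (blocks < 0 || blocks > MAX_BLOCKS)
		return 0;
	for (int i = 0; i < BLK; i++) {
		key[i] = (uint8_t)(i * 7 + 1);
		iv_a[i] = iv_b[i] = (uint8_t)(0xA5 ^ i);
	}
	for (size_t i = 0; i < sizeof(in); i++)
		in[i] = (uint8_t)(i * 31 + 5);

	cbc_encrypt_1x(out_a, in, key, blocks, iv_a);
	cbc_encrypt_4x(out_b, in, key, blocks, iv_b);
	return memcmp(out_a, out_b, (size_t)blocks * BLK) == 0 &&
	       memcmp(iv_a, iv_b, BLK) == 0;
}
```

Calling cbc_paths_agree() with block counts on either side of a multiple of 4 (for example 3, 4 and 7) exercises both the unrolled loop and the tail loop. The unrolled body does the same per-block work as the reference path, so any gain comes from fewer, wider memory accesses and a coarser granularity for a later yield point.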