From patchwork Sat Mar 10 15:22:07 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 131312
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org,
 Ard Biesheuvel, Dave Martin, Russell King - ARM Linux,
 Sebastian Andrzej Siewior, Mark Rutland, linux-rt-users@vger.kernel.org,
 Peter Zijlstra, Catalin Marinas, Will Deacon, Steven Rostedt,
 Thomas Gleixner
Subject: [PATCH v5 22/23] crypto: arm64/sm3-ce - yield NEON after every block of input
Date: Sat, 10 Mar 2018 15:22:07 +0000
Message-Id: <20180310152208.10369-23-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.15.1
In-Reply-To: <20180310152208.10369-1-ard.biesheuvel@linaro.org>
References: <20180310152208.10369-1-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-crypto@vger.kernel.org

Avoid excessive scheduling delays under a preemptible kernel by
conditionally yielding the NEON after every block of input.

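For readers who do not want to trace the labels in the assembly, the
control flow this patch introduces can be modelled in plain user-space C.
This is only a rough sketch under stated assumptions: model_state,
yield_pred, toy_block and model_transform are hypothetical names that do
not exist in the kernel, the mixing step is a toy and not SM3, and the
constant is an arbitrary placeholder. The sketch only shows that saving
the state before a yield, then reloading the state and the constants
afterwards, leaves the result unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64

/* Hypothetical model: `saved` stands in for the state memory at [x19]. */
struct model_state {
	uint32_t saved[8];
	unsigned int yields;	/* number of times the model yielded */
};

/* Hypothetical predicate standing in for the conditional-yield check. */
typedef bool (*yield_pred)(size_t blocks_left);

static bool never_yield(size_t blocks_left) { (void)blocks_left; return false; }
static bool always_yield(size_t blocks_left) { (void)blocks_left; return true; }

/* Toy mixing step. This is NOT the SM3 compression function. */
static void toy_block(uint32_t regs[8], const uint8_t *block, uint32_t k)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++) {
		uint32_t *r = &regs[i % 8];
		*r = (*r << 5 | *r >> 27) ^ (block[i] + k);
	}
}

static void model_transform(struct model_state *st, const uint8_t *in,
			    size_t nblocks, yield_pred should_yield)
{
	uint32_t regs[8];	/* plays the role of v8/v9 */
	uint32_t k;		/* plays the role of the constants in s13/s14 */

	assert(nblocks > 0);	/* like the assembly, one block runs before the check */
	memcpy(regs, st->saved, sizeof(regs));		/* load state */

reload:							/* label 0 in the patch */
	k = 0x12345678u;				/* placeholder constant, reloaded after a yield */
	for (;;) {					/* label 1 in the patch */
		toy_block(regs, in, k);
		in += BLOCK_SIZE;
		if (--nblocks == 0)
			break;				/* cbz w21, 2f */
		if (should_yield(nblocks)) {
			memcpy(st->saved, regs, sizeof(regs));	/* st1 before yielding */
			memset(regs, 0xff, sizeof(regs));	/* model: registers are lost */
			k = 0;
			st->yields++;
			memcpy(regs, st->saved, sizeof(regs));	/* ld1 after yielding */
			goto reload;			/* b 0b */
		}
	}
	memcpy(st->saved, regs, sizeof(regs));		/* label 2: save state */
}
```

Running model_transform() over the same input once with never_yield and
once with always_yield should leave identical contents in saved, with the
second run reporting one yield per block except the last.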
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/sm3-ce-core.S | 30 +++++++++++++++-----
 1 file changed, 23 insertions(+), 7 deletions(-)

-- 
2.15.1

diff --git a/arch/arm64/crypto/sm3-ce-core.S b/arch/arm64/crypto/sm3-ce-core.S
index 27169fe07a68..5a116c8d0cee 100644
--- a/arch/arm64/crypto/sm3-ce-core.S
+++ b/arch/arm64/crypto/sm3-ce-core.S
@@ -77,19 +77,25 @@
 	 */
 	.text
 ENTRY(sm3_ce_transform)
+	frame_push	3
+
+	mov		x19, x0
+	mov		x20, x1
+	mov		x21, x2
+
 	/* load state */
-	ld1		{v8.4s-v9.4s}, [x0]
+	ld1		{v8.4s-v9.4s}, [x19]
 	rev64		v8.4s, v8.4s
 	rev64		v9.4s, v9.4s
 	ext		v8.16b, v8.16b, v8.16b, #8
 	ext		v9.16b, v9.16b, v9.16b, #8
 
-	adr_l		x8, .Lt
+0:	adr_l		x8, .Lt
 	ldp		s13, s14, [x8]
 
 	/* load input */
-0:	ld1		{v0.16b-v3.16b}, [x1], #64
-	sub		w2, w2, #1
+1:	ld1		{v0.16b-v3.16b}, [x20], #64
+	sub		w21, w21, #1
 
 	mov		v15.16b, v8.16b
 	mov		v16.16b, v9.16b
@@ -125,14 +131,24 @@ CPU_LE(	rev32		v3.16b, v3.16b		)
 	eor		v9.16b, v9.16b, v16.16b
 
 	/* handled all input blocks? */
-	cbnz		w2, 0b
+	cbz		w21, 2f
+
+	if_will_cond_yield_neon
+	st1		{v8.4s-v9.4s}, [x19]
+	do_cond_yield_neon
+	ld1		{v8.4s-v9.4s}, [x19]
+	b		0b
+	endif_yield_neon
+
+	b		1b
 
 	/* save state */
-	rev64		v8.4s, v8.4s
+2:	rev64		v8.4s, v8.4s
 	rev64		v9.4s, v9.4s
 	ext		v8.16b, v8.16b, v8.16b, #8
 	ext		v9.16b, v9.16b, v9.16b, #8
-	st1		{v8.4s-v9.4s}, [x0]
+	st1		{v8.4s-v9.4s}, [x19]
+	frame_pop
 	ret
 ENDPROC(sm3_ce_transform)
 