From patchwork Mon Apr 30 16:18:30 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 134720
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au
Cc: linux-arm-kernel@lists.infradead.org, dave.martin@arm.com,
 will.deacon@arm.com, Ard Biesheuvel
Subject: [PATCH resend 10/10] crypto: arm64/sha512-ce - yield NEON after
 every block of input
Date: Mon, 30 Apr 2018 18:18:30 +0200
Message-Id: <20180430161830.14892-11-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180430161830.14892-1-ard.biesheuvel@linaro.org>
References: <20180430161830.14892-1-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

Avoid excessive scheduling delays under a preemptible kernel by
conditionally yielding the NEON after every block of input.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/sha512-ce-core.S | 27 +++++++++++++++-----
 1 file changed, 21 insertions(+), 6 deletions(-)

--
2.17.0

diff --git a/arch/arm64/crypto/sha512-ce-core.S b/arch/arm64/crypto/sha512-ce-core.S
index 7f3bca5c59a2..ce65e3abe4f2 100644
--- a/arch/arm64/crypto/sha512-ce-core.S
+++ b/arch/arm64/crypto/sha512-ce-core.S
@@ -107,17 +107,23 @@
 	 */
 	.text
 ENTRY(sha512_ce_transform)
+	frame_push	3
+
+	mov		x19, x0
+	mov		x20, x1
+	mov		x21, x2
+
 	/* load state */
-	ld1		{v8.2d-v11.2d}, [x0]
+0:	ld1		{v8.2d-v11.2d}, [x19]
 
 	/* load first 4 round constants */
 	adr_l		x3, .Lsha512_rcon
 	ld1		{v20.2d-v23.2d}, [x3], #64
 
 	/* load input */
-0:	ld1		{v12.2d-v15.2d}, [x1], #64
-	ld1		{v16.2d-v19.2d}, [x1], #64
-	sub		w2, w2, #1
+1:	ld1		{v12.2d-v15.2d}, [x20], #64
+	ld1		{v16.2d-v19.2d}, [x20], #64
+	sub		w21, w21, #1
 
 CPU_LE(	rev64		v12.16b, v12.16b	)
 CPU_LE(	rev64		v13.16b, v13.16b	)
@@ -196,9 +202,18 @@ CPU_LE(	rev64		v19.16b, v19.16b	)
 	add		v11.2d, v11.2d, v3.2d
 
 	/* handled all input blocks? */
-	cbnz		w2, 0b
+	cbz		w21, 3f
+
+	if_will_cond_yield_neon
+	st1		{v8.2d-v11.2d}, [x19]
+	do_cond_yield_neon
+	b		0b
+	endif_yield_neon
+
+	b		1b
 
 	/* store new state */
-3:	st1		{v8.2d-v11.2d}, [x0]
+3:	st1		{v8.2d-v11.2d}, [x19]
+	frame_pop
 	ret
 ENDPROC(sha512_ce_transform)
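
For readers who want to follow the new control flow without reading the
assembly, the loop can be modelled roughly in user-space C as below. This is
only a sketch: every model_* name is hypothetical and not a kernel API, the
block function is a placeholder rather than SHA-512, and the assertion that
at least one block is passed is an assumption read off the assembly (which
decrements the count before testing it). The comments map each step to the
numeric labels in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_BLOCK_SIZE 128	/* SHA-512 block size in bytes */

/* Hypothetical stand-in for the hash state kept in v8-v11. */
struct model_state {
	uint64_t h[8];
};

/* Hypothetical knobs replacing the kernel's "reschedule pending" check. */
static bool model_yield_pending;
static int model_yield_count;

static bool model_should_yield(void)
{
	return model_yield_pending;
}

static void model_yield(void)
{
	/* In the kernel the NEON unit would be released and reacquired here. */
	model_yield_count++;
}

/* Placeholder mixing function; this is NOT the SHA-512 compression. */
static void model_process_block(uint64_t regs[8], const uint8_t *block)
{
	for (int i = 0; i < MODEL_BLOCK_SIZE; i++)
		regs[i % 8] = regs[i % 8] * 31 + block[i];
}

static void model_transform(struct model_state *sst, const uint8_t *src,
			    int blocks)
{
	uint64_t regs[8];

	assert(blocks > 0);	/* assumption: caller passes at least one block */

	for (;;) {
		/* label 0: (re)load the state into "registers" */
		memcpy(regs, sst->h, sizeof(regs));

		for (;;) {
			/* label 1: load one input block and run the rounds */
			model_process_block(regs, src);
			src += MODEL_BLOCK_SIZE;

			if (--blocks == 0) {
				/* label 3: store the new state and return */
				memcpy(sst->h, regs, sizeof(regs));
				return;
			}

			if (model_should_yield()) {
				/* save the state, yield, then branch to 0 */
				memcpy(sst->h, regs, sizeof(regs));
				model_yield();
				break;
			}
			/* otherwise branch back to 1 */
		}
	}
}
```

In this model the state is written back before every yield and reloaded
afterwards, so the result should not depend on how often a yield occurs.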