From patchwork Sun Jan 27 09:16:52 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 156686
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, ebiggers@kernel.org, Ard Biesheuvel
Subject: [PATCH v3 1/4] crypto: arm/crct10dif - revert to C code for short inputs
Date: Sun, 27 Jan 2019 10:16:52 +0100
Message-Id: <20190127091655.6262-2-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190127091655.6262-1-ard.biesheuvel@linaro.org>
References: <20190127091655.6262-1-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

The SIMD routine ported from x86 used to have a special code path for
inputs < 16 bytes, which got lost somewhere along the way. Instead, the
current glue code aligns the input pointer to permit the NEON routine
to use special versions of the vld1 instructions that assume 16-byte
alignment, but this could result in inputs of less than 16 bytes being
passed in. This not only fails the new extended tests that Eric has
implemented, it also results in the code reading past the end of the
input, which could potentially result in crashes when dealing with less
than 16 bytes of input at the end of a page which is followed by an
unmapped page.

So update the glue code to only invoke the NEON routine if the input is
at least 16 bytes.

Reported-by: Eric Biggers
Reviewed-by: Eric Biggers
Fixes: 1d481f1cd892 ("crypto: arm/crct10dif - port x86 SSE implementation to ARM")
Cc: # v4.10+
Signed-off-by: Ard Biesheuvel
---
 arch/arm/crypto/crct10dif-ce-core.S | 14 ++++++------
 arch/arm/crypto/crct10dif-ce-glue.c | 23 +++++---------------
 2 files changed, 13 insertions(+), 24 deletions(-)

-- 
2.20.1

diff --git a/arch/arm/crypto/crct10dif-ce-core.S b/arch/arm/crypto/crct10dif-ce-core.S
index ce45ba0c0687..16019b5961e7 100644
--- a/arch/arm/crypto/crct10dif-ce-core.S
+++ b/arch/arm/crypto/crct10dif-ce-core.S
@@ -124,10 +124,10 @@ ENTRY(crc_t10dif_pmull)
 	vext.8		q10, qzr, q0, #4
 
 	// receive the initial 64B data, xor the initial crc value
-	vld1.64		{q0-q1}, [arg2, :128]!
-	vld1.64		{q2-q3}, [arg2, :128]!
-	vld1.64		{q4-q5}, [arg2, :128]!
-	vld1.64		{q6-q7}, [arg2, :128]!
+	vld1.64		{q0-q1}, [arg2]!
+	vld1.64		{q2-q3}, [arg2]!
+	vld1.64		{q4-q5}, [arg2]!
+	vld1.64		{q6-q7}, [arg2]!
 CPU_LE(	vrev64.8	q0, q0		)
 CPU_LE(	vrev64.8	q1, q1		)
 CPU_LE(	vrev64.8	q2, q2		)
@@ -167,7 +167,7 @@ CPU_LE(	vrev64.8	q7, q7		)
 _fold_64_B_loop:
 
 	.macro		fold64, reg1, reg2
-	vld1.64		{q11-q12}, [arg2, :128]!
+	vld1.64		{q11-q12}, [arg2]!
 
 	vmull.p64	q8, \reg1\()h, d21
 	vmull.p64	\reg1, \reg1\()l, d20
@@ -238,7 +238,7 @@ _16B_reduction_loop:
 	vmull.p64	q7, d15, d21
 	veor.8		q7, q7, q8
 
-	vld1.64		{q0}, [arg2, :128]!
+	vld1.64		{q0}, [arg2]!
 CPU_LE(	vrev64.8	q0, q0		)
 	vswp		d0, d1
 	veor.8		q7, q7, q0
@@ -335,7 +335,7 @@ _less_than_128:
 
 	vmov.i8		q0, #0
 	vmov		s3, arg1_low32		// get the initial crc value
-	vld1.64		{q7}, [arg2, :128]!
+	vld1.64		{q7}, [arg2]!
 CPU_LE(	vrev64.8	q7, q7		)
 	vswp		d14, d15
 	veor.8		q7, q7, q0
diff --git a/arch/arm/crypto/crct10dif-ce-glue.c b/arch/arm/crypto/crct10dif-ce-glue.c
index d428355cf38d..14c19c70a841 100644
--- a/arch/arm/crypto/crct10dif-ce-glue.c
+++ b/arch/arm/crypto/crct10dif-ce-glue.c
@@ -35,26 +35,15 @@ static int crct10dif_update(struct shash_desc *desc, const u8 *data,
 			    unsigned int length)
 {
 	u16 *crc = shash_desc_ctx(desc);
-	unsigned int l;
 
-	if (!may_use_simd()) {
-		*crc = crc_t10dif_generic(*crc, data, length);
+	if (length >= CRC_T10DIF_PMULL_CHUNK_SIZE && may_use_simd()) {
+		kernel_neon_begin();
+		*crc = crc_t10dif_pmull(*crc, data, length);
+		kernel_neon_end();
 	} else {
-		if (unlikely((u32)data % CRC_T10DIF_PMULL_CHUNK_SIZE)) {
-			l = min_t(u32, length, CRC_T10DIF_PMULL_CHUNK_SIZE -
-				  ((u32)data % CRC_T10DIF_PMULL_CHUNK_SIZE));
-
-			*crc = crc_t10dif_generic(*crc, data, l);
-
-			length -= l;
-			data += l;
-		}
-		if (length > 0) {
-			kernel_neon_begin();
-			*crc = crc_t10dif_pmull(*crc, data, length);
-			kernel_neon_end();
-		}
+		*crc = crc_t10dif_generic(*crc, data, length);
 	}
+
 	return 0;
 }
 
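
To make the failure mode concrete, here is a rough userspace model of the
length arithmetic only. It is not kernel code and is only a sketch: the names
old_neon_len(), new_neon_len() and CHUNK_SIZE are hypothetical, CHUNK_SIZE is
assumed to be 16 (taken from the "16 bytes" in the commit message as the value
of CRC_T10DIF_PMULL_CHUNK_SIZE), and may_use_simd() is assumed to return true.

```c
#include <stdint.h>

/* Assumption: stands in for CRC_T10DIF_PMULL_CHUNK_SIZE, taken to be 16. */
#define CHUNK_SIZE 16u

/*
 * Hypothetical model of the pre-patch glue with SIMD usable: the misaligned
 * head is consumed by the generic code, and the return value is the byte
 * count that would then be passed to crc_t10dif_pmull(). A return value of
 * 0 means the NEON routine is skipped.
 */
unsigned int old_neon_len(uintptr_t addr, unsigned int length)
{
	unsigned int misalign = (unsigned int)(addr % CHUNK_SIZE);

	if (misalign) {
		unsigned int head = CHUNK_SIZE - misalign;

		if (head > length)
			head = length;
		length -= head;
	}
	return length;
}

/*
 * Hypothetical model of the patched glue: the NEON routine sees either the
 * whole input (when it is at least CHUNK_SIZE bytes) or nothing at all.
 */
unsigned int new_neon_len(unsigned int length)
{
	return length >= CHUNK_SIZE ? length : 0;
}
```

Under these assumptions, old_neon_len(0x1008, 20) evaluates to 12: the first
8 bytes go to the generic code, and the remaining 12 are handed to a routine
that loads 16 bytes at a time. An aligned 5-byte input has the same problem,
since old_neon_len(0x1000, 5) is 5. With the patched condition,
new_neon_len(12) is 0, so such inputs stay on the C path.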