From patchwork Wed Oct 12 21:59:13 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614617
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 01/19] crypto: tcrypt - test crc32
Date: Wed, 12 Oct 2022 16:59:13 -0500
Message-Id: <20221012215931.3896-2-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org
Add self-test and speed tests for crc32, paralleling those offered
for crc32c and crct10dif.

Signed-off-by: Robert Elliott
---
 crypto/tcrypt.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index a82679b576bb..4426386dfb42 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1711,6 +1711,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("gcm(aria)");
 		break;
 
+	case 59:
+		ret += tcrypt_test("crc32");
+		break;
+
 	case 100:
 		ret += tcrypt_test("hmac(md5)");
 		break;
@@ -2317,6 +2321,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 			       generic_hash_speed_template);
 		if (mode > 300 && mode < 400) break;
 		fallthrough;
+	case 329:
+		test_hash_speed("crc32", sec, generic_hash_speed_template);
+		if (mode > 300 && mode < 400) break;
+		fallthrough;
 	case 399:
 		break;
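A note on exercising the new cases: tcrypt runs the selected tests when
the module is loaded and then deliberately fails to load, so each run is
a fresh `modprobe tcrypt mode=59` for the crc32 self-test added here, or
`modprobe tcrypt mode=329 sec=1` for the speed test (the sec parameter
selects seconds-based timing; without it the test runs in the default
cycle-counting mode). This assumes tcrypt is built as a module
(CONFIG_CRYPTO_TEST=m).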
From patchwork Wed Oct 12 21:59:14 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614619
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 02/19] crypto: tcrypt - test nhpoly1305
Date: Wed, 12 Oct 2022 16:59:14 -0500
Message-Id: <20221012215931.3896-3-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Add self-test mode for nhpoly1305.

Signed-off-by: Robert Elliott
---
 crypto/tcrypt.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 4426386dfb42..7a6a56751043 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1715,6 +1715,10 @@ static int do_test(const char *alg, u32 type, u32 mask, int m, u32 num_mb)
 		ret += tcrypt_test("crc32");
 		break;
 
+	case 60:
+		ret += tcrypt_test("nhpoly1305");
+		break;
+
 	case 100:
 		ret += tcrypt_test("hmac(md5)");
 		break;
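The matching invocation for this patch would be `modprobe tcrypt
mode=60`; unlike the crc32 patch, only a self-test case is added here,
with no corresponding speed-test case.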
From patchwork Wed Oct 12 21:59:15 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614904
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 03/19] crypto: tcrypt - reschedule during cycles speed tests
Date: Wed, 12 Oct 2022 16:59:15 -0500
Message-Id: <20221012215931.3896-4-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Commit 2af632996b89 ("crypto: tcrypt - reschedule during speed tests")
added cond_resched() calls to "Avoid RCU stalls in the case of
non-preemptible kernel and lengthy speed tests by rescheduling when
advancing from one block size to another." However, it only makes those
calls if the sec module parameter is used (i.e., run the speed test for
a certain number of seconds), not in the default "cycles" mode.
Expand those to also run in "cycles" mode to reduce the rate of rcu
stall warnings:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:

Suggested-by: Herbert Xu
Tested-by: Taehee Yoo
Signed-off-by: Robert Elliott
---
 crypto/tcrypt.c | 44 ++++++++++++++++++--------------------
 1 file changed, 18 insertions(+), 26 deletions(-)

diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index 7a6a56751043..c025ba26b663 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -408,14 +408,13 @@ static void test_mb_aead_speed(const char *algo, int enc, int secs,
 			}
 
-			if (secs) {
+			if (secs)
 				ret = test_mb_aead_jiffies(data, enc, bs,
 							   secs, num_mb);
-				cond_resched();
-			} else {
+			else
 				ret = test_mb_aead_cycles(data, enc, bs,
 							  num_mb);
-			}
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed return code=%d\n", e, ret);
@@ -661,13 +660,11 @@ static void test_aead_speed(const char *algo, int enc, unsigned int secs,
 					       bs + (enc ? 0 : authsize),
 					       iv);
 
-			if (secs) {
-				ret = test_aead_jiffies(req, enc, bs,
-							secs);
-				cond_resched();
-			} else {
+			if (secs)
+				ret = test_aead_jiffies(req, enc, bs, secs);
+			else
 				ret = test_aead_cycles(req, enc, bs);
-			}
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed return code=%d\n", e, ret);
@@ -917,14 +914,13 @@ static void test_ahash_speed_common(const char *algo, unsigned int secs,
 
 		ahash_request_set_crypt(req, sg, output, speed[i].plen);
 
-		if (secs) {
+		if (secs)
 			ret = test_ahash_jiffies(req, speed[i].blen,
 						 speed[i].plen, output, secs);
-			cond_resched();
-		} else {
+		else
 			ret = test_ahash_cycles(req, speed[i].blen,
 						speed[i].plen, output);
-		}
+		cond_resched();
 
 		if (ret) {
 			pr_err("hashing failed ret=%d\n", ret);
@@ -1184,15 +1180,14 @@ static void test_mb_skcipher_speed(const char *algo, int enc, int secs,
 						       cur->sg, bs, iv);
 			}
 
-			if (secs) {
+			if (secs)
 				ret = test_mb_acipher_jiffies(data, enc, bs,
 							      secs, num_mb);
-				cond_resched();
-			} else {
+			else
 				ret = test_mb_acipher_cycles(data, enc, bs,
 							     num_mb);
-			}
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed flags=%x\n", e,
@@ -1401,14 +1396,11 @@ static void test_skcipher_speed(const char *algo, int enc, unsigned int secs,
 
 			skcipher_request_set_crypt(req, sg, sg, bs, iv);
 
-			if (secs) {
-				ret = test_acipher_jiffies(req, enc,
-							   bs, secs);
-				cond_resched();
-			} else {
-				ret = test_acipher_cycles(req, enc,
-							  bs);
-			}
+			if (secs)
+				ret = test_acipher_jiffies(req, enc, bs, secs);
+			else
+				ret = test_acipher_cycles(req, enc, bs);
+			cond_resched();
 
 			if (ret) {
 				pr_err("%s() failed flags=%x\n", e,
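The pattern at work is the usual remedy for long-running loops on
non-preemptible kernels: without an explicit scheduling point, the loop
can hold the CPU long enough for RCU to report stalls, and an
unconditional cond_resched() per iteration provides that point whether
the test is timed in seconds or in cycles. A minimal sketch of the idea
(illustrative only, not tcrypt code; run_one_measurement() is a
placeholder):

	/* Yield once per iteration of a long measurement loop. */
	for (i = 0; i < iterations; i++) {
		run_one_measurement();	/* placeholder for the timed crypto work */
		cond_resched();		/* safe preemption point for the scheduler/RCU */
	}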
From patchwork Wed Oct 12 21:59:16 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614896
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 04/19] crypto: x86/sha - limit FPU preemption
Date: Wed, 12 Oct 2022 16:59:16 -0500
Message-Id: <20221012215931.3896-5-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so the CPU
core is unavailable for scheduling while the SIMD code runs. This leads
to "rcu_preempt detected expedited stalls" with stack dumps pointing to
the optimized hash function if the module is loaded and used a lot:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
    ...

For example, that can occur during boot with the stack trace pointing
to the sha512-x86 function if the system is set to use SHA-512 for
module signing.
The call trace includes:

    module_sig_check
    mod_verify_sig
    pkcs7_verify
    pkcs7_digest
    sha512_finup
    sha512_base_do_update

Fixes: 66be89515888 ("crypto: sha1 - SSSE3 based SHA1 implementation for x86-64")
Fixes: 8275d1aa6422 ("crypto: sha256 - Create module providing optimized SHA256 routines using SSSE3, AVX or AVX2 instructions.")
Fixes: 87de4579f92d ("crypto: sha512 - Create module providing optimized SHA512 routines using SSSE3, AVX or AVX2 instructions.")
Fixes: aa031b8f702e ("crypto: x86/sha512 - load based on CPU features")
Suggested-by: Herbert Xu
Reviewed-by: Tim Chen
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/sha1_ssse3_glue.c   | 32 ++++++++++++++++++++++++-----
 arch/x86/crypto/sha256_ssse3_glue.c | 32 ++++++++++++++++++++++++-----
 arch/x86/crypto/sha512_ssse3_glue.c | 32 ++++++++++++++++++++++++-----
 3 files changed, 81 insertions(+), 15 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 44340a1139e0..a9f5779b41ca 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -26,6 +26,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 static int sha1_update(struct shash_desc *desc, const u8 *data,
 		       unsigned int len, sha1_block_fn *sha1_xform)
 {
@@ -41,9 +43,18 @@ static int sha1_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha1_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha1_base_do_update(desc, data, len, sha1_xform);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha1_base_do_update(desc, data, chunk, sha1_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -54,9 +65,20 @@ static int sha1_finup(struct shash_desc *desc, const u8 *data,
 	if (!crypto_simd_usable())
 		return crypto_sha1_finup(desc, data, len, out);
 
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha1_base_do_update(desc, data, chunk, sha1_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha1_base_do_update(desc, data, len, sha1_xform);
 	sha1_base_do_finalize(desc, sha1_xform);
 	kernel_fpu_end();
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 3a5f6be7dbba..322c8aa907af 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -40,6 +40,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sha256_transform_ssse3(struct sha256_state *state,
 				       const u8 *data, int blocks);
 
@@ -58,9 +60,18 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha256_base_do_update(desc, data, len, sha256_xform);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha256_base_do_update(desc, data, chunk, sha256_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -71,9 +82,20 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data,
 	if (!crypto_simd_usable())
 		return crypto_sha256_finup(desc, data, len, out);
 
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha256_base_do_update(desc, data, chunk, sha256_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha256_base_do_update(desc, data, len, sha256_xform);
 	sha256_base_do_finalize(desc, sha256_xform);
 	kernel_fpu_end();
diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c
index 6d3b85e53d0e..fd5075a32613 100644
--- a/arch/x86/crypto/sha512_ssse3_glue.c
+++ b/arch/x86/crypto/sha512_ssse3_glue.c
@@ -39,6 +39,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sha512_transform_ssse3(struct sha512_state *state,
 				       const u8 *data, int blocks);
 
@@ -57,9 +59,18 @@ static int sha512_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sha512_state, state) != 0);
 
-	kernel_fpu_begin();
-	sha512_base_do_update(desc, data, len, sha512_xform);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha512_base_do_update(desc, data, chunk, sha512_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -70,9 +81,20 @@ static int sha512_finup(struct shash_desc *desc, const u8 *data,
 	if (!crypto_simd_usable())
 		return crypto_sha512_finup(desc, data, len, out);
 
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sha512_base_do_update(desc, data, chunk, sha512_xform);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
+
 	kernel_fpu_begin();
-	if (len)
-		sha512_base_do_update(desc, data, len, sha512_xform);
 	sha512_base_do_finalize(desc, sha512_xform);
 	kernel_fpu_end();
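All three glue files receive the same transformation; stripped of the
per-algorithm names, the shape is roughly the following (a sketch only,
with do_simd_update() standing in for the algorithm-specific helper):

	/* Sketch: at most 4096 bytes per kernel_fpu_begin/end section. */
	static void update_in_chunks(struct shash_desc *desc, const u8 *data,
				     unsigned int len)
	{
		do {
			unsigned int chunk = min(len, 4096U);

			if (chunk) {
				kernel_fpu_begin();	/* disables preemption */
				do_simd_update(desc, data, chunk);
				kernel_fpu_end();	/* preemption point between chunks */
			}

			len -= chunk;
			data += chunk;
		} while (len);
	}

The if (chunk) guard matters because the finup paths may be entered with
len == 0; without it, the do/while loop would call the SIMD helper on an
empty buffer.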
From patchwork Wed Oct 12 21:59:17 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614616
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 05/19] crypto: x86/crc - limit FPU preemption
Date: Wed, 12 Oct 2022 16:59:17 -0500
Message-Id: <20221012215931.3896-6-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so the CPU
core is unavailable for scheduling while running, leading to:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
    ...
Fixes: 78c37d191dd6 ("crypto: crc32 - add crc32 pclmulqdq implementation and wrappers for table implementation")
Fixes: 6a8ce1ef3940 ("crypto: crc32c - Optimize CRC32C calculation with PCLMULQDQ instruction")
Fixes: 0b95a7f85718 ("crypto: crct10dif - Glue code to cast accelerated CRCT10DIF assembly as a crypto transform")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/crc32-pclmul_asm.S      |  6 ++--
 arch/x86/crypto/crc32-pclmul_glue.c     | 19 ++++++++----
 arch/x86/crypto/crc32c-intel_glue.c     | 29 ++++++++++++++----
 arch/x86/crypto/crct10dif-pclmul_glue.c | 39 ++++++++++++++++++++-----
 4 files changed, 71 insertions(+), 22 deletions(-)

diff --git a/arch/x86/crypto/crc32-pclmul_asm.S b/arch/x86/crypto/crc32-pclmul_asm.S
index ca53e96996ac..9abd861636c3 100644
--- a/arch/x86/crypto/crc32-pclmul_asm.S
+++ b/arch/x86/crypto/crc32-pclmul_asm.S
@@ -72,15 +72,15 @@
 .text
 /**
  *      Calculate crc32
- *      BUF - buffer (16 bytes aligned)
- *      LEN - sizeof buffer (16 bytes aligned), LEN should be grater than 63
+ *      BUF - buffer - must be 16 bytes aligned
+ *      LEN - sizeof buffer - must be multiple of 16 bytes and greater than 63
  *      CRC - initial crc32
  *      return %eax crc32
  *      uint crc32_pclmul_le_16(unsigned char const *buffer,
  *                              size_t len, uint crc32)
  */
-SYM_FUNC_START(crc32_pclmul_le_16) /* buffer and buffer size are 16 bytes aligned */
+SYM_FUNC_START(crc32_pclmul_le_16)
 	movdqa  (BUF), %xmm1
 	movdqa  0x10(BUF), %xmm2
 	movdqa  0x20(BUF), %xmm3
diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c
index 98cf3b4e4c9f..38539c6edfe5 100644
--- a/arch/x86/crypto/crc32-pclmul_glue.c
+++ b/arch/x86/crypto/crc32-pclmul_glue.c
@@ -46,6 +46,8 @@
 #define SCALE_F			16L	/* size of xmm register */
 #define SCALE_F_MASK		(SCALE_F - 1)
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 u32 crc32_pclmul_le_16(unsigned char const *buffer, size_t len, u32 crc32);
 
 static u32 __attribute__((pure))
@@ -70,12 +72,19 @@ static u32 __attribute__((pure))
 	iquotient = len & (~SCALE_F_MASK);
 	iremainder = len & SCALE_F_MASK;
 
-	kernel_fpu_begin();
-	crc = crc32_pclmul_le_16(p, iquotient, crc);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(iquotient, FPU_BYTES);
+
+		kernel_fpu_begin();
+		crc = crc32_pclmul_le_16(p, chunk, crc);
+		kernel_fpu_end();
+
+		iquotient -= chunk;
+		p += chunk;
+	} while (iquotient >= PCLMUL_MIN_LEN);
 
-	if (iremainder)
-		crc = crc32_le(crc, p + iquotient, iremainder);
+	if (iquotient || iremainder)
+		crc = crc32_le(crc, p, iquotient + iremainder);
 
 	return crc;
 }
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index feccb5254c7e..ece620227057 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -41,6 +41,8 @@
  */
 #define CRC32C_PCL_BREAKEVEN	512
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage unsigned int crc_pcl(const u8 *buffer, int len,
 				unsigned int crc_init);
 #endif /* CONFIG_X86_64 */
@@ -158,9 +160,16 @@ static int crc32c_pcl_intel_update(struct shash_desc *desc, const u8 *data,
 	 * overcome kernel fpu state save/restore overhead
 	 */
 	if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*crcp = crc_pcl(data, len, *crcp);
-		kernel_fpu_end();
+		do {
+			unsigned int chunk = min(len, FPU_BYTES);
+
+			kernel_fpu_begin();
+			*crcp = crc_pcl(data, chunk, *crcp);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		} while (len);
 	} else
 		*crcp = crc32c_intel_le_hw(*crcp, data, len);
 	return 0;
@@ -170,9 +179,17 @@ static int __crc32c_pcl_intel_finup(u32 *crcp, const u8 *data, unsigned int len,
 				    u8 *out)
 {
 	if (len >= CRC32C_PCL_BREAKEVEN && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*(__le32 *)out = ~cpu_to_le32(crc_pcl(data, len, *crcp));
-		kernel_fpu_end();
+		do {
+			unsigned int chunk = min(len, FPU_BYTES);
+
+			kernel_fpu_begin();
+			*crcp = crc_pcl(data, chunk, *crcp);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		} while (len);
+		*(__le32 *)out = ~cpu_to_le32(*crcp);
 	} else
 		*(__le32 *)out = ~cpu_to_le32(crc32c_intel_le_hw(*crcp, data,
 								 len));
diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index 71291d5af9f4..54a537fc88ee 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -34,6 +34,10 @@
 #include
 #include
 
+#define PCLMUL_MIN_LEN 16U /* minimum size of buffer for crc_t10dif_pcl */
+
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage u16 crc_t10dif_pcl(u16 init_crc, const u8 *buf, size_t len);
 
 struct chksum_desc_ctx {
@@ -54,10 +58,19 @@ static int chksum_update(struct shash_desc *desc, const u8 *data,
 {
 	struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
 
-	if (length >= 16 && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		ctx->crc = crc_t10dif_pcl(ctx->crc, data, length);
-		kernel_fpu_end();
+	if (length >= PCLMUL_MIN_LEN && crypto_simd_usable()) {
+		do {
+			unsigned int chunk = min(length, FPU_BYTES);
+
+			kernel_fpu_begin();
+			ctx->crc = crc_t10dif_pcl(ctx->crc, data, chunk);
+			kernel_fpu_end();
+
+			length -= chunk;
+			data += chunk;
+		} while (length >= PCLMUL_MIN_LEN);
+		if (length)
+			ctx->crc = crc_t10dif_generic(ctx->crc, data, length);
 	} else
 		ctx->crc = crc_t10dif_generic(ctx->crc, data, length);
 	return 0;
@@ -73,10 +86,20 @@ static int chksum_final(struct shash_desc *desc, u8 *out)
 
 static int __chksum_finup(__u16 crc, const u8 *data, unsigned int len, u8 *out)
 {
-	if (len >= 16 && crypto_simd_usable()) {
-		kernel_fpu_begin();
-		*(__u16 *)out = crc_t10dif_pcl(crc, data, len);
-		kernel_fpu_end();
+	if (len >= PCLMUL_MIN_LEN && crypto_simd_usable()) {
+		do {
+			unsigned int chunk = min(len, FPU_BYTES);
+
+			kernel_fpu_begin();
+			crc = crc_t10dif_pcl(crc, data, chunk);
+			kernel_fpu_end();
+
+			len -= chunk;
+			data += chunk;
+		} while (len >= PCLMUL_MIN_LEN);
+		if (len)
+			crc = crc_t10dif_generic(crc, data, len);
+		*(__u16 *)out = crc;
 	} else
 		*(__u16 *)out = crc_t10dif_generic(crc, data, len);
 	return 0;
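A worked example of the new crc32 tail handling: for len = 8300,
iquotient = 8288 (16-byte aligned) and iremainder = 12. The loop
consumes 4096 + 4096 bytes, leaving iquotient = 96; since that is still
at least PCLMUL_MIN_LEN (64 in this file, defined before this patch), a
final 96-byte chunk runs and iquotient reaches 0, after which crc32_le()
folds in the 12 remainder bytes. Had the leftover aligned length dropped
below PCLMUL_MIN_LEN instead, the loop would exit with iquotient
nonzero, which is why the final crc32_le() call covers
iquotient + iremainder bytes rather than iremainder alone.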
From patchwork Wed Oct 12 21:59:18 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614903
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 06/19] crypto: x86/sm3 - limit FPU preemption
Date: Wed, 12 Oct 2022 16:59:18 -0500
Message-Id: <20221012215931.3896-7-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so the CPU
core is unavailable for scheduling while running, causing:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
    ...
Fixes: 930ab34d906d ("crypto: x86/sm3 - add AVX assembly implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/sm3_avx_glue.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/sm3_avx_glue.c b/arch/x86/crypto/sm3_avx_glue.c
index 661b6f22ffcd..ffb6d2f409ef 100644
--- a/arch/x86/crypto/sm3_avx_glue.c
+++ b/arch/x86/crypto/sm3_avx_glue.c
@@ -17,6 +17,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void sm3_transform_avx(struct sm3_state *state,
 				  const u8 *data, int nblocks);
 
@@ -37,9 +39,16 @@ static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
 	 */
 	BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0);
 
-	kernel_fpu_begin();
-	sm3_base_do_update(desc, data, len, sm3_transform_avx);
-	kernel_fpu_end();
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		kernel_fpu_begin();
+		sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+		kernel_fpu_end();
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 
 	return 0;
 }
@@ -57,9 +66,19 @@ static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
 		return 0;
 	}
 
+	do {
+		unsigned int chunk = min(len, FPU_BYTES);
+
+		if (chunk) {
+			kernel_fpu_begin();
+			sm3_base_do_update(desc, data, chunk, sm3_transform_avx);
+			kernel_fpu_end();
+		}
+
+		len -= chunk;
+		data += chunk;
+	} while (len);
 	kernel_fpu_begin();
-	if (len)
-		sm3_base_do_update(desc, data, len, sm3_transform_avx);
 	sm3_base_do_finalize(desc, sm3_transform_avx);
 	kernel_fpu_end();
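Note the asymmetry between the two hunks: the loop in sm3_avx_update()
has no if (chunk) guard, while sm3_avx_finup() keeps one. That appears
safe because earlier in sm3_avx_update() (not visible in the hunk),
short inputs, including len == 0, are routed to the generic sm3_update()
before the SIMD path is reached, whereas finup can legitimately arrive
at its loop with len == 0.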
From patchwork Wed Oct 12 21:59:19 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614618
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 07/19] crypto: x86/ghash - restructure FPU context saving
Date: Wed, 12 Oct 2022 16:59:19 -0500
Message-Id: <20221012215931.3896-8-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

Wrap each of the calls to clmul_ghash_update and clmul_ghash_mul in
its own set of kernel_fpu_begin and kernel_fpu_end calls, preparing
to limit the amount of data processed by each _update call to avoid
RCU stalls.

This is more like how polyval-clmulni_glue is structured.
Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 1f1a95f3dd0c..53aa286ec27f 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -80,7 +80,6 @@ static int ghash_update(struct shash_desc *desc,
 	struct ghash_ctx *ctx = crypto_shash_ctx(desc->tfm);
 	u8 *dst = dctx->buffer;
 
-	kernel_fpu_begin();
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
 		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
@@ -91,10 +90,14 @@ static int ghash_update(struct shash_desc *desc,
 		while (n--)
 			*pos++ ^= *src++;
 
-		if (!dctx->bytes)
+		if (!dctx->bytes) {
+			kernel_fpu_begin();
 			clmul_ghash_mul(dst, &ctx->shash);
+			kernel_fpu_end();
+		}
 	}
 
+	kernel_fpu_begin();
 	clmul_ghash_update(dst, src, srclen, &ctx->shash);
 	kernel_fpu_end();
From patchwork Wed Oct 12 21:59:20 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614901
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 08/19] crypto: x86/ghash - limit FPU preemption
Date: Wed, 12 Oct 2022 16:59:20 -0500
Message-Id: <20221012215931.3896-9-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org

As done by the ECB and CBC helpers in arch/x86/crypto/ecb_cbc_helpers.h,
limit the number of bytes processed between kernel_fpu_begin() and
kernel_fpu_end() calls.

Those functions call preempt_disable() and preempt_enable(), so the CPU
core is unavailable for scheduling while running, leading to:

    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
    ...

Fixes: 0e1227d356e9 ("crypto: ghash - Add PCLMULQDQ accelerated implementation")
Suggested-by: Herbert Xu
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 26 ++++++++++++++++------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 53aa286ec27f..a39fc405c7cf 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -23,6 +23,8 @@
 #define GHASH_BLOCK_SIZE	16
 #define GHASH_DIGEST_SIZE	16
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 void clmul_ghash_mul(char *dst, const u128 *shash);
 
 void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
@@ -82,7 +84,7 @@ static int ghash_update(struct shash_desc *desc,
 
 	if (dctx->bytes) {
 		int n = min(srclen, dctx->bytes);
-		u8 *pos = dst + (GHASH_BLOCK_SIZE - dctx->bytes);
+		u8 *pos = dst + GHASH_BLOCK_SIZE - dctx->bytes;
 
 		dctx->bytes -= n;
 		srclen -= n;
@@ -97,13 +99,23 @@ static int ghash_update(struct shash_desc *desc,
 		}
 	}
 
-	kernel_fpu_begin();
-	clmul_ghash_update(dst, src, srclen, &ctx->shash);
-	kernel_fpu_end();
+	while (srclen >= GHASH_BLOCK_SIZE) {
+		unsigned int fpulen = min(srclen, FPU_BYTES);
+
+		kernel_fpu_begin();
+		while (fpulen >= GHASH_BLOCK_SIZE) {
+			int n = min_t(unsigned int, fpulen, GHASH_BLOCK_SIZE);
+
+			clmul_ghash_update(dst, src, n, &ctx->shash);
+
+			srclen -= n;
+			fpulen -= n;
+			src += n;
+		}
+		kernel_fpu_end();
+	}
 
-	if (srclen & 0xf) {
-		src += srclen - (srclen & 0xf);
-		srclen &= 0xf;
+	if (srclen) {
 		dctx->bytes = GHASH_BLOCK_SIZE - srclen;
 		while (srclen--)
 			*dst++ ^= *src++;
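As posted, the inner loop caps each clmul_ghash_update() call at
GHASH_BLOCK_SIZE bytes: for srclen = 8192, the outer loop takes two
4096-byte FPU sections, and within each section clmul_ghash_update() is
invoked 256 times with 16 bytes apiece. Any sub-block tail (srclen < 16
after the loops) is XORed into dctx->buffer and recorded in dctx->bytes
so the next update or final call completes the block.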
From patchwork Wed Oct 12 21:59:21 2022
X-Patchwork-Submitter: "Elliott, Robert (Servers)"
X-Patchwork-Id: 614902
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 09/19] crypto: x86 - use common macro for FPU limit
Date: Wed, 12 Oct 2022 16:59:21 -0500
Message-Id: <20221012215931.3896-10-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
X-Mailing-List: linux-crypto@vger.kernel.org
Use a common macro name (FPU_BYTES) for the limit on the number of
bytes processed within kernel_fpu_begin and kernel_fpu_end, rather than
sometimes using SZ_4K (which is a signed value) and sometimes a magic
value of 4096U.

Use unsigned int rather than size_t for some of the arguments to avoid
typecasting via the min() macro.

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/blake2s-glue.c         |  7 ++++---
 arch/x86/crypto/chacha_glue.c          |  4 +++-
 arch/x86/crypto/nhpoly1305-avx2-glue.c |  4 +++-
 arch/x86/crypto/nhpoly1305-sse2-glue.c |  4 +++-
 arch/x86/crypto/poly1305_glue.c        | 25 ++++++++++++++-----------
 arch/x86/crypto/polyval-clmulni_glue.c |  5 +++--
 6 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
index aaba21230528..3054ee7fa219 100644
--- a/arch/x86/crypto/blake2s-glue.c
+++ b/arch/x86/crypto/blake2s-glue.c
@@ -16,6 +16,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state,
 				       const u8 *block, const size_t nblocks,
 				       const u32 inc);
@@ -29,8 +31,7 @@ static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_avx512);
 void blake2s_compress(struct blake2s_state *state, const u8 *block,
 		      size_t nblocks, const u32 inc)
 {
-	/* SIMD disables preemption, so relax after processing each page. */
-	BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8);
+	BUILD_BUG_ON(FPU_BYTES / BLAKE2S_BLOCK_SIZE < 8);
 
 	if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) {
 		blake2s_compress_generic(state, block, nblocks, inc);
@@ -39,7 +40,7 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
 
 	do {
 		const size_t blocks = min_t(size_t, nblocks,
-					    SZ_4K / BLAKE2S_BLOCK_SIZE);
+					    FPU_BYTES / BLAKE2S_BLOCK_SIZE);
 
 		kernel_fpu_begin();
 		if (IS_ENABLED(CONFIG_AS_AVX512) &&
diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c
index 7b3a1cf0984b..0d7e172862db 100644
--- a/arch/x86/crypto/chacha_glue.c
+++ b/arch/x86/crypto/chacha_glue.c
@@ -15,6 +15,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void chacha_block_xor_ssse3(u32 *state, u8 *dst, const u8 *src,
 				       unsigned int len, int nrounds);
 asmlinkage void chacha_4block_xor_ssse3(u32 *state, u8 *dst, const u8 *src,
@@ -147,7 +149,7 @@ void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes,
 		return chacha_crypt_generic(state, dst, src, bytes, nrounds);
 
 	do {
-		unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
+		unsigned int todo = min(bytes, FPU_BYTES);
 
 		kernel_fpu_begin();
 		chacha_dosimd(state, dst, src, todo, nrounds);
diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index 8ea5ab0f1ca7..59615ae95e86 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -13,6 +13,8 @@
 #include
 #include
 
+#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
+
 asmlinkage void nh_avx2(const u32 *key, const u8 *message, size_t message_len,
 			u8 hash[NH_HASH_BYTES]);
 
@@ -30,7 +32,7 @@ static int nhpoly1305_avx2_update(struct shash_desc *desc,
 		return crypto_nhpoly1305_update(desc, src, srclen);
 
 	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+		unsigned int n = min(srclen, FPU_BYTES);
 
 		kernel_fpu_begin();
 		crypto_nhpoly1305_update_helper(desc, src, n, _nh_avx2);
FPU_BYTES); kernel_fpu_begin(); crypto_nhpoly1305_update_helper(desc, src, n, _nh_avx2); diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c index 2b353d42ed13..bf91c375821a 100644 --- a/arch/x86/crypto/nhpoly1305-sse2-glue.c +++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c @@ -13,6 +13,8 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void nh_sse2(const u32 *key, const u8 *message, size_t message_len, u8 hash[NH_HASH_BYTES]); @@ -30,7 +32,7 @@ static int nhpoly1305_sse2_update(struct shash_desc *desc, return crypto_nhpoly1305_update(desc, src, srclen); do { - unsigned int n = min_t(unsigned int, srclen, SZ_4K); + unsigned int n = min(srclen, FPU_BYTES); kernel_fpu_begin(); crypto_nhpoly1305_update_helper(desc, src, n, _nh_sse2); diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c index 1dfb8af48a3c..3764301bdf1b 100644 --- a/arch/x86/crypto/poly1305_glue.c +++ b/arch/x86/crypto/poly1305_glue.c @@ -15,20 +15,24 @@ #include #include +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + asmlinkage void poly1305_init_x86_64(void *ctx, const u8 key[POLY1305_BLOCK_SIZE]); asmlinkage void poly1305_blocks_x86_64(void *ctx, const u8 *inp, - const size_t len, const u32 padbit); + const unsigned int len, + const u32 padbit); asmlinkage void poly1305_emit_x86_64(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], const u32 nonce[4]); asmlinkage void poly1305_emit_avx(void *ctx, u8 mac[POLY1305_DIGEST_SIZE], const u32 nonce[4]); -asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, const size_t len, - const u32 padbit); -asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, const size_t len, - const u32 padbit); +asmlinkage void poly1305_blocks_avx(void *ctx, const u8 *inp, + const unsigned int len, const u32 padbit); +asmlinkage void poly1305_blocks_avx2(void *ctx, const u8 *inp, + const unsigned int len, const u32 padbit); asmlinkage void poly1305_blocks_avx512(void *ctx, const u8 *inp, - const size_t len, const u32 padbit); + const unsigned int len, + const u32 padbit); static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx); static __ro_after_init DEFINE_STATIC_KEY_FALSE(poly1305_use_avx2); @@ -86,14 +90,13 @@ static void poly1305_simd_init(void *ctx, const u8 key[POLY1305_BLOCK_SIZE]) poly1305_init_x86_64(ctx, key); } -static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len, +static void poly1305_simd_blocks(void *ctx, const u8 *inp, unsigned int len, const u32 padbit) { struct poly1305_arch_internal *state = ctx; - /* SIMD disables preemption, so relax after processing each page. 
*/ - BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE || - SZ_4K % POLY1305_BLOCK_SIZE); + BUILD_BUG_ON(FPU_BYTES < POLY1305_BLOCK_SIZE || + FPU_BYTES % POLY1305_BLOCK_SIZE); if (!static_branch_likely(&poly1305_use_avx) || (len < (POLY1305_BLOCK_SIZE * 18) && !state->is_base2_26) || @@ -104,7 +107,7 @@ static void poly1305_simd_blocks(void *ctx, const u8 *inp, size_t len, } do { - const size_t bytes = min_t(size_t, len, SZ_4K); + const unsigned int bytes = min(len, FPU_BYTES); kernel_fpu_begin(); if (IS_ENABLED(CONFIG_AS_AVX512) && static_branch_likely(&poly1305_use_avx512)) diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c index b7664d018851..2502964afef6 100644 --- a/arch/x86/crypto/polyval-clmulni_glue.c +++ b/arch/x86/crypto/polyval-clmulni_glue.c @@ -29,6 +29,8 @@ #define NUM_KEY_POWERS 8 +#define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ + struct polyval_tfm_ctx { /* * These powers must be in the order h^8, ..., h^1. @@ -123,8 +125,7 @@ static int polyval_x86_update(struct shash_desc *desc, } while (srclen >= POLYVAL_BLOCK_SIZE) { - /* Allow rescheduling every 4K bytes. */ - nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE; + nblocks = min(srclen, FPU_BYTES) / POLYVAL_BLOCK_SIZE; internal_polyval_update(tctx, src, nblocks, dctx->buffer); srclen -= nblocks * POLYVAL_BLOCK_SIZE; src += nblocks * POLYVAL_BLOCK_SIZE; From patchwork Wed Oct 12 21:59:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614900 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57C01C43219 for ; Wed, 12 Oct 2022 22:01:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230052AbiJLWBR (ORCPT ); Wed, 12 Oct 2022 18:01:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37920 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229885AbiJLWAc (ORCPT ); Wed, 12 Oct 2022 18:00:32 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6A285726C; Wed, 12 Oct 2022 15:00:17 -0700 (PDT) Received: from pps.filterd (m0150244.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CGUv8R027242; Wed, 12 Oct 2022 22:00:00 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=+nE5VKyWUSEW08QSYxLtY7i4Gw38jTfKxO2sgAjNaZ0=; b=KzCk7OZduvaetOB2063hl/uSWKsOE7U5VUXfRM/HdmRdqgIw92enm50CxKmjqfnJT3/P d3ZIb5KHQ79gv1D0LRvFyF/E0YC/+FPCSgDNw57DoPis4MDDfe/Y6JkAVPFJ4fFByVMe rN0qCnLmEuIclTZiGeXCBkDwuEjvcgl6esotD4jGy5DIYGgaLQTSTRl/+vQ7VpAMq4gt UzQ1JCOn6FYKWwgmYhut4TkDVPVSnMcyzuWrDwcDYVYOblXPQV6/KNGxUlZN7W7ZmgTN WKDzX/i1258la1dmp9wBzzxHdVQAXphqHoSzKEI3lgS4UbNFFSTTTMyDujjh6qgn6fst 0A== Received: from p1lg14880.it.hpe.com (p1lg14880.it.hpe.com [16.230.97.201]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k615t2eqb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:00 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher 
 TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature
 RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested)
 by p1lg14880.it.hpe.com (Postfix) with ESMTPS id D03BD801AC0;
 Wed, 12 Oct 2022 21:59:59 +0000 (UTC)
Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36])
 by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 75AC8800344;
 Wed, 12 Oct 2022 21:59:59 +0000 (UTC)
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
 linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 10/19] crypto: x86/sha1, sha256 - load based on CPU features
Date: Wed, 12 Oct 2022 16:59:22 -0500
Message-Id: <20221012215931.3896-11-elliott@hpe.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
MIME-Version: 1.0
X-Proofpoint-GUID: MKcI_SpIQ6gI5fGSk5W54aFNBmuJTDxR
X-Proofpoint-ORIG-GUID: MKcI_SpIQ6gI5fGSk5W54aFNBmuJTDxR
X-HPE-SCL: -1
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1
 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 clxscore=1015 phishscore=0 mlxlogscore=952 lowpriorityscore=0 suspectscore=0
 adultscore=0 impostorscore=0 mlxscore=0 malwarescore=0 bulkscore=0
 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU
features"), add module aliases for the x86-optimized crypto modules
sha1 and sha256, based on CPU feature bits, so udev gets a chance to
load them later in the boot process, when the filesystems are all
available.

This commit covers the modules that created RCU stall issues due to
kernel_fpu_begin/kernel_fpu_end calls.
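In sketch form, the mechanism works like this (a minimal, illustrative
module; the demo_* names are placeholders, not code from this patch):

  #include <linux/module.h>
  #include <asm/cpu_device_id.h>

  /*
   * Each entry becomes a "x86cpu:..." modalias string in the module's
   * .modinfo section. Once the root filesystem is up, udev compares
   * those aliases against the CPU's feature flags and modprobes any
   * module that matches.
   */
  static const struct x86_cpu_id demo_cpu_ids[] = {
          X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL),
          {}
  };
  MODULE_DEVICE_TABLE(x86cpu, demo_cpu_ids);

  static int __init demo_mod_init(void)
  {
          /* Re-check at load time and decline quietly if absent. */
          if (!x86_match_cpu(demo_cpu_ids))
                  return -ENODEV;
          return 0;
  }
  module_init(demo_mod_init);

  static void __exit demo_mod_exit(void)
  {
  }
  module_exit(demo_mod_exit);

  MODULE_LICENSE("GPL");

The same table drives both the udev autoload (via the alias) and the
init-time check, so a module never registers algorithms on a CPU that
cannot run them.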
Signed-off-by: Robert Elliott --- arch/x86/crypto/sha1_ssse3_glue.c | 13 +++++++++++++ arch/x86/crypto/sha256_ssse3_glue.c | 13 +++++++++++++ 2 files changed, 26 insertions(+) diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index a9f5779b41ca..edffc33bd12e 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -310,6 +311,15 @@ static int register_sha1_ni(void) return 0; } +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static void unregister_sha1_ni(void) { if (boot_cpu_has(X86_FEATURE_SHA_NI)) @@ -326,6 +336,9 @@ static int __init sha1_ssse3_mod_init(void) if (register_sha1_ssse3()) goto fail; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (register_sha1_avx()) { unregister_sha1_ssse3(); goto fail; diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index 322c8aa907af..42e8cb1a6708 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -38,6 +38,7 @@ #include #include #include +#include #include #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -366,6 +367,15 @@ static struct shash_alg sha256_ni_algs[] = { { } } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int register_sha256_ni(void) { if (boot_cpu_has(X86_FEATURE_SHA_NI)) @@ -388,6 +398,9 @@ static inline void unregister_sha256_ni(void) { } static int __init sha256_ssse3_mod_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (register_sha256_ssse3()) goto fail; From patchwork Wed Oct 12 21:59:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614897 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C918FC4332F for ; Wed, 12 Oct 2022 22:02:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229672AbiJLWC1 (ORCPT ); Wed, 12 Oct 2022 18:02:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37114 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229704AbiJLWBa (ORCPT ); Wed, 12 Oct 2022 18:01:30 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E954A58089; Wed, 12 Oct 2022 15:00:17 -0700 (PDT) Received: from pps.filterd (m0134425.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CL7Xtx016427; Wed, 12 Oct 2022 22:00:02 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : 
 content-transfer-encoding; s=pps0720;
 bh=IHgdcqY57yff1zidfBdTpXQFa6PN/RGQ+7WerJQTk2I=;
 b=H5vD7ecinNIj7w7ZEZzSEMqJ1ASb1KmrQnVhBK0AaEuNmAog6qKRJtznVbURM6OVc5fR
 bz2v6JFOFPMnezZx6/Rp9sjP/i/5/c6znkW5fUgJ68jL8nZ8ahZ2o+UXZUu3EaFqZldD
 2mgIA2JlTgnRLqo8H9uYsg5Phq4KHbEt5QLi0kdnBTuvv/y4n0kpI+JE/S0qDxQBPjQ9
 2Nl/TgPTZWIFMhGDKCRxI0wk38n8hhTtnLHKbAPP+rc1pS62qJudX8/T6ojGk1dQLMx1
 zgNoTte2G05k7JZCIV3+/fd4zcFqPew4yTNE9Rw8aHP8/BK+2TQnlhlE/UJqfLS6w2lo
 Gw==
Received: from p1lg14881.it.hpe.com (p1lg14881.it.hpe.com [16.230.97.202])
 by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k657c8aas-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Wed, 12 Oct 2022 22:00:01 +0000
Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest
 SHA256) (No client certificate requested)
 by p1lg14881.it.hpe.com (Postfix) with ESMTPS id 11C2B804701;
 Wed, 12 Oct 2022 22:00:01 +0000 (UTC)
Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36])
 by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id A6C01806B7E;
 Wed, 12 Oct 2022 22:00:00 +0000 (UTC)
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
 linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 11/19] crypto: x86/crc - load based on CPU features
Date: Wed, 12 Oct 2022 16:59:23 -0500
Message-Id: <20221012215931.3896-12-elliott@hpe.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com>
 <20221012215931.3896-1-elliott@hpe.com>
MIME-Version: 1.0
X-Proofpoint-ORIG-GUID: MDQZ-3AN4UnDZwqsSrh64-VrKqaaLGYZ
X-Proofpoint-GUID: MDQZ-3AN4UnDZwqsSrh64-VrKqaaLGYZ
X-HPE-SCL: -1
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1
 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 priorityscore=1501 clxscore=1015 impostorscore=0 suspectscore=0 bulkscore=0
 adultscore=0 phishscore=0 spamscore=0 malwarescore=0 lowpriorityscore=0
 mlxscore=0 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2209130000 definitions=main-2210120138
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU
features"), these x86-optimized crypto modules already have module
aliases based on CPU feature bits: crc32, crc32c, and crct10dif.

Rename each module's unique device table data structure to a generic
name, so the code has the same pattern in all the modules.

Remove the print on a device table mismatch from crc32, which is not
present in the other modules. Modules are not supposed to print unless
they are active.

This commit covers the modules that created RCU stall issues due to
kernel_fpu_begin/kernel_fpu_end calls.
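Condensed from the diff below, the shared end state of all three module
init functions is (the shash_alg definition is elided):

  static const struct x86_cpu_id module_cpu_ids[] = {
          X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL),
          {}
  };
  MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

  static int __init demo_crc_mod_init(void)
  {
          /*
           * No pr_info() on a mismatch: udev may probe every module
           * whose alias matches, and a module that declines to load
           * should do so silently.
           */
          if (!x86_match_cpu(module_cpu_ids))
                  return -ENODEV;
          return crypto_register_shash(&alg);
  }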
Signed-off-by: Robert Elliott --- arch/x86/crypto/crc32-pclmul_glue.c | 9 +++------ arch/x86/crypto/crc32c-intel_glue.c | 6 +++--- arch/x86/crypto/crct10dif-pclmul_glue.c | 6 +++--- 3 files changed, 9 insertions(+), 12 deletions(-) diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c index 38539c6edfe5..d49a19dcee37 100644 --- a/arch/x86/crypto/crc32-pclmul_glue.c +++ b/arch/x86/crypto/crc32-pclmul_glue.c @@ -178,20 +178,17 @@ static struct shash_alg alg = { } }; -static const struct x86_cpu_id crc32pclmul_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, crc32pclmul_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crc32_pclmul_mod_init(void) { - - if (!x86_match_cpu(crc32pclmul_cpu_id)) { - pr_info("PCLMULQDQ-NI instructions are not detected.\n"); + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; - } return crypto_register_shash(&alg); } diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c index ece620227057..980c62929256 100644 --- a/arch/x86/crypto/crc32c-intel_glue.c +++ b/arch/x86/crypto/crc32c-intel_glue.c @@ -231,15 +231,15 @@ static struct shash_alg alg = { } }; -static const struct x86_cpu_id crc32c_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_XMM4_2, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, crc32c_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crc32c_intel_mod_init(void) { - if (!x86_match_cpu(crc32c_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; #ifdef CONFIG_X86_64 if (boot_cpu_has(X86_FEATURE_PCLMULQDQ)) { diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c index 54a537fc88ee..3b8e9394c40d 100644 --- a/arch/x86/crypto/crct10dif-pclmul_glue.c +++ b/arch/x86/crypto/crct10dif-pclmul_glue.c @@ -136,15 +136,15 @@ static struct shash_alg alg = { } }; -static const struct x86_cpu_id crct10dif_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, crct10dif_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init crct10dif_intel_mod_init(void) { - if (!x86_match_cpu(crct10dif_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; return crypto_register_shash(&alg); From patchwork Thu Nov 3 04:27:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 621138 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA6B6C433FE for ; Thu, 3 Nov 2022 04:32:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231280AbiKCEcK (ORCPT ); Thu, 3 Nov 2022 00:32:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56278 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231231AbiKCEbz (ORCPT ); Thu, 3 Nov 2022 00:31:55 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CFE681741A; Wed, 2 Nov 2022 21:30:16 -0700 (PDT) Received: from pps.filterd (m0134423.ppops.net [127.0.0.1]) by 
 mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A34ES1M005074;
 Thu, 3 Nov 2022 04:29:50 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to
 : cc : subject : date : message-id : in-reply-to : references : mime-version
 : content-transfer-encoding; s=pps0720;
 bh=WJ7ii9IOMwA1u/2F4iWjmOa0t9X/ieu9t/3cUuHQVnA=;
 b=OJOWxDIjNSfBdRAi1kareRQAhSx39aGRIvUq4iXSYGZqXs+hAo+vcZYLvHy+aQnAJ9Ot
 0Qkb2E4VETC5eSqlo1tW4/fgj1fA8TFtYNqgW3NkFKoZs5AFvzWUBbLmBXDCf1ljM4EX
 MOKINbWRw79djFZI4DeQtRqQo9zRXqDrKwpabCQxzTcr4pQiqq7zAoH012umMJjugMaR
 xVNyPJYDR8OGBsOUndLHOebR7YVbBw57XBDQFeaed5adtUlWOlXG0f3JpD7lIV+LoJUB
 Bld1Q1P5YONl6bSbltj0+5oOWJspgTOybz2/zQA7bqnA752XawOhd/z4qvYpIsDTy0oE
 IA==
Received: from p1lg14881.it.hpe.com (p1lg14881.it.hpe.com [16.230.97.202])
 by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3kktx6w26n-1
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Thu, 03 Nov 2022 04:29:49 +0000
Received: from p1lg14886.dc01.its.hpecorp.net (unknown [10.119.18.237])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest
 SHA256) (No client certificate requested)
 by p1lg14881.it.hpe.com (Postfix) with ESMTPS id 72EFE804720;
 Thu, 3 Nov 2022 04:28:20 +0000 (UTC)
Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36])
 by p1lg14886.dc01.its.hpecorp.net (Postfix) with ESMTP id 0031D802B9C;
 Thu, 3 Nov 2022 04:28:19 +0000 (UTC)
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net,
 tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org,
 Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org,
 linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v3 12/17] crypto: x86/sha - minimize time in FPU context
Date: Wed, 2 Nov 2022 23:27:35 -0500
Message-Id: <20221103042740.6556-13-elliott@hpe.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221103042740.6556-1-elliott@hpe.com>
References: <20221012215931.3896-1-elliott@hpe.com>
 <20221103042740.6556-1-elliott@hpe.com>
MIME-Version: 1.0
X-Proofpoint-GUID: Ze3RjlDYsSLA8wP4Jrm9lOb0APM3I4fX
X-Proofpoint-ORIG-GUID: Ze3RjlDYsSLA8wP4Jrm9lOb0APM3I4fX
X-HPE-SCL: -1
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1
 definitions=2022-11-02_15,2022-11-02_01,2022-06-22_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 phishscore=0 spamscore=0 adultscore=10 mlxscore=0 lowpriorityscore=0
 bulkscore=0 mlxlogscore=999 clxscore=1015 impostorscore=0 priorityscore=1501
 malwarescore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx
 scancount=1 engine=8.12.0-2210170000 definitions=main-2211030031
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

Narrow the kernel_fpu_begin()/kernel_fpu_end() sections to wrap just
the assembly functions, not the extra C code around them (which
includes several memcpy() calls). This reduces unnecessary time in FPU
context, in which the scheduler is prevented from preempting and the
RCU subsystem is kept from doing its work.
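Condensed from the diff below, the change moves the begin/end pair out
of the generic helpers and into a thin wrapper around each assembly
routine:

  /*
   * Before: every call into the generic helpers ran inside FPU
   * context, including their C bookkeeping (buffer management and
   * memcpy() of partial blocks):
   *
   *         kernel_fpu_begin();
   *         sha512_base_do_update(desc, data, chunk, sha512_xform);
   *         kernel_fpu_end();
   *
   * After: only the assembly body runs with the FPU claimed; the C
   * code around it stays preemptible:
   */
  static void fpu_sha512_transform_avx2(struct sha512_state *state,
                                        const u8 *data, int blocks)
  {
          kernel_fpu_begin();
          sha512_transform_rorx(state, data, blocks); /* asm only */
          kernel_fpu_end();
  }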
Example results measuring a boot, in which SHA-512 is used to check all
module signatures using finup() calls:

Before:
   calls    maxcycles      bpf   update    finup algorithm   module
======== ============ ======== ======== ======== =========== ==============
  168390      1233188    19456        0    19456 sha512-avx2 sha512_ssse3

After:
  182694      1007224    19456        0    19456 sha512-avx2 sha512_ssse3

That means it stayed in FPU context for 226k fewer clock cycles (which
is 102 microseconds on this system, 18% less).

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/sha1_ssse3_glue.c   | 82 ++++++++++++++++++++---------
 arch/x86/crypto/sha256_ssse3_glue.c | 67 ++++++++++++++++++-----
 arch/x86/crypto/sha512_ssse3_glue.c | 48 ++++++++++++-----
 3 files changed, 145 insertions(+), 52 deletions(-)

diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c
index 89aa5b787f2f..cd390083451f 100644
--- a/arch/x86/crypto/sha1_ssse3_glue.c
+++ b/arch/x86/crypto/sha1_ssse3_glue.c
@@ -34,6 +34,54 @@ static const unsigned int bytes_per_fpu_avx2 = 34 * 1024;
 static const unsigned int bytes_per_fpu_avx = 30 * 1024;
 static const unsigned int bytes_per_fpu_ssse3 = 26 * 1024;
 
+asmlinkage void sha1_transform_ssse3(struct sha1_state *state,
+				     const u8 *data, int blocks);
+
+asmlinkage void sha1_transform_avx(struct sha1_state *state,
+				   const u8 *data, int blocks);
+
+asmlinkage void sha1_transform_avx2(struct sha1_state *state,
+				    const u8 *data, int blocks);
+
+#ifdef CONFIG_AS_SHA1_NI
+asmlinkage void sha1_ni_transform(struct sha1_state *digest, const u8 *data,
+				  int rounds);
+#endif
+
+static void fpu_sha1_transform_ssse3(struct sha1_state *state,
+				     const u8 *data, int blocks)
+{
+	kernel_fpu_begin();
+	sha1_transform_ssse3(state, data, blocks);
+	kernel_fpu_end();
+}
+
+static void fpu_sha1_transform_avx(struct sha1_state *state,
+				   const u8 *data, int blocks)
+{
+	kernel_fpu_begin();
+	sha1_transform_avx(state, data, blocks);
+	kernel_fpu_end();
+}
+
+static void fpu_sha1_transform_avx2(struct sha1_state *state,
+				    const u8 *data, int blocks)
+{
+	kernel_fpu_begin();
+	sha1_transform_avx2(state, data, blocks);
+	kernel_fpu_end();
+}
+
+#ifdef CONFIG_AS_SHA1_NI
+static void fpu_sha1_transform_shani(struct sha1_state *state,
+				     const u8 *data, int blocks)
+{
+	kernel_fpu_begin();
+	sha1_ni_transform(state, data, blocks);
+	kernel_fpu_end();
+}
+#endif
+
 static int sha1_update(struct shash_desc *desc, const u8 *data,
 		       unsigned int len, unsigned int bytes_per_fpu,
 		       sha1_block_fn *sha1_xform)
@@ -53,9 +101,7 @@ static int sha1_update(struct shash_desc *desc, const u8 *data,
 	while (len) {
 		unsigned int chunk = min(len, bytes_per_fpu);
 
-		kernel_fpu_begin();
 		sha1_base_do_update(desc, data, chunk, sha1_xform);
-		kernel_fpu_end();
 
 		len -= chunk;
 		data += chunk;
@@ -74,36 +120,29 @@ static int sha1_finup(struct shash_desc *desc, const u8 *data,
 	while (len) {
 		unsigned int chunk = min(len, bytes_per_fpu);
 
-		kernel_fpu_begin();
 		sha1_base_do_update(desc, data, chunk, sha1_xform);
-		kernel_fpu_end();
 
 		len -= chunk;
 		data += chunk;
 	}
 
-	kernel_fpu_begin();
 	sha1_base_do_finalize(desc, sha1_xform);
-	kernel_fpu_end();
 
 	return sha1_base_finish(desc, out);
 }
 
-asmlinkage void sha1_transform_ssse3(struct sha1_state *state,
-				     const u8 *data, int blocks);
-
 static int sha1_ssse3_update(struct shash_desc *desc, const u8 *data,
 			     unsigned int len)
 {
 	return sha1_update(desc, data, len, bytes_per_fpu_ssse3,
-			   sha1_transform_ssse3);
+			   fpu_sha1_transform_ssse3);
 }
 
 static int sha1_ssse3_finup(struct shash_desc *desc, const u8 *data,
 			    unsigned int len, u8 *out)
 {
 	return sha1_finup(desc, data, len, bytes_per_fpu_ssse3, out,
-			  sha1_transform_ssse3);
+			  fpu_sha1_transform_ssse3);
 }
 
 /* Add padding and return the message digest. */
@@ -141,21 +180,18 @@ static void unregister_sha1_ssse3(void)
 	}
 }
 
-asmlinkage void sha1_transform_avx(struct sha1_state *state,
-				   const u8 *data, int blocks);
-
 static int sha1_avx_update(struct shash_desc *desc, const u8 *data,
 			   unsigned int len)
 {
 	return sha1_update(desc, data, len, bytes_per_fpu_avx,
-			   sha1_transform_avx);
+			   fpu_sha1_transform_avx);
 }
 
 static int sha1_avx_finup(struct shash_desc *desc, const u8 *data,
 			  unsigned int len, u8 *out)
 {
 	return sha1_finup(desc, data, len, bytes_per_fpu_avx, out,
-			  sha1_transform_avx);
+			  fpu_sha1_transform_avx);
 }
 
 static int sha1_avx_final(struct shash_desc *desc, u8 *out)
@@ -189,17 +225,14 @@ static void unregister_sha1_avx(void)
 
 #define SHA1_AVX2_BLOCK_OPTSIZE 4 /* optimal 4*64 bytes of SHA1 blocks */
 
-asmlinkage void sha1_transform_avx2(struct sha1_state *state,
-				    const u8 *data, int blocks);
-
 static void sha1_apply_transform_avx2(struct sha1_state *state,
 				      const u8 *data, int blocks)
 {
 	/* Select the optimal transform based on data block size */
 	if (blocks >= SHA1_AVX2_BLOCK_OPTSIZE)
-		sha1_transform_avx2(state, data, blocks);
+		fpu_sha1_transform_avx2(state, data, blocks);
 	else
-		sha1_transform_avx(state, data, blocks);
+		fpu_sha1_transform_avx(state, data, blocks);
 }
 
 static int sha1_avx2_update(struct shash_desc *desc, const u8 *data,
@@ -246,21 +279,18 @@ static void unregister_sha1_avx2(void)
 }
 
 #ifdef CONFIG_AS_SHA1_NI
-asmlinkage void sha1_ni_transform(struct sha1_state *digest, const u8 *data,
-				  int rounds);
-
 static int sha1_ni_update(struct shash_desc *desc, const u8 *data,
 			  unsigned int len)
 {
 	return sha1_update(desc, data, len, bytes_per_fpu_shani,
-			   sha1_ni_transform);
+			   fpu_sha1_transform_shani);
 }
 
 static int sha1_ni_finup(struct shash_desc *desc, const u8 *data,
 			 unsigned int len, u8 *out)
 {
 	return sha1_finup(desc, data, len, bytes_per_fpu_shani, out,
-			  sha1_ni_transform);
+			  fpu_sha1_transform_shani);
 }
 
 static int sha1_ni_final(struct shash_desc *desc, u8 *out)

diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index de320973e473..692d6f010a4d 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -51,6 +51,51 @@ static const unsigned int bytes_per_fpu_ssse3 = 11 * 1024;
 asmlinkage void sha256_transform_ssse3(struct sha256_state *state,
 				       const u8 *data, int blocks);
 
+asmlinkage void sha256_transform_avx(struct sha256_state *state,
+				     const u8 *data, int blocks);
+
+asmlinkage void sha256_transform_rorx(struct sha256_state *state,
+				      const u8 *data, int blocks);
+
+#ifdef CONFIG_AS_SHA256_NI
+asmlinkage void sha256_ni_transform(struct sha256_state *digest,
+				    const u8 *data, int rounds);
+#endif
+
+static void fpu_sha256_transform_ssse3(struct sha256_state *state,
+				       const u8 *data, int blocks)
+{
+	kernel_fpu_begin();
+	sha256_transform_ssse3(state, data, blocks);
+	kernel_fpu_end();
+}
+
+static void fpu_sha256_transform_avx(struct sha256_state *state,
+				     const u8 *data, int blocks)
+{
+	kernel_fpu_begin();
+	sha256_transform_avx(state, data, blocks);
+	kernel_fpu_end();
+}
+
+static void fpu_sha256_transform_avx2(struct sha256_state *state,
+				      const u8 *data, int blocks)
+{
+	kernel_fpu_begin();
+	sha256_transform_rorx(state, data, blocks);
+	kernel_fpu_end();
+}
+
+#ifdef CONFIG_AS_SHA256_NI
+static void fpu_sha256_transform_shani(struct sha256_state *state,
+				       const u8 *data, int blocks)
+{
+
kernel_fpu_begin(); + sha256_ni_transform(state, data, blocks); + kernel_fpu_end(); +} +#endif + static int _sha256_update(struct shash_desc *desc, const u8 *data, unsigned int len, unsigned int bytes_per_fpu, sha256_block_fn *sha256_xform) @@ -70,9 +115,7 @@ static int _sha256_update(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha256_base_do_update(desc, data, chunk, sha256_xform); - kernel_fpu_end(); len -= chunk; data += chunk; @@ -91,17 +134,13 @@ static int sha256_finup(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha256_base_do_update(desc, data, chunk, sha256_xform); - kernel_fpu_end(); len -= chunk; data += chunk; } - kernel_fpu_begin(); sha256_base_do_finalize(desc, sha256_xform); - kernel_fpu_end(); return sha256_base_finish(desc, out); } @@ -110,14 +149,14 @@ static int sha256_ssse3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return _sha256_update(desc, data, len, bytes_per_fpu_ssse3, - sha256_transform_ssse3); + fpu_sha256_transform_ssse3); } static int sha256_ssse3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha256_finup(desc, data, len, bytes_per_fpu_ssse3, - out, sha256_transform_ssse3); + out, fpu_sha256_transform_ssse3); } /* Add padding and return the message digest. */ @@ -177,14 +216,14 @@ static int sha256_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return _sha256_update(desc, data, len, bytes_per_fpu_avx, - sha256_transform_avx); + fpu_sha256_transform_avx); } static int sha256_avx_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha256_finup(desc, data, len, bytes_per_fpu_avx, - out, sha256_transform_avx); + out, fpu_sha256_transform_avx); } static int sha256_avx_final(struct shash_desc *desc, u8 *out) @@ -238,14 +277,14 @@ static int sha256_avx2_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return _sha256_update(desc, data, len, bytes_per_fpu_avx2, - sha256_transform_rorx); + fpu_sha256_transform_avx2); } static int sha256_avx2_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha256_finup(desc, data, len, bytes_per_fpu_avx2, - out, sha256_transform_rorx); + out, fpu_sha256_transform_avx2); } static int sha256_avx2_final(struct shash_desc *desc, u8 *out) @@ -300,14 +339,14 @@ static int sha256_ni_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return _sha256_update(desc, data, len, bytes_per_fpu_shani, - sha256_ni_transform); + fpu_sha256_transform_shani); } static int sha256_ni_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha256_finup(desc, data, len, bytes_per_fpu_shani, - out, sha256_ni_transform); + out, fpu_sha256_transform_shani); } static int sha256_ni_final(struct shash_desc *desc, u8 *out) diff --git a/arch/x86/crypto/sha512_ssse3_glue.c b/arch/x86/crypto/sha512_ssse3_glue.c index 3e96fe51f1a0..e2698545bf47 100644 --- a/arch/x86/crypto/sha512_ssse3_glue.c +++ b/arch/x86/crypto/sha512_ssse3_glue.c @@ -47,6 +47,36 @@ static const unsigned int bytes_per_fpu_ssse3 = 17 * 1024; asmlinkage void sha512_transform_ssse3(struct sha512_state *state, const u8 *data, int blocks); +asmlinkage void sha512_transform_avx(struct sha512_state *state, + const u8 *data, int blocks); + +asmlinkage void sha512_transform_rorx(struct sha512_state *state, + const u8 *data, int blocks); + +static 
void fpu_sha512_transform_ssse3(struct sha512_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha512_transform_ssse3(state, data, blocks); + kernel_fpu_end(); +} + +static void fpu_sha512_transform_avx(struct sha512_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha512_transform_avx(state, data, blocks); + kernel_fpu_end(); +} + +static void fpu_sha512_transform_avx2(struct sha512_state *state, + const u8 *data, int blocks) +{ + kernel_fpu_begin(); + sha512_transform_rorx(state, data, blocks); + kernel_fpu_end(); +} + static int sha512_update(struct shash_desc *desc, const u8 *data, unsigned int len, unsigned int bytes_per_fpu, sha512_block_fn *sha512_xform) @@ -66,9 +96,7 @@ static int sha512_update(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha512_base_do_update(desc, data, chunk, sha512_xform); - kernel_fpu_end(); len -= chunk; data += chunk; @@ -87,17 +115,13 @@ static int sha512_finup(struct shash_desc *desc, const u8 *data, while (len) { unsigned int chunk = min(len, bytes_per_fpu); - kernel_fpu_begin(); sha512_base_do_update(desc, data, chunk, sha512_xform); - kernel_fpu_end(); len -= chunk; data += chunk; } - kernel_fpu_begin(); sha512_base_do_finalize(desc, sha512_xform); - kernel_fpu_end(); return sha512_base_finish(desc, out); } @@ -106,14 +130,14 @@ static int sha512_ssse3_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha512_update(desc, data, len, bytes_per_fpu_ssse3, - sha512_transform_ssse3); + fpu_sha512_transform_ssse3); } static int sha512_ssse3_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha512_finup(desc, data, len, bytes_per_fpu_ssse3, - out, sha512_transform_ssse3); + out, fpu_sha512_transform_ssse3); } /* Add padding and return the message digest. */ @@ -172,14 +196,14 @@ static int sha512_avx_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha512_update(desc, data, len, bytes_per_fpu_avx, - sha512_transform_avx); + fpu_sha512_transform_avx); } static int sha512_avx_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha512_finup(desc, data, len, bytes_per_fpu_avx, - out, sha512_transform_avx); + out, fpu_sha512_transform_avx); } /* Add padding and return the message digest. */ @@ -234,14 +258,14 @@ static int sha512_avx2_update(struct shash_desc *desc, const u8 *data, unsigned int len) { return sha512_update(desc, data, len, bytes_per_fpu_avx2, - sha512_transform_rorx); + fpu_sha512_transform_avx2); } static int sha512_avx2_finup(struct shash_desc *desc, const u8 *data, unsigned int len, u8 *out) { return sha512_finup(desc, data, len, bytes_per_fpu_avx2, - out, sha512_transform_rorx); + out, fpu_sha512_transform_avx2); } /* Add padding and return the message digest. 
*/ From patchwork Thu Nov 3 04:27:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 621139 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23BFFC4332F for ; Thu, 3 Nov 2022 04:30:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230500AbiKCEae (ORCPT ); Thu, 3 Nov 2022 00:30:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56018 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230504AbiKCE3c (ORCPT ); Thu, 3 Nov 2022 00:29:32 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2E10211A0E; Wed, 2 Nov 2022 21:28:43 -0700 (PDT) Received: from pps.filterd (m0150245.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A30jDjk010497; Thu, 3 Nov 2022 04:28:22 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=JZPdVhi4LuEFBoclA7AVxB12a7DjlsW/T2E4udvE+wc=; b=pAi/VWKf5eteGEOJQrk6RCvNI+dQVb3F3/jjFE3PFJVGdPCaDWxNoaf1oPuwZWNJMsqq 255r3136kSC0+YShaH0ekSHiAi0d6/pm7ZXU34NWng1dngzh+1jGrMslXZYNqnMjU/HM 6xkaaNKhQy3XMwOt4K/ZZyZqiWTOJWXhojxwVgaiO2xRItBS7yf+KkfoF/HgyhELNoy6 S3UX/FeUp+mAGhlCLr+UYYqK1y+tIXlvqi5ZqPZp1MEFDOtlfLR7iHC1M+dhNgwezyLV 63DUag1SN/igq/xbJFbEUzF+g1wbphztbck5ExdN3OSkm9asM+iLpa/DkEM5E2Le2lyP HA== Received: from p1lg14881.it.hpe.com (p1lg14881.it.hpe.com [16.230.97.202]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3kkvxsc1px-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 03 Nov 2022 04:28:22 +0000 Received: from p1lg14886.dc01.its.hpecorp.net (unknown [10.119.18.237]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14881.it.hpe.com (Postfix) with ESMTPS id E39ED804722; Thu, 3 Nov 2022 04:28:21 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14886.dc01.its.hpecorp.net (Postfix) with ESMTP id 70C7A808EB4; Thu, 3 Nov 2022 04:28:21 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v3 13/17] crypto: x86/sha1, sha256 - load based on CPU features Date: Wed, 2 Nov 2022 23:27:36 -0500 Message-Id: <20221103042740.6556-14-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221103042740.6556-1-elliott@hpe.com> References: <20221012215931.3896-1-elliott@hpe.com> <20221103042740.6556-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-ORIG-GUID: jqAOFoThLNkvOvl7CBCA1bm1G-sdLHyl X-Proofpoint-GUID: jqAOFoThLNkvOvl7CBCA1bm1G-sdLHyl X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-02_15,2022-11-02_01,2022-06-22_01 X-Proofpoint-Spam-Details: 
rule=outbound_notspam policy=outbound score=0 clxscore=1015 mlxscore=0 spamscore=0 priorityscore=1501 impostorscore=0 lowpriorityscore=0 malwarescore=0 bulkscore=0 mlxlogscore=999 adultscore=0 phishscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211030031 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features"), add module aliases for x86-optimized crypto modules: sha1, sha256 based on CPU feature bits so udev gets a chance to load them later in the boot process when the filesystems are all running. Signed-off-by: Robert Elliott --- v3 put device table SHA_NI entries inside CONFIG_SHAn_NI ifdefs, ensure builds properly with arch/x86/Kconfig.assembler changed to not set CONFIG_AS_SHA*_NI --- arch/x86/crypto/sha1_ssse3_glue.c | 15 +++++++++++++++ arch/x86/crypto/sha256_ssse3_glue.c | 15 +++++++++++++++ 2 files changed, 30 insertions(+) diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index cd390083451f..7269beaa9291 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -24,6 +24,7 @@ #include #include #include +#include #include /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -326,12 +327,26 @@ static void unregister_sha1_ni(void) static inline void unregister_sha1_ni(void) { } #endif +static const struct x86_cpu_id module_cpu_ids[] = { +#ifdef CONFIG_AS_SHA1_NI + X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL), +#endif + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init sha1_ssse3_mod_init(void) { const char *feature_name; const char *driver_name = NULL; int ret; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + #ifdef CONFIG_AS_SHA1_NI /* SHA-NI */ if (boot_cpu_has(X86_FEATURE_SHA_NI)) { diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index 692d6f010a4d..5ce42f1d228b 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -38,6 +38,7 @@ #include #include #include +#include #include /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -397,6 +398,17 @@ static void unregister_sha256_ni(void) static inline void unregister_sha256_ni(void) { } #endif +static const struct x86_cpu_id module_cpu_ids[] = { +#ifdef CONFIG_AS_SHA256_NI + X86_MATCH_FEATURE(X86_FEATURE_SHA_NI, NULL), +#endif + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init sha256_ssse3_mod_init(void) { const char *feature_name; @@ -404,6 +416,9 @@ static int __init sha256_ssse3_mod_init(void) const char *driver_name2 = NULL; int ret; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + #ifdef CONFIG_AS_SHA256_NI /* SHA-NI */ if (boot_cpu_has(X86_FEATURE_SHA_NI)) { From patchwork Wed Oct 12 21:59:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Elliott, Robert \(Servers\)" X-Patchwork-Id: 614899 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
AE4FAC433FE for ; Wed, 12 Oct 2022 22:01:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229941AbiJLWBr (ORCPT ); Wed, 12 Oct 2022 18:01:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230015AbiJLWAm (ORCPT ); Wed, 12 Oct 2022 18:00:42 -0400 Received: from mx0b-002e3701.pphosted.com (mx0b-002e3701.pphosted.com [148.163.143.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B51A22937D; Wed, 12 Oct 2022 15:00:20 -0700 (PDT) Received: from pps.filterd (m0134423.ppops.net [127.0.0.1]) by mx0b-002e3701.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 29CKwftF006351; Wed, 12 Oct 2022 22:00:06 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hpe.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pps0720; bh=r+X+jLySvqDiKiHMsNUb6PPWVOp5GHi0Deu+a2h35I8=; b=UiswrlZFgQtCZIAdMZNiMGZr/q6aI8bgKtaV37CyrdXPdKzrgww2NYex3vIIll5zpqdH k9Hpf8/OT8JZvtnTtUFokJ6ms/pRVLpj/1EdxQS/veTxjpFCKcAt8S9aesQnZi2iCnvr ksdiQ1kGFc1WdedR8EaOTSrAnzdaWii8d+f5TSrjz8ogGCSC1d+rhJ8h59ZdlGyP68NI 8dvB1nXDtk12JRVOh2wKwwvDZGsoDclvgAfA3ZwYGMxhDt88JrSxIScrJDWE3yMgXGUV d1tFFpTyNY8D1zKpvX/cF/mxqYvWugErNSrFS/joWPpcAnL1jzAN1TYVvJPiZuz+QW9u AA== Received: from p1lg14878.it.hpe.com (p1lg14878.it.hpe.com [16.230.97.204]) by mx0b-002e3701.pphosted.com (PPS) with ESMTPS id 3k653c8dd4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 12 Oct 2022 22:00:06 +0000 Received: from p1lg14885.dc01.its.hpecorp.net (unknown [10.119.18.236]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by p1lg14878.it.hpe.com (Postfix) with ESMTPS id B508313943; Wed, 12 Oct 2022 22:00:05 +0000 (UTC) Received: from adevxp033-sys.us.rdlabs.hpecorp.net (unknown [16.231.227.36]) by p1lg14885.dc01.its.hpecorp.net (Postfix) with ESMTP id 616D5807DB8; Wed, 12 Oct 2022 22:00:05 +0000 (UTC) From: Robert Elliott To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Robert Elliott Subject: [PATCH v2 14/19] crypto: x86 - load based on CPU features Date: Wed, 12 Oct 2022 16:59:26 -0500 Message-Id: <20221012215931.3896-15-elliott@hpe.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221012215931.3896-1-elliott@hpe.com> References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com> MIME-Version: 1.0 X-Proofpoint-GUID: 7z1WUvCAZ4qfCWocIMKTMMooH90dGYAE X-Proofpoint-ORIG-GUID: 7z1WUvCAZ4qfCWocIMKTMMooH90dGYAE X-HPE-SCL: -1 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-10-12_11,2022-10-12_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 suspectscore=0 lowpriorityscore=0 clxscore=1015 impostorscore=0 mlxscore=0 spamscore=0 mlxlogscore=999 adultscore=0 bulkscore=0 priorityscore=1501 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2209130000 definitions=main-2210120138 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org x86 optimized crypto modules built as modules rather than built-in to the kernel end up as .ko files in the filesystem, e.g., in 
/usr/lib/modules. If the filesystem driver itself is a module, these
might not be available when the crypto API is initialized, resulting in
the generic implementation being used (e.g., sha512_transform rather
than sha512_transform_avx2).

In one test case, CPU utilization in the sha512 function dropped from
15.34% to 7.18% after forcing the optimized module to load.

Set module aliases for the x86-optimized crypto modules based on CPU
feature bits so udev gets a chance to load them later in the boot
process, when the filesystems are all available.

For example, with sha256, sha512, aesni_intel, and blake2s configured
as built-in and the rest configured as modules:

[ 13.749145] sha256_ssse3: CPU-optimized crypto module loaded (SSSE3=no, AVX=no, AVX2=yes, SHA-NI=no)
[ 13.758502] sha512_ssse3: CPU-optimized crypto module loaded (SSSE3=no, AVX=no, AVX2=yes)
[ 13.766939] libblake2s_x86_64: CPU-optimized crypto module loaded (SSSE3=yes, AVX512=yes)
[ 16.794502] aesni_intel: CPU-optimized crypto module loaded (GCM SSE=no, AVX=yes, AVX2=yes)(CTR AVX=yes)
...
[ 18.160648] Run /init as init process
...
[ 20.073484] twofish_x86_64: CPU-optimized crypto module loaded
[ 23.974029] serpent_sse2_x86_64: CPU-optimized crypto module loaded
[ 24.080749] serpent_avx_x86_64: CPU-optimized crypto module loaded
[ 24.187148] serpent_avx2: CPU-optimized crypto module loaded
[ 24.358980] des3_ede_x86_64: CPU-optimized crypto module loaded
[ 24.459257] camellia_x86_64: CPU-optimized crypto module loaded
[ 24.548487] camellia_aesni_avx_x86_64: CPU-optimized crypto module loaded
[ 24.630777] camellia_aesni_avx2: CPU-optimized crypto module loaded
[ 24.957134] blowfish_x86_64: CPU-optimized crypto module loaded
[ 25.063537] aegis128_aesni: CPU-optimized crypto module loaded
[ 25.174560] chacha_x86_64: CPU-optimized crypto module loaded (AVX2=yes, AVX512=yes)
[ 25.270084] sha1_ssse3: CPU-optimized crypto module loaded (SSSE3=no, AVX=no, AVX2=yes, SHA-NI=no)
[ 25.531724] ghash_clmulni_intel: CPU-optimized crypto module loaded
[ 25.596316] crc32c_intel: CPU-optimized crypto module loaded (PCLMULQDQ=yes)
[ 25.661693] crc32_pclmul: CPU-optimized crypto module loaded
[ 25.696388] crct10dif_pclmul: CPU-optimized crypto module loaded
[ 25.742040] poly1305_x86_64: CPU-optimized crypto module loaded (AVX=yes, AVX2=yes, AVX512=no)
[ 25.841364] nhpoly1305_avx2: CPU-optimized crypto module loaded
[ 25.856401] curve25519_x86_64: CPU-optimized crypto module loaded (ADX=yes)
[ 25.866615] sm3_avx_x86_64: CPU-optimized crypto module loaded

This commit covers the modules that did not create RCU stall issues
due to kernel_fpu_begin/kernel_fpu_end calls.
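Modules with no hard feature requirement still get a device table, in
sketch form (condensed from the blowfish, camellia, and des3_ede hunks
below; the demo_* names are illustrative):

  /*
   * X86_FEATURE_ANY matches every x86 CPU, so udev always loads the
   * module; finer selection (blacklists, static branches) stays in
   * the init function.
   */
  static const struct x86_cpu_id module_cpu_ids[] = {
          X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
          {}
  };
  MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);

  static __ro_after_init DEFINE_STATIC_KEY_FALSE(demo_use_avx2);

  static int __init demo_init(void)
  {
          if (!x86_match_cpu(module_cpu_ids))
                  return -ENODEV;
          /* e.g., pick the fastest implementation present */
          if (boot_cpu_has(X86_FEATURE_AVX2))
                  static_branch_enable(&demo_use_avx2);
          return 0;
  }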
Signed-off-by: Robert Elliott --- arch/x86/crypto/aegis128-aesni-glue.c | 9 +++++++++ arch/x86/crypto/aesni-intel_glue.c | 7 +++---- arch/x86/crypto/blake2s-glue.c | 11 ++++++++++- arch/x86/crypto/blowfish_glue.c | 10 ++++++++++ arch/x86/crypto/camellia_aesni_avx2_glue.c | 12 ++++++++++++ arch/x86/crypto/camellia_aesni_avx_glue.c | 11 +++++++++++ arch/x86/crypto/camellia_glue.c | 9 +++++++++ arch/x86/crypto/cast5_avx_glue.c | 10 ++++++++++ arch/x86/crypto/cast6_avx_glue.c | 10 ++++++++++ arch/x86/crypto/chacha_glue.c | 12 ++++++++++-- arch/x86/crypto/curve25519-x86_64.c | 12 +++++++++++- arch/x86/crypto/des3_ede_glue.c | 10 ++++++++++ arch/x86/crypto/nhpoly1305-avx2-glue.c | 10 ++++++++++ arch/x86/crypto/nhpoly1305-sse2-glue.c | 10 ++++++++++ arch/x86/crypto/poly1305_glue.c | 12 ++++++++++++ arch/x86/crypto/serpent_avx2_glue.c | 10 ++++++++++ arch/x86/crypto/serpent_avx_glue.c | 10 ++++++++++ arch/x86/crypto/serpent_sse2_glue.c | 10 ++++++++++ arch/x86/crypto/sm4_aesni_avx2_glue.c | 12 ++++++++++++ arch/x86/crypto/sm4_aesni_avx_glue.c | 11 +++++++++++ arch/x86/crypto/twofish_avx_glue.c | 10 ++++++++++ arch/x86/crypto/twofish_glue.c | 10 ++++++++++ arch/x86/crypto/twofish_glue_3way.c | 10 ++++++++++ 23 files changed, 230 insertions(+), 8 deletions(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index 4623189000d8..9e4ba031704d 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -263,10 +263,19 @@ static struct aead_alg crypto_aegis128_aesni_alg = { } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AES, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_aead_alg *simd_alg; static int __init crypto_aegis128_aesni_module_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!boot_cpu_has(X86_FEATURE_XMM2) || !boot_cpu_has(X86_FEATURE_AES) || !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c index a5b0cb3efeba..4a530a558436 100644 --- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -36,7 +36,6 @@ #include #include - #define AESNI_ALIGN 16 #define AESNI_ALIGN_ATTR __attribute__ ((__aligned__(AESNI_ALIGN))) #define AES_BLOCK_MASK (~(AES_BLOCK_SIZE - 1)) @@ -1228,17 +1227,17 @@ static struct aead_alg aesni_aeads[0]; static struct simd_aead_alg *aesni_simd_aeads[ARRAY_SIZE(aesni_aeads)]; -static const struct x86_cpu_id aesni_cpu_id[] = { +static const struct x86_cpu_id module_cpu_ids[] = { X86_MATCH_FEATURE(X86_FEATURE_AES, NULL), {} }; -MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init aesni_init(void) { int err; - if (!x86_match_cpu(aesni_cpu_id)) + if (!x86_match_cpu(module_cpu_ids)) return -ENODEV; #ifdef CONFIG_X86_64 if (boot_cpu_has(X86_FEATURE_AVX2)) { diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index 3054ee7fa219..5153bb423dbe 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -10,7 +10,7 @@ #include #include #include - +#include #include #include #include @@ -56,8 +56,17 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block, } EXPORT_SYMBOL(blake2s_compress); +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init blake2s_mod_init(void) { + if 
(!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (boot_cpu_has(X86_FEATURE_SSSE3)) static_branch_enable(&blake2s_use_ssse3); diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c index 019c64c1340a..4c0ead71b198 100644 --- a/arch/x86/crypto/blowfish_glue.c +++ b/arch/x86/crypto/blowfish_glue.c @@ -15,6 +15,7 @@ #include #include #include +#include /* regular block cipher functions */ asmlinkage void __blowfish_enc_blk(struct bf_ctx *ctx, u8 *dst, const u8 *src, @@ -303,10 +304,19 @@ static int force; module_param(force, int, 0); MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist"); +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init blowfish_init(void) { int err; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!force && is_blacklisted_cpu()) { printk(KERN_INFO "blowfish-x86_64: performance on this CPU " diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c index e7e4d64e9577..8e3ac5be7cf6 100644 --- a/arch/x86/crypto/camellia_aesni_avx2_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include "camellia.h" #include "ecb_cbc_helpers.h" @@ -98,12 +99,23 @@ static struct skcipher_alg camellia_algs[] = { }, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AES, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)]; static int __init camellia_aesni_init(void) { const char *feature_name; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_AES) || diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c index c7ccf63e741e..54fcd86160ff 100644 --- a/arch/x86/crypto/camellia_aesni_avx_glue.c +++ b/arch/x86/crypto/camellia_aesni_avx_glue.c @@ -11,6 +11,7 @@ #include #include #include +#include #include "camellia.h" #include "ecb_cbc_helpers.h" @@ -98,12 +99,22 @@ static struct skcipher_alg camellia_algs[] = { } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AES, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_skcipher_alg *camellia_simd_algs[ARRAY_SIZE(camellia_algs)]; static int __init camellia_aesni_init(void) { const char *feature_name; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!boot_cpu_has(X86_FEATURE_AVX) || !boot_cpu_has(X86_FEATURE_AES) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) { diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c index d45e9c0c42ac..e21d2d5b68f9 100644 --- a/arch/x86/crypto/camellia_glue.c +++ b/arch/x86/crypto/camellia_glue.c @@ -1377,10 +1377,19 @@ static int force; module_param(force, int, 0); MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist"); +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init camellia_init(void) { int err; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!force && is_blacklisted_cpu()) { printk(KERN_INFO 
"camellia-x86_64: performance on this CPU " diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c index 3976a87f92ad..bdc3c763334c 100644 --- a/arch/x86/crypto/cast5_avx_glue.c +++ b/arch/x86/crypto/cast5_avx_glue.c @@ -13,6 +13,7 @@ #include #include #include +#include #include "ecb_cbc_helpers.h" @@ -93,12 +94,21 @@ static struct skcipher_alg cast5_algs[] = { } }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_skcipher_alg *cast5_simd_algs[ARRAY_SIZE(cast5_algs)]; static int __init cast5_init(void) { const char *feature_name; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { pr_info("CPU feature '%s' is not supported.\n", feature_name); diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c index 7e2aea372349..addca34b3511 100644 --- a/arch/x86/crypto/cast6_avx_glue.c +++ b/arch/x86/crypto/cast6_avx_glue.c @@ -15,6 +15,7 @@ #include #include #include +#include #include "ecb_cbc_helpers.h" @@ -93,12 +94,21 @@ static struct skcipher_alg cast6_algs[] = { }, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static struct simd_skcipher_alg *cast6_simd_algs[ARRAY_SIZE(cast6_algs)]; static int __init cast6_init(void) { const char *feature_name; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) { pr_info("CPU feature '%s' is not supported.\n", feature_name); diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c index 0d7e172862db..7275cae3380d 100644 --- a/arch/x86/crypto/chacha_glue.c +++ b/arch/x86/crypto/chacha_glue.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */ @@ -278,10 +279,17 @@ static struct skcipher_alg algs[] = { }, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int __init chacha_simd_mod_init(void) { - if (!boot_cpu_has(X86_FEATURE_SSSE3)) - return 0; + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + static_branch_enable(&chacha_use_simd); diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c index d55fa9e9b9e6..7fe395dfa79d 100644 --- a/arch/x86/crypto/curve25519-x86_64.c +++ b/arch/x86/crypto/curve25519-x86_64.c @@ -12,7 +12,7 @@ #include #include #include - +#include #include #include @@ -1697,9 +1697,19 @@ static struct kpp_alg curve25519_alg = { .max_size = curve25519_max_size, }; +static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_ADX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); static int __init curve25519_mod_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (boot_cpu_has(X86_FEATURE_BMI2) && boot_cpu_has(X86_FEATURE_ADX)) static_branch_enable(&curve25519_use_bmi2_adx); else diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c index abb8b1fe123b..168cac5c6ca6 100644 --- a/arch/x86/crypto/des3_ede_glue.c +++ b/arch/x86/crypto/des3_ede_glue.c @@ -15,6 +15,7 @@ 
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 struct des3_ede_x86_ctx {
 	struct des3_ede_ctx enc;
@@ -354,10 +355,19 @@ static int force;
 module_param(force, int, 0);
 MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist");
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init des3_ede_x86_init(void)
 {
 	int err;
 
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!force && is_blacklisted_cpu()) {
 		pr_info("des3_ede-x86_64: performance on this CPU would be suboptimal: disabling des3_ede-x86_64.\n");
 		return -ENODEV;
diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index 59615ae95e86..a8046334ddca 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 
 #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -57,8 +58,17 @@ static struct shash_alg nhpoly1305_alg = {
 	.descsize = sizeof(struct nhpoly1305_state),
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init nhpoly1305_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX2) ||
 	    !boot_cpu_has(X86_FEATURE_OSXSAVE))
 		return -ENODEV;
diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index bf91c375821a..cdbe5df00927 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 
 #define FPU_BYTES 4096U /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -57,8 +58,17 @@ static struct shash_alg nhpoly1305_alg = {
 	.descsize = sizeof(struct nhpoly1305_state),
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_XMM2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init nhpoly1305_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_XMM2))
 		return -ENODEV;
diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
index 3764301bdf1b..3e6ff505cd26 100644
--- a/arch/x86/crypto/poly1305_glue.c
+++ b/arch/x86/crypto/poly1305_glue.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include
 
@@ -260,8 +261,19 @@ static struct shash_alg alg = {
 	},
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX512F, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init poly1305_simd_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (boot_cpu_has(X86_FEATURE_AVX) &&
 	    cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL))
 		static_branch_enable(&poly1305_use_avx);
diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c
index 347e97f4b713..24741d33edaf 100644
--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "serpent-avx.h"
 #include "ecb_cbc_helpers.h"
@@ -94,12 +95,21 @@ static struct skcipher_alg serpent_algs[] = {
 	},
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];
 
 static int __init serpent_avx2_init(void)
 {
 	const char *feature_name;
 
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
 		pr_info("AVX2 instructions are not detected.\n");
 		return -ENODEV;
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
index 6c248e1ea4ef..0db18d99da50 100644
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "serpent-avx.h"
 #include "ecb_cbc_helpers.h"
@@ -100,12 +101,21 @@ static struct skcipher_alg serpent_algs[] = {
 	},
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];
 
 static int __init serpent_init(void)
 {
 	const char *feature_name;
 
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
 			       &feature_name)) {
 		pr_info("CPU feature '%s' is not supported.\n", feature_name);
diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c
index d78f37e9b2cf..5288441cc223 100644
--- a/arch/x86/crypto/serpent_sse2_glue.c
+++ b/arch/x86/crypto/serpent_sse2_glue.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "serpent-sse2.h"
 #include "ecb_cbc_helpers.h"
@@ -103,10 +104,19 @@ static struct skcipher_alg serpent_algs[] = {
 	},
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_XMM2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *serpent_simd_algs[ARRAY_SIZE(serpent_algs)];
 
 static int __init serpent_sse2_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_XMM2)) {
 		printk(KERN_INFO "SSE2 instructions are not detected.\n");
 		return -ENODEV;
diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c
index 84bc718f49a3..2e9fe76056b8 100644
--- a/arch/x86/crypto/sm4_aesni_avx2_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include
 #include
@@ -126,6 +127,14 @@ static struct skcipher_alg sm4_aesni_avx2_skciphers[] = {
 	}
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AES, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *
 simd_sm4_aesni_avx2_skciphers[ARRAY_SIZE(sm4_aesni_avx2_skciphers)];
 
@@ -133,6 +142,9 @@ static int __init sm4_init(void)
 {
 	const char *feature_name;
 
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX) ||
 	    !boot_cpu_has(X86_FEATURE_AVX2) ||
 	    !boot_cpu_has(X86_FEATURE_AES) ||
diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c
index 7800f77d68ad..f730822f203a 100644
--- a/arch/x86/crypto/sm4_aesni_avx_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx_glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include
 #include
@@ -445,6 +446,13 @@ static struct skcipher_alg sm4_aesni_avx_skciphers[] = {
 	}
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AES, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *
 simd_sm4_aesni_avx_skciphers[ARRAY_SIZE(sm4_aesni_avx_skciphers)];
 
@@ -452,6 +460,9 @@ static int __init sm4_init(void)
 {
 	const char *feature_name;
 
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX) ||
 	    !boot_cpu_has(X86_FEATURE_AES) ||
 	    !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index 3eb3440b477a..4657e6efc35d 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "twofish.h"
 #include "ecb_cbc_helpers.h"
@@ -103,12 +104,21 @@ static struct skcipher_alg twofish_algs[] = {
 	},
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static struct simd_skcipher_alg *twofish_simd_algs[ARRAY_SIZE(twofish_algs)];
 
 static int __init twofish_init(void)
 {
 	const char *feature_name;
 
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, &feature_name)) {
 		pr_info("CPU feature '%s' is not supported.\n", feature_name);
 		return -ENODEV;
diff --git a/arch/x86/crypto/twofish_glue.c b/arch/x86/crypto/twofish_glue.c
index f9c4adc27404..ade98aef3402 100644
--- a/arch/x86/crypto/twofish_glue.c
+++ b/arch/x86/crypto/twofish_glue.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 asmlinkage void twofish_enc_blk(struct twofish_ctx *ctx, u8 *dst,
 				const u8 *src);
@@ -81,8 +82,17 @@ static struct crypto_alg alg = {
 	}
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init twofish_glue_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	return crypto_register_alg(&alg);
 }
diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c
index 90454cf18e0d..790e5a59a9a7 100644
--- a/arch/x86/crypto/twofish_glue_3way.c
+++ b/arch/x86/crypto/twofish_glue_3way.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 
 #include "twofish.h"
 #include "ecb_cbc_helpers.h"
@@ -140,8 +141,17 @@ static int force;
 module_param(force, int, 0);
 MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist");
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_ANY, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init twofish_3way_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!force && is_blacklisted_cpu()) {
 		printk(KERN_INFO
 		       "twofish-x86_64-3way: performance on this CPU "

From patchwork Wed Oct 12 21:59:27 2022
X-Patchwork-Submitter: "Elliott, Robert \(Servers\)"
X-Patchwork-Id: 614898
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v2 15/19] crypto: x86 - add pr_fmt to all modules
Date: Wed, 12 Oct 2022 16:59:27 -0500
Message-Id: <20221012215931.3896-16-elliott@hpe.com>
In-Reply-To: <20221012215931.3896-1-elliott@hpe.com>
References: <20221006223151.22159-1-elliott@hpe.com> <20221012215931.3896-1-elliott@hpe.com>
List-ID: <linux-crypto.vger.kernel.org>

Add pr_fmt to all the modules so that prints are prefixed with
KBUILD_MODNAME, the build-system module name (which is usually
similar to, but not an exact match for, the runtime module name).
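As an illustration (a minimal sketch, not part of this patch; the module
name "foo" and its init function are hypothetical): for a module built
from foo.o, KBUILD_MODNAME expands to "foo", and defining pr_fmt before
the first include that pulls in printk.h makes every pr_*() call in the
file carry that prefix automatically:

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/kernel.h>
#include <linux/module.h>

static int __init foo_init(void)
{
	pr_info("loaded\n");	/* logged as "foo: loaded" */
	return 0;
}
module_init(foo_init);
MODULE_LICENSE("GPL");

The define must appear before the includes because printk.h supplies a
default pr_fmt only when one is not already defined.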
Signed-off-by: Robert Elliott
---
 arch/x86/crypto/aegis128-aesni-glue.c      | 2 ++
 arch/x86/crypto/aesni-intel_glue.c         | 2 ++
 arch/x86/crypto/aria_aesni_avx_glue.c      | 2 ++
 arch/x86/crypto/blake2s-glue.c             | 2 ++
 arch/x86/crypto/blowfish_glue.c            | 2 ++
 arch/x86/crypto/camellia_aesni_avx2_glue.c | 2 ++
 arch/x86/crypto/camellia_aesni_avx_glue.c  | 2 ++
 arch/x86/crypto/camellia_glue.c            | 3 +++
 arch/x86/crypto/cast5_avx_glue.c           | 2 ++
 arch/x86/crypto/cast6_avx_glue.c           | 2 ++
 arch/x86/crypto/chacha_glue.c              | 2 ++
 arch/x86/crypto/crc32-pclmul_glue.c        | 3 +++
 arch/x86/crypto/crc32c-intel_glue.c        | 3 +++
 arch/x86/crypto/crct10dif-pclmul_glue.c    | 2 ++
 arch/x86/crypto/curve25519-x86_64.c        | 2 ++
 arch/x86/crypto/des3_ede_glue.c            | 2 ++
 arch/x86/crypto/ghash-clmulni-intel_glue.c | 2 ++
 arch/x86/crypto/nhpoly1305-avx2-glue.c     | 2 ++
 arch/x86/crypto/nhpoly1305-sse2-glue.c     | 2 ++
 arch/x86/crypto/poly1305_glue.c            | 2 ++
 arch/x86/crypto/polyval-clmulni_glue.c     | 2 ++
 arch/x86/crypto/serpent_avx2_glue.c        | 2 ++
 arch/x86/crypto/serpent_avx_glue.c         | 2 ++
 arch/x86/crypto/serpent_sse2_glue.c        | 2 ++
 arch/x86/crypto/sha256_ssse3_glue.c        | 1 -
 arch/x86/crypto/sm4_aesni_avx2_glue.c      | 2 ++
 arch/x86/crypto/sm4_aesni_avx_glue.c       | 2 ++
 arch/x86/crypto/twofish_avx_glue.c         | 2 ++
 arch/x86/crypto/twofish_glue.c             | 2 ++
 arch/x86/crypto/twofish_glue_3way.c        | 2 ++
 30 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c
index 9e4ba031704d..122bfd04ee47 100644
--- a/arch/x86/crypto/aegis128-aesni-glue.c
+++ b/arch/x86/crypto/aegis128-aesni-glue.c
@@ -7,6 +7,8 @@
  * Copyright (C) 2017-2018 Red Hat, Inc. All rights reserved.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 4a530a558436..df93cb44b4eb 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -15,6 +15,8 @@
  * Copyright (c) 2010, Intel Corporation.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/aria_aesni_avx_glue.c b/arch/x86/crypto/aria_aesni_avx_glue.c
index c561ea4fefa5..589097728bd1 100644
--- a/arch/x86/crypto/aria_aesni_avx_glue.c
+++ b/arch/x86/crypto/aria_aesni_avx_glue.c
@@ -5,6 +5,8 @@
  * Copyright (c) 2022 Taehee Yoo
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
index 5153bb423dbe..ac7fb7a9922b 100644
--- a/arch/x86/crypto/blake2s-glue.c
+++ b/arch/x86/crypto/blake2s-glue.c
@@ -3,6 +3,8 @@
  * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c
index 4c0ead71b198..5cfcbb91c4ca 100644
--- a/arch/x86/crypto/blowfish_glue.c
+++ b/arch/x86/crypto/blowfish_glue.c
@@ -8,6 +8,8 @@
  * Copyright (c) 2006 Herbert Xu
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c
index 8e3ac5be7cf6..851f2a29963c 100644
--- a/arch/x86/crypto/camellia_aesni_avx2_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c
@@ -5,6 +5,8 @@
  * Copyright © 2013 Jussi Kivilinna
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c
index 54fcd86160ff..8846493c92fb 100644
--- a/arch/x86/crypto/camellia_aesni_avx_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx_glue.c
@@ -5,6 +5,8 @@
  * Copyright © 2012-2013 Jussi Kivilinna
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/camellia_glue.c b/arch/x86/crypto/camellia_glue.c
index e21d2d5b68f9..3c14a904af00 100644
--- a/arch/x86/crypto/camellia_glue.c
+++ b/arch/x86/crypto/camellia_glue.c
@@ -8,6 +8,9 @@
  * Copyright (C) 2006 NTT (Nippon Telegraph and Telephone Corporation)
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
 #include
 #include
 #include
diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c
index bdc3c763334c..fdeec0849ab5 100644
--- a/arch/x86/crypto/cast5_avx_glue.c
+++ b/arch/x86/crypto/cast5_avx_glue.c
@@ -6,6 +6,8 @@
  *
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c
index addca34b3511..9258082408eb 100644
--- a/arch/x86/crypto/cast6_avx_glue.c
+++ b/arch/x86/crypto/cast6_avx_glue.c
@@ -8,6 +8,8 @@
  * Copyright © 2013 Jussi Kivilinna
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c
index 7275cae3380d..8e5cadc808b4 100644
--- a/arch/x86/crypto/chacha_glue.c
+++ b/arch/x86/crypto/chacha_glue.c
@@ -6,6 +6,8 @@
  * Copyright (C) 2015 Martin Willi
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/crc32-pclmul_glue.c b/arch/x86/crypto/crc32-pclmul_glue.c
index d49a19dcee37..bc2b31b04e05 100644
--- a/arch/x86/crypto/crc32-pclmul_glue.c
+++ b/arch/x86/crypto/crc32-pclmul_glue.c
@@ -26,6 +26,9 @@
  *
  * Wrappers for kernel crypto shash api to pclmulqdq crc32 implementation.
  */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/crc32c-intel_glue.c b/arch/x86/crypto/crc32c-intel_glue.c
index 980c62929256..ebf530934a3e 100644
--- a/arch/x86/crypto/crc32c-intel_glue.c
+++ b/arch/x86/crypto/crc32c-intel_glue.c
@@ -11,6 +11,9 @@
  * Authors: Austin Zhang
  *          Kent Liu
  */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index 3b8e9394c40d..03e35a1b7677 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -22,6 +22,8 @@
  *
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/curve25519-x86_64.c b/arch/x86/crypto/curve25519-x86_64.c
index 7fe395dfa79d..f9a1adb0c183 100644
--- a/arch/x86/crypto/curve25519-x86_64.c
+++ b/arch/x86/crypto/curve25519-x86_64.c
@@ -4,6 +4,8 @@
  * Copyright (c) 2016-2020 INRIA, CMU and Microsoft Corporation
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c
index 168cac5c6ca6..83e686a6c2f3 100644
--- a/arch/x86/crypto/des3_ede_glue.c
+++ b/arch/x86/crypto/des3_ede_glue.c
@@ -8,6 +8,8 @@
  * Copyright (c) 2006 Herbert Xu
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/ghash-clmulni-intel_glue.c b/arch/x86/crypto/ghash-clmulni-intel_glue.c
index 69945e41bc41..3ad55144da48 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_glue.c
+++ b/arch/x86/crypto/ghash-clmulni-intel_glue.c
@@ -7,6 +7,8 @@
  * Author: Huang Ying
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index a8046334ddca..40f49107e5a9 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -6,6 +6,8 @@
  * Copyright 2018 Google LLC
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index cdbe5df00927..bb40fed92c92 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -6,6 +6,8 @@
  * Copyright 2018 Google LLC
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
index 3e6ff505cd26..a2a7cb39cdec 100644
--- a/arch/x86/crypto/poly1305_glue.c
+++ b/arch/x86/crypto/poly1305_glue.c
@@ -3,6 +3,8 @@
  * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/polyval-clmulni_glue.c b/arch/x86/crypto/polyval-clmulni_glue.c
index 2502964afef6..5a345db20ca9 100644
--- a/arch/x86/crypto/polyval-clmulni_glue.c
+++ b/arch/x86/crypto/polyval-clmulni_glue.c
@@ -16,6 +16,8 @@
  * operations.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c
index 24741d33edaf..5944bf5ead2e 100644
--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -5,6 +5,8 @@
  * Copyright © 2012-2013 Jussi Kivilinna
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
index 0db18d99da50..45713c7a4cb9 100644
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -8,6 +8,8 @@
  * Copyright © 2011-2013 Jussi Kivilinna
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c
index 5288441cc223..d8aa0d3fbf15 100644
--- a/arch/x86/crypto/serpent_sse2_glue.c
+++ b/arch/x86/crypto/serpent_sse2_glue.c
@@ -12,6 +12,8 @@
  * Copyright (c) 2006 Herbert Xu
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c
index 42e8cb1a6708..8a0fb308fbba 100644
--- a/arch/x86/crypto/sha256_ssse3_glue.c
+++ b/arch/x86/crypto/sha256_ssse3_glue.c
@@ -26,7 +26,6 @@
  * SOFTWARE.
  */
 
-
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include
diff --git a/arch/x86/crypto/sm4_aesni_avx2_glue.c b/arch/x86/crypto/sm4_aesni_avx2_glue.c
index 2e9fe76056b8..3fe9e170b880 100644
--- a/arch/x86/crypto/sm4_aesni_avx2_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx2_glue.c
@@ -8,6 +8,8 @@
  * Copyright (c) 2021 Tianjia Zhang
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/sm4_aesni_avx_glue.c b/arch/x86/crypto/sm4_aesni_avx_glue.c
index f730822f203a..14ae012948ae 100644
--- a/arch/x86/crypto/sm4_aesni_avx_glue.c
+++ b/arch/x86/crypto/sm4_aesni_avx_glue.c
@@ -8,6 +8,8 @@
  * Copyright (c) 2021 Tianjia Zhang
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index 4657e6efc35d..044e4f92e2c0 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -8,6 +8,8 @@
  * Copyright © 2013 Jussi Kivilinna
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/twofish_glue.c b/arch/x86/crypto/twofish_glue.c
index ade98aef3402..031ed290c755 100644
--- a/arch/x86/crypto/twofish_glue.c
+++ b/arch/x86/crypto/twofish_glue.c
@@ -38,6 +38,8 @@
  * Third Edition.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include
diff --git a/arch/x86/crypto/twofish_glue_3way.c b/arch/x86/crypto/twofish_glue_3way.c
index 790e5a59a9a7..7e2a18e3abe7 100644
--- a/arch/x86/crypto/twofish_glue_3way.c
+++ b/arch/x86/crypto/twofish_glue_3way.c
@@ -5,6 +5,8 @@
  * Copyright (c) 2011 Jussi Kivilinna
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include
 #include
 #include

From patchwork Thu Nov 3 04:27:40 2022
X-Patchwork-Submitter: "Elliott, Robert \(Servers\)"
X-Patchwork-Id: 621140
From: Robert Elliott
To: herbert@gondor.apana.org.au, davem@davemloft.net, tim.c.chen@linux.intel.com, ap420073@gmail.com, ardb@kernel.org, Jason@zx2c4.com, David.Laight@ACULAB.COM, ebiggers@kernel.org, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Robert Elliott
Subject: [PATCH v3 17/17] crypto: x86/nhpoly1305, poly1305 - load based on CPU features
Date: Wed, 2 Nov 2022 23:27:40 -0500
Message-Id: <20221103042740.6556-18-elliott@hpe.com>
In-Reply-To: <20221103042740.6556-1-elliott@hpe.com>
References: <20221012215931.3896-1-elliott@hpe.com> <20221103042740.6556-1-elliott@hpe.com>
List-ID: <linux-crypto.vger.kernel.org>

Like commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU
features"), add module aliases for the x86-optimized crypto modules
nhpoly1305 and poly1305, based on CPU feature bits, so udev gets a
chance to load them later in the boot process when the filesystems
are all running.

Signed-off-by: Robert Elliott
---
 arch/x86/crypto/nhpoly1305-avx2-glue.c | 10 ++++++++++
 arch/x86/crypto/nhpoly1305-sse2-glue.c | 10 ++++++++++
 arch/x86/crypto/poly1305_glue.c        | 12 ++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/arch/x86/crypto/nhpoly1305-avx2-glue.c b/arch/x86/crypto/nhpoly1305-avx2-glue.c
index f7dc9c563bb5..15f98b53bfda 100644
--- a/arch/x86/crypto/nhpoly1305-avx2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-avx2-glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 
 /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -60,8 +61,17 @@ static struct shash_alg nhpoly1305_alg = {
 	.descsize = sizeof(struct nhpoly1305_state),
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init nhpoly1305_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_AVX2) ||
 	    !boot_cpu_has(X86_FEATURE_OSXSAVE))
 		return -ENODEV;
diff --git a/arch/x86/crypto/nhpoly1305-sse2-glue.c b/arch/x86/crypto/nhpoly1305-sse2-glue.c
index daffcc7019ad..533db3e0e06f 100644
--- a/arch/x86/crypto/nhpoly1305-sse2-glue.c
+++ b/arch/x86/crypto/nhpoly1305-sse2-glue.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 
 /* avoid kernel_fpu_begin/end scheduler/rcu stalls */
@@ -60,8 +61,17 @@ static struct shash_alg nhpoly1305_alg = {
 	.descsize = sizeof(struct nhpoly1305_state),
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_XMM2, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init nhpoly1305_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (!boot_cpu_has(X86_FEATURE_XMM2))
 		return -ENODEV;
diff --git a/arch/x86/crypto/poly1305_glue.c b/arch/x86/crypto/poly1305_glue.c
index 16831c036d71..2ff4358e4b3f 100644
--- a/arch/x86/crypto/poly1305_glue.c
+++ b/arch/x86/crypto/poly1305_glue.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include <asm/cpu_device_id.h>
 #include
 #include
 
@@ -268,8 +269,19 @@ static struct shash_alg alg = {
 	},
 };
 
+static const struct x86_cpu_id module_cpu_ids[] = {
+	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL),
+	X86_MATCH_FEATURE(X86_FEATURE_AVX512F, NULL),
+	{}
+};
+MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids);
+
 static int __init poly1305_simd_mod_init(void)
 {
+	if (!x86_match_cpu(module_cpu_ids))
+		return -ENODEV;
+
 	if (boot_cpu_has(X86_FEATURE_AVX) &&
 	    cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL))
 		static_branch_enable(&poly1305_use_avx);
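For reference, a condensed sketch of the pattern these patches apply
(illustrative only; the "example" names are placeholders, not code from
any patch above). MODULE_DEVICE_TABLE(x86cpu, ...) emits a module alias
per feature entry so udev can match the module against the CPU's
modalias in sysfs and autoload it, while x86_match_cpu() re-checks the
table for loads that bypass udev (built-in or insmod). x86_match_cpu()
succeeds if any single entry matches, which is why init functions above
still perform the stricter combined checks (e.g. AVX and AVX2 and
OSXSAVE) before registering algorithms:

#include <linux/module.h>
#include <asm/cpu_device_id.h>

static const struct x86_cpu_id example_cpu_ids[] = {
	X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL),
	{}
};
/* emits alias strings that udev/modprobe match against the CPU */
MODULE_DEVICE_TABLE(x86cpu, example_cpu_ids);

static int __init example_mod_init(void)
{
	/* returns the first matching table entry, or NULL */
	if (!x86_match_cpu(example_cpu_ids))
		return -ENODEV;

	/* register crypto algorithms here */
	return 0;
}
module_init(example_mod_init);

static void __exit example_mod_exit(void)
{
	/* unregister crypto algorithms here */
}
module_exit(example_mod_exit);

MODULE_LICENSE("GPL");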