From patchwork Wed Dec 23 08:09:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 351513 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E192EC433E6 for ; Wed, 23 Dec 2020 08:13:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 98D662070B for ; Wed, 23 Dec 2020 08:13:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727605AbgLWINf (ORCPT ); Wed, 23 Dec 2020 03:13:35 -0500 Received: from mail.kernel.org ([198.145.29.99]:46628 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727712AbgLWINe (ORCPT ); Wed, 23 Dec 2020 03:13:34 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 96C5922482; Wed, 23 Dec 2020 08:12:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1608711173; bh=/JOeDVsRKecFgqhYuedJ9WzTU2siPkilHRHczO28l8M=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Wl+Ffinee7BFY+nx97KgW6rOgIk0E4ofkHa8ND6e9IkVmfxdQeHGdZC7eCYC43wjZ hIMltVJ7HwHBms18VmDXcKdN2IgzHSha0X20Eq9ywO9UPIsEV4fQSn4A8F4IE+Ofog fex3tPySB78CEZYDRsGlAQlhSIsYk2r4WqsrZAraEdV3vQ1C0W6WKFH8Qw9GkEb35s GKvTuyARJzf0hufuL03J4tohT78ekFCq2c6QF6qursb7vYCiXUv7qI5uv29lc1LiQp jFpdNRpGbT3NdYqxwAP8PSfbmclIo1BR6FHipG3Yag9RH1LOr6wSJmhnNghAV8EnZP iIQOMTDTlvsZg== From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Herbert Xu , David Sterba , "Jason A . Donenfeld" , Paul Crowley Subject: [PATCH v3 01/14] crypto: blake2s - define shash_alg structs using macros Date: Wed, 23 Dec 2020 00:09:50 -0800 Message-Id: <20201223081003.373663-2-ebiggers@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201223081003.373663-1-ebiggers@kernel.org> References: <20201223081003.373663-1-ebiggers@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org From: Eric Biggers The shash_alg structs for the four variants of BLAKE2s are identical except for the algorithm name, driver name, and digest size. So, avoid code duplication by using a macro to define these structs. Acked-by: Ard Biesheuvel Signed-off-by: Eric Biggers --- crypto/blake2s_generic.c | 88 ++++++++++++---------------------------- 1 file changed, 27 insertions(+), 61 deletions(-) diff --git a/crypto/blake2s_generic.c b/crypto/blake2s_generic.c index 005783ff45ad0..e3aa6e7ff3d83 100644 --- a/crypto/blake2s_generic.c +++ b/crypto/blake2s_generic.c @@ -83,67 +83,33 @@ static int crypto_blake2s_final(struct shash_desc *desc, u8 *out) return 0; } -static struct shash_alg blake2s_algs[] = {{ - .base.cra_name = "blake2s-128", - .base.cra_driver_name = "blake2s-128-generic", - .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, - .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), - .base.cra_priority = 200, - .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, - .base.cra_module = THIS_MODULE, - - .digestsize = BLAKE2S_128_HASH_SIZE, - .setkey = crypto_blake2s_setkey, - .init = crypto_blake2s_init, - .update = crypto_blake2s_update, - .final = crypto_blake2s_final, - .descsize = sizeof(struct blake2s_state), -}, { - .base.cra_name = "blake2s-160", - .base.cra_driver_name = "blake2s-160-generic", - .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, - .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), - .base.cra_priority = 200, - .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, - .base.cra_module = THIS_MODULE, - - .digestsize = BLAKE2S_160_HASH_SIZE, - .setkey = crypto_blake2s_setkey, - .init = crypto_blake2s_init, - .update = crypto_blake2s_update, - .final = crypto_blake2s_final, - .descsize = sizeof(struct blake2s_state), -}, { - .base.cra_name = "blake2s-224", - .base.cra_driver_name = "blake2s-224-generic", - .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, - .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), - .base.cra_priority = 200, - .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, - .base.cra_module = THIS_MODULE, - - .digestsize = BLAKE2S_224_HASH_SIZE, - .setkey = crypto_blake2s_setkey, - .init = crypto_blake2s_init, - .update = crypto_blake2s_update, - .final = crypto_blake2s_final, - .descsize = sizeof(struct blake2s_state), -}, { - .base.cra_name = "blake2s-256", - .base.cra_driver_name = "blake2s-256-generic", - .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, - .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), - .base.cra_priority = 200, - .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, - .base.cra_module = THIS_MODULE, - - .digestsize = BLAKE2S_256_HASH_SIZE, - .setkey = crypto_blake2s_setkey, - .init = crypto_blake2s_init, - .update = crypto_blake2s_update, - .final = crypto_blake2s_final, - .descsize = sizeof(struct blake2s_state), -}}; +#define BLAKE2S_ALG(name, driver_name, digest_size) \ + { \ + .base.cra_name = name, \ + .base.cra_driver_name = driver_name, \ + .base.cra_priority = 100, \ + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \ + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \ + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \ + .base.cra_module = THIS_MODULE, \ + .digestsize = digest_size, \ + .setkey = crypto_blake2s_setkey, \ + .init = crypto_blake2s_init, \ + .update = crypto_blake2s_update, \ + .final = crypto_blake2s_final, \ + .descsize = sizeof(struct blake2s_state), \ + } + +static struct shash_alg blake2s_algs[] = { + BLAKE2S_ALG("blake2s-128", "blake2s-128-generic", + BLAKE2S_128_HASH_SIZE), + BLAKE2S_ALG("blake2s-160", "blake2s-160-generic", + BLAKE2S_160_HASH_SIZE), + BLAKE2S_ALG("blake2s-224", "blake2s-224-generic", + BLAKE2S_224_HASH_SIZE), + BLAKE2S_ALG("blake2s-256", "blake2s-256-generic", + BLAKE2S_256_HASH_SIZE), +}; static int __init blake2s_mod_init(void) { From patchwork Wed Dec 23 08:09:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 351514 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2810CC433E6 for ; Wed, 23 Dec 2020 08:13:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EAE43224B0 for ; Wed, 23 Dec 2020 08:13:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727712AbgLWINg (ORCPT ); Wed, 23 Dec 2020 03:13:36 -0500 Received: from mail.kernel.org ([198.145.29.99]:46686 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727719AbgLWINg (ORCPT ); Wed, 23 Dec 2020 03:13:36 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id ECCD62076D; Wed, 23 Dec 2020 08:12:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1608711175; bh=b6+wc7dgRM5d8Iyuv0PHYC4QdSJRATl+ajSSE/yxgHA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HRa2zcMw6Rm3uDgBsMUwzZH+xzMA7oJbWzMAHDqbnjsjtDxbsJphlhG7V6I0+Z0UY tpN2je5lhjEp+cvK49vFX+iEEna4bEBxYO19XVUOFQ1mjF3pejOqfRK11nLxcb/WGx J6mR468qOnvJK9VCgorYeKWYsEw3AIiEQc1Wfl9EdHMg7W37kvvbqz2aeGdPdP3Ulf UMU9NaDuh5fUIqqM1/Kxeuf3q+Ew8H2mZwSbx6Lbt2EeJ4JgbuBKcyIwCUvwaoV4p0 dyGoLp8q6APyFZ4+pCn1opbF4JFTNeRUm6G6BQ36JgdK1ONwcwNcDKnjzpezJN901E 4w5WZrKuAQleA== From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Herbert Xu , David Sterba , "Jason A . Donenfeld" , Paul Crowley Subject: [PATCH v3 04/14] crypto: blake2s - move update and final logic to internal/blake2s.h Date: Wed, 23 Dec 2020 00:09:53 -0800 Message-Id: <20201223081003.373663-5-ebiggers@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201223081003.373663-1-ebiggers@kernel.org> References: <20201223081003.373663-1-ebiggers@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org From: Eric Biggers Move most of blake2s_update() and blake2s_final() into new inline functions __blake2s_update() and __blake2s_final() in include/crypto/internal/blake2s.h so that this logic can be shared by the shash helper functions. This will avoid duplicating this logic between the library and shash implementations. Signed-off-by: Eric Biggers Acked-by: Ard Biesheuvel --- include/crypto/internal/blake2s.h | 41 ++++++++++++++++++++++++++ lib/crypto/blake2s.c | 48 ++++++------------------------- 2 files changed, 49 insertions(+), 40 deletions(-) diff --git a/include/crypto/internal/blake2s.h b/include/crypto/internal/blake2s.h index 6e376ae6b6b58..42deba4b8ceef 100644 --- a/include/crypto/internal/blake2s.h +++ b/include/crypto/internal/blake2s.h @@ -4,6 +4,7 @@ #define BLAKE2S_INTERNAL_H #include +#include struct blake2s_tfm_ctx { u8 key[BLAKE2S_KEY_SIZE]; @@ -23,4 +24,44 @@ static inline void blake2s_set_lastblock(struct blake2s_state *state) state->f[0] = -1; } +typedef void (*blake2s_compress_t)(struct blake2s_state *state, + const u8 *block, size_t nblocks, u32 inc); + +static inline void __blake2s_update(struct blake2s_state *state, + const u8 *in, size_t inlen, + blake2s_compress_t compress) +{ + const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; + + if (unlikely(!inlen)) + return; + if (inlen > fill) { + memcpy(state->buf + state->buflen, in, fill); + (*compress)(state, state->buf, 1, BLAKE2S_BLOCK_SIZE); + state->buflen = 0; + in += fill; + inlen -= fill; + } + if (inlen > BLAKE2S_BLOCK_SIZE) { + const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); + /* Hash one less (full) block than strictly possible */ + (*compress)(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE); + in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); + inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); + } + memcpy(state->buf + state->buflen, in, inlen); + state->buflen += inlen; +} + +static inline void __blake2s_final(struct blake2s_state *state, u8 *out, + blake2s_compress_t compress) +{ + blake2s_set_lastblock(state); + memset(state->buf + state->buflen, 0, + BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ + (*compress)(state, state->buf, 1, state->buflen); + cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); + memcpy(out, state->h, state->outlen); +} + #endif /* BLAKE2S_INTERNAL_H */ diff --git a/lib/crypto/blake2s.c b/lib/crypto/blake2s.c index 6a4b6b78d630f..c64ac8bfb6a97 100644 --- a/lib/crypto/blake2s.c +++ b/lib/crypto/blake2s.c @@ -15,55 +15,23 @@ #include #include #include -#include + +#if IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S) +# define blake2s_compress blake2s_compress_arch +#else +# define blake2s_compress blake2s_compress_generic +#endif void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen) { - const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; - - if (unlikely(!inlen)) - return; - if (inlen > fill) { - memcpy(state->buf + state->buflen, in, fill); - if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S)) - blake2s_compress_arch(state, state->buf, 1, - BLAKE2S_BLOCK_SIZE); - else - blake2s_compress_generic(state, state->buf, 1, - BLAKE2S_BLOCK_SIZE); - state->buflen = 0; - in += fill; - inlen -= fill; - } - if (inlen > BLAKE2S_BLOCK_SIZE) { - const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); - /* Hash one less (full) block than strictly possible */ - if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S)) - blake2s_compress_arch(state, in, nblocks - 1, - BLAKE2S_BLOCK_SIZE); - else - blake2s_compress_generic(state, in, nblocks - 1, - BLAKE2S_BLOCK_SIZE); - in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); - inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); - } - memcpy(state->buf + state->buflen, in, inlen); - state->buflen += inlen; + __blake2s_update(state, in, inlen, blake2s_compress); } EXPORT_SYMBOL(blake2s_update); void blake2s_final(struct blake2s_state *state, u8 *out) { WARN_ON(IS_ENABLED(DEBUG) && !out); - blake2s_set_lastblock(state); - memset(state->buf + state->buflen, 0, - BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ - if (IS_ENABLED(CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S)) - blake2s_compress_arch(state, state->buf, 1, state->buflen); - else - blake2s_compress_generic(state, state->buf, 1, state->buflen); - cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); - memcpy(out, state->h, state->outlen); + __blake2s_final(state, out, blake2s_compress); memzero_explicit(state, sizeof(*state)); } EXPORT_SYMBOL(blake2s_final); From patchwork Wed Dec 23 08:09:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 351509 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6298EC4332E for ; Wed, 23 Dec 2020 08:14:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4982A2070B for ; Wed, 23 Dec 2020 08:14:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727889AbgLWIOR (ORCPT ); Wed, 23 Dec 2020 03:14:17 -0500 Received: from mail.kernel.org ([198.145.29.99]:46930 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727168AbgLWIOP (ORCPT ); Wed, 23 Dec 2020 03:14:15 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 6A3E0207D2; Wed, 23 Dec 2020 08:12:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1608711175; bh=WKu9KmHZC5G9NukKoENOJnSDAufpaL5SSKOah6FCBwc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=f8GePN2uB3p2OU8kKCvb9acrjfwUDhc9Px6iQRG267EudQrUobwiSmqxl2rs+jR7h zuY7RVcLV9U1EcI4JJoy0rnOGBw1xXvwXfsIVFRzoaJkOtdH6hg/oqJ3pRk/9mJkcL asdSkhODZiuVcxKWfcsLM8v2atxSP52nisrrkZMU37do8INWsMI7cZEWOVX5iiGX19 s9QAiPeULcmxAev6wCMhEwCDzF2eC5qG6yKh0l0mkY6tMiGAYfasQ8CAXdugJ3So/w 5yFxbt1XRB/EUJUfJDkSS6l0ZoDFgMPvSwldxe3UcH8d6zpcQT5nhgSKIThYajnH11 EH6YrKrom79BA== From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Herbert Xu , David Sterba , "Jason A . Donenfeld" , Paul Crowley Subject: [PATCH v3 05/14] crypto: blake2s - share the "shash" API boilerplate code Date: Wed, 23 Dec 2020 00:09:54 -0800 Message-Id: <20201223081003.373663-6-ebiggers@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201223081003.373663-1-ebiggers@kernel.org> References: <20201223081003.373663-1-ebiggers@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org From: Eric Biggers Add helper functions for shash implementations of BLAKE2s to include/crypto/internal/blake2s.h, taking advantage of __blake2s_update() and __blake2s_final() that were added by the previous patch to share more code between the library and shash implementations. crypto_blake2s_setkey() and crypto_blake2s_init() are usable as shash_alg::setkey and shash_alg::init directly, while crypto_blake2s_update() and crypto_blake2s_final() take an extra 'blake2s_compress_t' function pointer parameter. This allows the implementation of the compression function to be overridden, which is the only part that optimized implementations really care about. The new functions are inline functions (similar to those in sha1_base.h, sha256_base.h, and sm3_base.h) because this avoids needing to add a new module blake2s_helpers.ko, they aren't *too* long, and this avoids indirect calls which are expensive these days. Note that they can't go in blake2s_generic.ko, as that would require selecting CRYPTO_BLAKE2S from CRYPTO_BLAKE2S_X86, which would cause a recursive dependency. Finally, use these new helper functions in the x86 implementation of BLAKE2s. (This part should be a separate patch, but unfortunately the x86 implementation used the exact same function names like "crypto_blake2s_update()", so it had to be updated at the same time.) Signed-off-by: Eric Biggers --- arch/x86/crypto/blake2s-glue.c | 74 +++--------------------------- crypto/blake2s_generic.c | 76 ++++--------------------------- include/crypto/internal/blake2s.h | 65 ++++++++++++++++++++++++-- 3 files changed, 76 insertions(+), 139 deletions(-) diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c index 4dcb2ee89efc9..a40365ab301ee 100644 --- a/arch/x86/crypto/blake2s-glue.c +++ b/arch/x86/crypto/blake2s-glue.c @@ -58,75 +58,15 @@ void blake2s_compress_arch(struct blake2s_state *state, } EXPORT_SYMBOL(blake2s_compress_arch); -static int crypto_blake2s_setkey(struct crypto_shash *tfm, const u8 *key, - unsigned int keylen) +static int crypto_blake2s_update_x86(struct shash_desc *desc, + const u8 *in, unsigned int inlen) { - struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm); - - if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE) - return -EINVAL; - - memcpy(tctx->key, key, keylen); - tctx->keylen = keylen; - - return 0; -} - -static int crypto_blake2s_init(struct shash_desc *desc) -{ - struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); - struct blake2s_state *state = shash_desc_ctx(desc); - const int outlen = crypto_shash_digestsize(desc->tfm); - - if (tctx->keylen) - blake2s_init_key(state, outlen, tctx->key, tctx->keylen); - else - blake2s_init(state, outlen); - - return 0; -} - -static int crypto_blake2s_update(struct shash_desc *desc, const u8 *in, - unsigned int inlen) -{ - struct blake2s_state *state = shash_desc_ctx(desc); - const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; - - if (unlikely(!inlen)) - return 0; - if (inlen > fill) { - memcpy(state->buf + state->buflen, in, fill); - blake2s_compress_arch(state, state->buf, 1, BLAKE2S_BLOCK_SIZE); - state->buflen = 0; - in += fill; - inlen -= fill; - } - if (inlen > BLAKE2S_BLOCK_SIZE) { - const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); - /* Hash one less (full) block than strictly possible */ - blake2s_compress_arch(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE); - in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); - inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); - } - memcpy(state->buf + state->buflen, in, inlen); - state->buflen += inlen; - - return 0; + return crypto_blake2s_update(desc, in, inlen, blake2s_compress_arch); } -static int crypto_blake2s_final(struct shash_desc *desc, u8 *out) +static int crypto_blake2s_final_x86(struct shash_desc *desc, u8 *out) { - struct blake2s_state *state = shash_desc_ctx(desc); - - blake2s_set_lastblock(state); - memset(state->buf + state->buflen, 0, - BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ - blake2s_compress_arch(state, state->buf, 1, state->buflen); - cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); - memcpy(out, state->h, state->outlen); - memzero_explicit(state, sizeof(*state)); - - return 0; + return crypto_blake2s_final(desc, out, blake2s_compress_arch); } #define BLAKE2S_ALG(name, driver_name, digest_size) \ @@ -141,8 +81,8 @@ static int crypto_blake2s_final(struct shash_desc *desc, u8 *out) .digestsize = digest_size, \ .setkey = crypto_blake2s_setkey, \ .init = crypto_blake2s_init, \ - .update = crypto_blake2s_update, \ - .final = crypto_blake2s_final, \ + .update = crypto_blake2s_update_x86, \ + .final = crypto_blake2s_final_x86, \ .descsize = sizeof(struct blake2s_state), \ } diff --git a/crypto/blake2s_generic.c b/crypto/blake2s_generic.c index b89536c3671cf..72fe480f9bd67 100644 --- a/crypto/blake2s_generic.c +++ b/crypto/blake2s_generic.c @@ -1,5 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR MIT /* + * shash interface to the generic implementation of BLAKE2s + * * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. */ @@ -10,75 +12,15 @@ #include #include -static int crypto_blake2s_setkey(struct crypto_shash *tfm, const u8 *key, - unsigned int keylen) +static int crypto_blake2s_update_generic(struct shash_desc *desc, + const u8 *in, unsigned int inlen) { - struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm); - - if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE) - return -EINVAL; - - memcpy(tctx->key, key, keylen); - tctx->keylen = keylen; - - return 0; + return crypto_blake2s_update(desc, in, inlen, blake2s_compress_generic); } -static int crypto_blake2s_init(struct shash_desc *desc) +static int crypto_blake2s_final_generic(struct shash_desc *desc, u8 *out) { - struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); - struct blake2s_state *state = shash_desc_ctx(desc); - const int outlen = crypto_shash_digestsize(desc->tfm); - - if (tctx->keylen) - blake2s_init_key(state, outlen, tctx->key, tctx->keylen); - else - blake2s_init(state, outlen); - - return 0; -} - -static int crypto_blake2s_update(struct shash_desc *desc, const u8 *in, - unsigned int inlen) -{ - struct blake2s_state *state = shash_desc_ctx(desc); - const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; - - if (unlikely(!inlen)) - return 0; - if (inlen > fill) { - memcpy(state->buf + state->buflen, in, fill); - blake2s_compress_generic(state, state->buf, 1, BLAKE2S_BLOCK_SIZE); - state->buflen = 0; - in += fill; - inlen -= fill; - } - if (inlen > BLAKE2S_BLOCK_SIZE) { - const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); - /* Hash one less (full) block than strictly possible */ - blake2s_compress_generic(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE); - in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); - inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); - } - memcpy(state->buf + state->buflen, in, inlen); - state->buflen += inlen; - - return 0; -} - -static int crypto_blake2s_final(struct shash_desc *desc, u8 *out) -{ - struct blake2s_state *state = shash_desc_ctx(desc); - - blake2s_set_lastblock(state); - memset(state->buf + state->buflen, 0, - BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ - blake2s_compress_generic(state, state->buf, 1, state->buflen); - cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); - memcpy(out, state->h, state->outlen); - memzero_explicit(state, sizeof(*state)); - - return 0; + return crypto_blake2s_final(desc, out, blake2s_compress_generic); } #define BLAKE2S_ALG(name, driver_name, digest_size) \ @@ -93,8 +35,8 @@ static int crypto_blake2s_final(struct shash_desc *desc, u8 *out) .digestsize = digest_size, \ .setkey = crypto_blake2s_setkey, \ .init = crypto_blake2s_init, \ - .update = crypto_blake2s_update, \ - .final = crypto_blake2s_final, \ + .update = crypto_blake2s_update_generic, \ + .final = crypto_blake2s_final_generic, \ .descsize = sizeof(struct blake2s_state), \ } diff --git a/include/crypto/internal/blake2s.h b/include/crypto/internal/blake2s.h index 42deba4b8ceef..2ea0a8f5e7f41 100644 --- a/include/crypto/internal/blake2s.h +++ b/include/crypto/internal/blake2s.h @@ -1,16 +1,16 @@ /* SPDX-License-Identifier: GPL-2.0 OR MIT */ +/* + * Helper functions for BLAKE2s implementations. + * Keep this in sync with the corresponding BLAKE2b header. + */ #ifndef BLAKE2S_INTERNAL_H #define BLAKE2S_INTERNAL_H #include +#include #include -struct blake2s_tfm_ctx { - u8 key[BLAKE2S_KEY_SIZE]; - unsigned int keylen; -}; - void blake2s_compress_generic(struct blake2s_state *state,const u8 *block, size_t nblocks, const u32 inc); @@ -27,6 +27,8 @@ static inline void blake2s_set_lastblock(struct blake2s_state *state) typedef void (*blake2s_compress_t)(struct blake2s_state *state, const u8 *block, size_t nblocks, u32 inc); +/* Helper functions for BLAKE2s shared by the library and shash APIs */ + static inline void __blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen, blake2s_compress_t compress) @@ -64,4 +66,57 @@ static inline void __blake2s_final(struct blake2s_state *state, u8 *out, memcpy(out, state->h, state->outlen); } +/* Helper functions for shash implementations of BLAKE2s */ + +struct blake2s_tfm_ctx { + u8 key[BLAKE2S_KEY_SIZE]; + unsigned int keylen; +}; + +static inline int crypto_blake2s_setkey(struct crypto_shash *tfm, + const u8 *key, unsigned int keylen) +{ + struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(tfm); + + if (keylen == 0 || keylen > BLAKE2S_KEY_SIZE) + return -EINVAL; + + memcpy(tctx->key, key, keylen); + tctx->keylen = keylen; + + return 0; +} + +static inline int crypto_blake2s_init(struct shash_desc *desc) +{ + const struct blake2s_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); + struct blake2s_state *state = shash_desc_ctx(desc); + unsigned int outlen = crypto_shash_digestsize(desc->tfm); + + if (tctx->keylen) + blake2s_init_key(state, outlen, tctx->key, tctx->keylen); + else + blake2s_init(state, outlen); + return 0; +} + +static inline int crypto_blake2s_update(struct shash_desc *desc, + const u8 *in, unsigned int inlen, + blake2s_compress_t compress) +{ + struct blake2s_state *state = shash_desc_ctx(desc); + + __blake2s_update(state, in, inlen, compress); + return 0; +} + +static inline int crypto_blake2s_final(struct shash_desc *desc, u8 *out, + blake2s_compress_t compress) +{ + struct blake2s_state *state = shash_desc_ctx(desc); + + __blake2s_final(state, out, compress); + return 0; +} + #endif /* BLAKE2S_INTERNAL_H */ From patchwork Wed Dec 23 08:09:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 351512 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BB5BC433DB for ; Wed, 23 Dec 2020 08:14:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5F2B3224B0 for ; Wed, 23 Dec 2020 08:14:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727813AbgLWIOP (ORCPT ); Wed, 23 Dec 2020 03:14:15 -0500 Received: from mail.kernel.org ([198.145.29.99]:46934 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727827AbgLWIOP (ORCPT ); Wed, 23 Dec 2020 03:14:15 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 346EF224D3; Wed, 23 Dec 2020 08:12:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1608711176; bh=Kg5dRTba2a6zJZ3NuTGgwpK24+5oriIRi29coh0tJW8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XYRMu+FBeoVMUcupJYwwZOqU7b3daxt5OZvbwW9/YVdasPjxbBiMVzwBWPD20425A F0qckfosbCHfAIJtvgLvuvKtjKx8+eBrYBRRC+vuLx7b0DkgjpWDt+XFUzsYNIJh6O sozVo5XLKJqpFQYkyNl5bBl157MTdk4GC3zN2FHXEesCYgC+L/xYoRq72ngY16rHec a/pXSSAsN/fDc36zkWWSSruVA4tu7SmOU5y6uAzUFME0Aw9RjsEB9rr8xZ4V8sf3Pl YQe93nZ9/BtOidJkcPJbBmQqHTDcM0lbSTzT6OpXBQTnH6yLkw85LXjXGORDsMQ/mV 3UUGpJz9yy+3Q== From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Herbert Xu , David Sterba , "Jason A . Donenfeld" , Paul Crowley Subject: [PATCH v3 07/14] crypto: blake2s - add comment for blake2s_state fields Date: Wed, 23 Dec 2020 00:09:56 -0800 Message-Id: <20201223081003.373663-8-ebiggers@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201223081003.373663-1-ebiggers@kernel.org> References: <20201223081003.373663-1-ebiggers@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org From: Eric Biggers The first three fields of 'struct blake2s_state' are used in assembly code, which isn't immediately obvious, so add a comment to this effect. Signed-off-by: Eric Biggers --- include/crypto/blake2s.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/crypto/blake2s.h b/include/crypto/blake2s.h index 734ed22b7a6aa..f1c8330a61a91 100644 --- a/include/crypto/blake2s.h +++ b/include/crypto/blake2s.h @@ -24,6 +24,7 @@ enum blake2s_lengths { }; struct blake2s_state { + /* 'h', 't', and 'f' are used in assembly code, so keep them as-is. */ u32 h[8]; u32 t[2]; u32 f[2]; From patchwork Wed Dec 23 08:09:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 351511 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0B70C433E9 for ; Wed, 23 Dec 2020 08:14:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 84518224B8 for ; Wed, 23 Dec 2020 08:14:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728000AbgLWIOP (ORCPT ); Wed, 23 Dec 2020 03:14:15 -0500 Received: from mail.kernel.org ([198.145.29.99]:46938 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727851AbgLWIOP (ORCPT ); Wed, 23 Dec 2020 03:14:15 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 21E37224DF; Wed, 23 Dec 2020 08:12:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1608711177; bh=hdS/nZMIHIRlLDU4jx0otHFHReXXhN43ibKqK9h7cxA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OgkNBHtTnftbt2Yj+NrWKD07vzQfqRiCOm5RYBPIAWygx4wk49tKLg9uxWVW8BapX QShd3x1h26NWOwlSVXGibc8YCZ0VlCSh/ACErt1bh3qJE2TM9Rzj0GDkTT2iPKPC92 vV4ecO7uvjODQkAhWBRUsdx5JyDOgWOocYmeUd6sIjDo85udnTQ5ZTge6Tpv04BA8K RoSanYrtI6Wecg48nt84HBLC+VFo6foFGLpq2geneq80fRwN7zzzsxDo6K/kr6MqCz AJO/xTaRkwxSrNmZOindPqu18cqbH4w2kq1Nzke6SNKv5ak9h19bupG+w+7w7hgGYx HQpo2hXw7N95w== From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Herbert Xu , David Sterba , "Jason A . Donenfeld" , Paul Crowley Subject: [PATCH v3 09/14] crypto: blake2s - include instead of Date: Wed, 23 Dec 2020 00:09:58 -0800 Message-Id: <20201223081003.373663-10-ebiggers@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201223081003.373663-1-ebiggers@kernel.org> References: <20201223081003.373663-1-ebiggers@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org From: Eric Biggers Address the following checkpatch warning: WARNING: Use #include instead of Signed-off-by: Eric Biggers --- include/crypto/blake2s.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/include/crypto/blake2s.h b/include/crypto/blake2s.h index 3f06183c2d804..bc3fb59442ce5 100644 --- a/include/crypto/blake2s.h +++ b/include/crypto/blake2s.h @@ -6,12 +6,11 @@ #ifndef _CRYPTO_BLAKE2S_H #define _CRYPTO_BLAKE2S_H +#include #include #include #include -#include - enum blake2s_lengths { BLAKE2S_BLOCK_SIZE = 64, BLAKE2S_HASH_SIZE = 32, From patchwork Wed Dec 23 08:09:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 351508 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 518C8C43331 for ; Wed, 23 Dec 2020 08:14:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2028A224B0 for ; Wed, 23 Dec 2020 08:14:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727827AbgLWIOR (ORCPT ); Wed, 23 Dec 2020 03:14:17 -0500 Received: from mail.kernel.org ([198.145.29.99]:46940 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727889AbgLWIOQ (ORCPT ); Wed, 23 Dec 2020 03:14:16 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 96E70224B8; Wed, 23 Dec 2020 08:12:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1608711178; bh=XhsksrwDIqT70f9H88kGLZnr2BQDqu2gAm2TeXt92xE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gHeU1THF9rOmFqv8ARIAMf0QgzcMxOv7qxE6pPY+bYiOFC/msmEPu3xxF7lWo4v7B kk1p3keeqx/5kkV2rDwTEqWK6/feGEAgqWJfHh7YDBBdGRMZA2zcJ2SVu6mj8nf0Sd 0TEzoRtLE/DjfevBVbxRH/zhKo+0Wv3r7LOMIMsr5qJHBHm62biXRQod9mlTp9zqoy BGFQ3/lGJiYh5nNzqUNq0CeeYANbAVUQDxYGZ+tAx9hWFd7MHO3kOBk8/u3L7MBBJ0 wv0p2ThhxCb3bhNQn7QVT7tkqQy2Q4+qVW6zXY2P7UMczMlbvZ+VwB6QSlWjFif/vZ yN05hRV6kEmVg== From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Herbert Xu , David Sterba , "Jason A . Donenfeld" , Paul Crowley Subject: [PATCH v3 10/14] crypto: arm/blake2s - add ARM scalar optimized BLAKE2s Date: Wed, 23 Dec 2020 00:09:59 -0800 Message-Id: <20201223081003.373663-11-ebiggers@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201223081003.373663-1-ebiggers@kernel.org> References: <20201223081003.373663-1-ebiggers@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org From: Eric Biggers Add an ARM scalar optimized implementation of BLAKE2s. NEON isn't very useful for BLAKE2s because the BLAKE2s block size is too small for NEON to help. Each NEON instruction would depend on the previous one, resulting in poor performance. With scalar instructions, on the other hand, we can take advantage of ARM's "free" rotations (like I did in chacha-scalar-core.S) to get an implementation get runs much faster than the C implementation. Performance results on Cortex-A7 in cycles per byte using the shash API: 4096-byte messages: blake2s-256-arm: 18.8 blake2s-256-generic: 26.0 500-byte messages: blake2s-256-arm: 20.3 blake2s-256-generic: 27.9 100-byte messages: blake2s-256-arm: 29.7 blake2s-256-generic: 39.2 32-byte messages: blake2s-256-arm: 50.6 blake2s-256-generic: 66.2 Except on very short messages, this is still slower than the NEON implementation of BLAKE2b which I've written; that is 14.0, 16.4, 25.8, and 76.1 cpb on 4096, 500, 100, and 32-byte messages, respectively. However, optimized BLAKE2s is useful for cases where BLAKE2s is used instead of BLAKE2b, such as WireGuard. This new implementation is added in the form of a new module blake2s-arm.ko, which is analogous to blake2s-x86_64.ko in that it provides blake2s_compress_arch() for use by the library API as well as optionally register the algorithms with the shash API. Acked-by: Ard Biesheuvel Signed-off-by: Eric Biggers Tested-by: Ard Biesheuvel --- arch/arm/crypto/Kconfig | 9 ++ arch/arm/crypto/Makefile | 2 + arch/arm/crypto/blake2s-core.S | 285 +++++++++++++++++++++++++++++++++ arch/arm/crypto/blake2s-glue.c | 78 +++++++++ 4 files changed, 374 insertions(+) create mode 100644 arch/arm/crypto/blake2s-core.S create mode 100644 arch/arm/crypto/blake2s-glue.c diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig index c9bf2df85cb90..281c829c12d0b 100644 --- a/arch/arm/crypto/Kconfig +++ b/arch/arm/crypto/Kconfig @@ -62,6 +62,15 @@ config CRYPTO_SHA512_ARM SHA-512 secure hash standard (DFIPS 180-2) implemented using optimized ARM assembler and NEON, when available. +config CRYPTO_BLAKE2S_ARM + tristate "BLAKE2s digest algorithm (ARM)" + select CRYPTO_ARCH_HAVE_LIB_BLAKE2S + help + BLAKE2s digest algorithm optimized with ARM scalar instructions. This + is faster than the generic implementations of BLAKE2s and BLAKE2b, but + slower than the NEON implementation of BLAKE2b. (There is no NEON + implementation of BLAKE2s, since NEON doesn't really help with it.) + config CRYPTO_AES_ARM tristate "Scalar AES cipher for ARM" select CRYPTO_ALGAPI diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile index b745c17d356fe..5ad1e985a718b 100644 --- a/arch/arm/crypto/Makefile +++ b/arch/arm/crypto/Makefile @@ -9,6 +9,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o +obj-$(CONFIG_CRYPTO_BLAKE2S_ARM) += blake2s-arm.o obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha-neon.o obj-$(CONFIG_CRYPTO_POLY1305_ARM) += poly1305-arm.o obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o @@ -29,6 +30,7 @@ sha256-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha256_neon_glue.o sha256-arm-y := sha256-core.o sha256_glue.o $(sha256-arm-neon-y) sha512-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha512-neon-glue.o sha512-arm-y := sha512-core.o sha512-glue.o $(sha512-arm-neon-y) +blake2s-arm-y := blake2s-core.o blake2s-glue.o sha1-arm-ce-y := sha1-ce-core.o sha1-ce-glue.o sha2-arm-ce-y := sha2-ce-core.o sha2-ce-glue.o aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o diff --git a/arch/arm/crypto/blake2s-core.S b/arch/arm/crypto/blake2s-core.S new file mode 100644 index 0000000000000..bed897e9a181a --- /dev/null +++ b/arch/arm/crypto/blake2s-core.S @@ -0,0 +1,285 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * BLAKE2s digest algorithm, ARM scalar implementation + * + * Copyright 2020 Google LLC + * + * Author: Eric Biggers + */ + +#include + + // Registers used to hold message words temporarily. There aren't + // enough ARM registers to hold the whole message block, so we have to + // load the words on-demand. + M_0 .req r12 + M_1 .req r14 + +// The BLAKE2s initialization vector +.Lblake2s_IV: + .word 0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A + .word 0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19 + +.macro __ldrd a, b, src, offset +#if __LINUX_ARM_ARCH__ >= 6 + ldrd \a, \b, [\src, #\offset] +#else + ldr \a, [\src, #\offset] + ldr \b, [\src, #\offset + 4] +#endif +.endm + +.macro __strd a, b, dst, offset +#if __LINUX_ARM_ARCH__ >= 6 + strd \a, \b, [\dst, #\offset] +#else + str \a, [\dst, #\offset] + str \b, [\dst, #\offset + 4] +#endif +.endm + +// Execute a quarter-round of BLAKE2s by mixing two columns or two diagonals. +// (a0, b0, c0, d0) and (a1, b1, c1, d1) give the registers containing the two +// columns/diagonals. s0-s1 are the word offsets to the message words the first +// column/diagonal needs, and likewise s2-s3 for the second column/diagonal. +// M_0 and M_1 are free to use, and the message block can be found at sp + 32. +// +// Note that to save instructions, the rotations don't happen when the +// pseudocode says they should, but rather they are delayed until the values are +// used. See the comment above _blake2s_round(). +.macro _blake2s_quarterround a0, b0, c0, d0, a1, b1, c1, d1, s0, s1, s2, s3 + + ldr M_0, [sp, #32 + 4 * \s0] + ldr M_1, [sp, #32 + 4 * \s2] + + // a += b + m[blake2s_sigma[r][2*i + 0]]; + add \a0, \a0, \b0, ror #brot + add \a1, \a1, \b1, ror #brot + add \a0, \a0, M_0 + add \a1, \a1, M_1 + + // d = ror32(d ^ a, 16); + eor \d0, \a0, \d0, ror #drot + eor \d1, \a1, \d1, ror #drot + + // c += d; + add \c0, \c0, \d0, ror #16 + add \c1, \c1, \d1, ror #16 + + // b = ror32(b ^ c, 12); + eor \b0, \c0, \b0, ror #brot + eor \b1, \c1, \b1, ror #brot + + ldr M_0, [sp, #32 + 4 * \s1] + ldr M_1, [sp, #32 + 4 * \s3] + + // a += b + m[blake2s_sigma[r][2*i + 1]]; + add \a0, \a0, \b0, ror #12 + add \a1, \a1, \b1, ror #12 + add \a0, \a0, M_0 + add \a1, \a1, M_1 + + // d = ror32(d ^ a, 8); + eor \d0, \a0, \d0, ror#16 + eor \d1, \a1, \d1, ror#16 + + // c += d; + add \c0, \c0, \d0, ror#8 + add \c1, \c1, \d1, ror#8 + + // b = ror32(b ^ c, 7); + eor \b0, \c0, \b0, ror#12 + eor \b1, \c1, \b1, ror#12 +.endm + +// Execute one round of BLAKE2s by updating the state matrix v[0..15]. v[0..9] +// are in r0..r9. The stack pointer points to 8 bytes of scratch space for +// spilling v[8..9], then to v[9..15], then to the message block. r10-r12 and +// r14 are free to use. The macro arguments s0-s15 give the order in which the +// message words are used in this round. +// +// All rotates are performed using the implicit rotate operand accepted by the +// 'add' and 'eor' instructions. This is faster than using explicit rotate +// instructions. To make this work, we allow the values in the second and last +// rows of the BLAKE2s state matrix (rows 'b' and 'd') to temporarily have the +// wrong rotation amount. The rotation amount is then fixed up just in time +// when the values are used. 'brot' is the number of bits the values in row 'b' +// need to be rotated right to arrive at the correct values, and 'drot' +// similarly for row 'd'. (brot, drot) start out as (0, 0) but we make it such +// that they end up as (7, 8) after every round. +.macro _blake2s_round s0, s1, s2, s3, s4, s5, s6, s7, \ + s8, s9, s10, s11, s12, s13, s14, s15 + + // Mix first two columns: + // (v[0], v[4], v[8], v[12]) and (v[1], v[5], v[9], v[13]). + __ldrd r10, r11, sp, 16 // load v[12] and v[13] + _blake2s_quarterround r0, r4, r8, r10, r1, r5, r9, r11, \ + \s0, \s1, \s2, \s3 + __strd r8, r9, sp, 0 + __strd r10, r11, sp, 16 + + // Mix second two columns: + // (v[2], v[6], v[10], v[14]) and (v[3], v[7], v[11], v[15]). + __ldrd r8, r9, sp, 8 // load v[10] and v[11] + __ldrd r10, r11, sp, 24 // load v[14] and v[15] + _blake2s_quarterround r2, r6, r8, r10, r3, r7, r9, r11, \ + \s4, \s5, \s6, \s7 + str r10, [sp, #24] // store v[14] + // v[10], v[11], and v[15] are used below, so no need to store them yet. + + .set brot, 7 + .set drot, 8 + + // Mix first two diagonals: + // (v[0], v[5], v[10], v[15]) and (v[1], v[6], v[11], v[12]). + ldr r10, [sp, #16] // load v[12] + _blake2s_quarterround r0, r5, r8, r11, r1, r6, r9, r10, \ + \s8, \s9, \s10, \s11 + __strd r8, r9, sp, 8 + str r11, [sp, #28] + str r10, [sp, #16] + + // Mix second two diagonals: + // (v[2], v[7], v[8], v[13]) and (v[3], v[4], v[9], v[14]). + __ldrd r8, r9, sp, 0 // load v[8] and v[9] + __ldrd r10, r11, sp, 20 // load v[13] and v[14] + _blake2s_quarterround r2, r7, r8, r10, r3, r4, r9, r11, \ + \s12, \s13, \s14, \s15 + __strd r10, r11, sp, 20 +.endm + +// +// void blake2s_compress_arch(struct blake2s_state *state, +// const u8 *block, size_t nblocks, u32 inc); +// +// Only the first three fields of struct blake2s_state are used: +// u32 h[8]; (inout) +// u32 t[2]; (inout) +// u32 f[2]; (in) +// + .align 5 +ENTRY(blake2s_compress_arch) + push {r0-r2,r4-r11,lr} // keep this an even number + +.Lnext_block: + // r0 is 'state' + // r1 is 'block' + // r3 is 'inc' + + // Load and increment the counter t[0..1]. + __ldrd r10, r11, r0, 32 + adds r10, r10, r3 + adc r11, r11, #0 + __strd r10, r11, r0, 32 + + // _blake2s_round is very short on registers, so copy the message block + // to the stack to save a register during the rounds. This also has the + // advantage that misalignment only needs to be dealt with in one place. + sub sp, sp, #64 + mov r12, sp + tst r1, #3 + bne .Lcopy_block_misaligned + ldmia r1!, {r2-r9} + stmia r12!, {r2-r9} + ldmia r1!, {r2-r9} + stmia r12, {r2-r9} +.Lcopy_block_done: + str r1, [sp, #68] // Update message pointer + + // Calculate v[8..15]. Push v[9..15] onto the stack, and leave space + // for spilling v[8..9]. Leave v[8..9] in r8-r9. + mov r14, r0 // r14 = state + adr r12, .Lblake2s_IV + ldmia r12!, {r8-r9} // load IV[0..1] + __ldrd r0, r1, r14, 40 // load f[0..1] + ldm r12, {r2-r7} // load IV[3..7] + eor r4, r4, r10 // v[12] = IV[4] ^ t[0] + eor r5, r5, r11 // v[13] = IV[5] ^ t[1] + eor r6, r6, r0 // v[14] = IV[6] ^ f[0] + eor r7, r7, r1 // v[15] = IV[7] ^ f[1] + push {r2-r7} // push v[9..15] + sub sp, sp, #8 // leave space for v[8..9] + + // Load h[0..7] == v[0..7]. + ldm r14, {r0-r7} + + // Execute the rounds. Each round is provided the order in which it + // needs to use the message words. + .set brot, 0 + .set drot, 0 + _blake2s_round 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 + _blake2s_round 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 + _blake2s_round 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 + _blake2s_round 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 + _blake2s_round 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 + _blake2s_round 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 + _blake2s_round 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 + _blake2s_round 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 + _blake2s_round 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 + _blake2s_round 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 + + // Fold the final state matrix into the hash chaining value: + // + // for (i = 0; i < 8; i++) + // h[i] ^= v[i] ^ v[i + 8]; + // + ldr r14, [sp, #96] // r14 = &h[0] + add sp, sp, #8 // v[8..9] are already loaded. + pop {r10-r11} // load v[10..11] + eor r0, r0, r8 + eor r1, r1, r9 + eor r2, r2, r10 + eor r3, r3, r11 + ldm r14, {r8-r11} // load h[0..3] + eor r0, r0, r8 + eor r1, r1, r9 + eor r2, r2, r10 + eor r3, r3, r11 + stmia r14!, {r0-r3} // store new h[0..3] + ldm r14, {r0-r3} // load old h[4..7] + pop {r8-r11} // load v[12..15] + eor r0, r0, r4, ror #brot + eor r1, r1, r5, ror #brot + eor r2, r2, r6, ror #brot + eor r3, r3, r7, ror #brot + eor r0, r0, r8, ror #drot + eor r1, r1, r9, ror #drot + eor r2, r2, r10, ror #drot + eor r3, r3, r11, ror #drot + add sp, sp, #64 // skip copy of message block + stm r14, {r0-r3} // store new h[4..7] + + // Advance to the next block, if there is one. Note that if there are + // multiple blocks, then 'inc' (the counter increment amount) must be + // 64. So we can simply set it to 64 without re-loading it. + ldm sp, {r0, r1, r2} // load (state, block, nblocks) + mov r3, #64 // set 'inc' + subs r2, r2, #1 // nblocks-- + str r2, [sp, #8] + bne .Lnext_block // nblocks != 0? + + pop {r0-r2,r4-r11,pc} + + // The next message block (pointed to by r1) isn't 4-byte aligned, so it + // can't be loaded using ldmia. Copy it to the stack buffer (pointed to + // by r12) using an alternative method. r2-r9 are free to use. +.Lcopy_block_misaligned: + mov r2, #64 +1: +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS + ldr r3, [r1], #4 +#else + ldrb r3, [r1, #0] + ldrb r4, [r1, #1] + ldrb r5, [r1, #2] + ldrb r6, [r1, #3] + add r1, r1, #4 + orr r3, r3, r4, lsl #8 + orr r3, r3, r5, lsl #16 + orr r3, r3, r6, lsl #24 +#endif + subs r2, r2, #4 + str r3, [r12], #4 + bne 1b + b .Lcopy_block_done +ENDPROC(blake2s_compress_arch) diff --git a/arch/arm/crypto/blake2s-glue.c b/arch/arm/crypto/blake2s-glue.c new file mode 100644 index 0000000000000..f2cc1e5fc9ec1 --- /dev/null +++ b/arch/arm/crypto/blake2s-glue.c @@ -0,0 +1,78 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * BLAKE2s digest algorithm, ARM scalar implementation + * + * Copyright 2020 Google LLC + */ + +#include +#include + +#include + +/* defined in blake2s-core.S */ +EXPORT_SYMBOL(blake2s_compress_arch); + +static int crypto_blake2s_update_arm(struct shash_desc *desc, + const u8 *in, unsigned int inlen) +{ + return crypto_blake2s_update(desc, in, inlen, blake2s_compress_arch); +} + +static int crypto_blake2s_final_arm(struct shash_desc *desc, u8 *out) +{ + return crypto_blake2s_final(desc, out, blake2s_compress_arch); +} + +#define BLAKE2S_ALG(name, driver_name, digest_size) \ + { \ + .base.cra_name = name, \ + .base.cra_driver_name = driver_name, \ + .base.cra_priority = 200, \ + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \ + .base.cra_blocksize = BLAKE2S_BLOCK_SIZE, \ + .base.cra_ctxsize = sizeof(struct blake2s_tfm_ctx), \ + .base.cra_module = THIS_MODULE, \ + .digestsize = digest_size, \ + .setkey = crypto_blake2s_setkey, \ + .init = crypto_blake2s_init, \ + .update = crypto_blake2s_update_arm, \ + .final = crypto_blake2s_final_arm, \ + .descsize = sizeof(struct blake2s_state), \ + } + +static struct shash_alg blake2s_arm_algs[] = { + BLAKE2S_ALG("blake2s-128", "blake2s-128-arm", BLAKE2S_128_HASH_SIZE), + BLAKE2S_ALG("blake2s-160", "blake2s-160-arm", BLAKE2S_160_HASH_SIZE), + BLAKE2S_ALG("blake2s-224", "blake2s-224-arm", BLAKE2S_224_HASH_SIZE), + BLAKE2S_ALG("blake2s-256", "blake2s-256-arm", BLAKE2S_256_HASH_SIZE), +}; + +static int __init blake2s_arm_mod_init(void) +{ + return IS_REACHABLE(CONFIG_CRYPTO_HASH) ? + crypto_register_shashes(blake2s_arm_algs, + ARRAY_SIZE(blake2s_arm_algs)) : 0; +} + +static void __exit blake2s_arm_mod_exit(void) +{ + if (IS_REACHABLE(CONFIG_CRYPTO_HASH)) + crypto_unregister_shashes(blake2s_arm_algs, + ARRAY_SIZE(blake2s_arm_algs)); +} + +module_init(blake2s_arm_mod_init); +module_exit(blake2s_arm_mod_exit); + +MODULE_DESCRIPTION("BLAKE2s digest algorithm, ARM scalar implementation"); +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Eric Biggers "); +MODULE_ALIAS_CRYPTO("blake2s-128"); +MODULE_ALIAS_CRYPTO("blake2s-128-arm"); +MODULE_ALIAS_CRYPTO("blake2s-160"); +MODULE_ALIAS_CRYPTO("blake2s-160-arm"); +MODULE_ALIAS_CRYPTO("blake2s-224"); +MODULE_ALIAS_CRYPTO("blake2s-224-arm"); +MODULE_ALIAS_CRYPTO("blake2s-256"); +MODULE_ALIAS_CRYPTO("blake2s-256-arm"); From patchwork Wed Dec 23 08:10:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 351510 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E368C4332B for ; Wed, 23 Dec 2020 08:14:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E421E224B0 for ; Wed, 23 Dec 2020 08:14:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728011AbgLWIOQ (ORCPT ); Wed, 23 Dec 2020 03:14:16 -0500 Received: from mail.kernel.org ([198.145.29.99]:46960 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727976AbgLWIOQ (ORCPT ); Wed, 23 Dec 2020 03:14:16 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 1C3B122510; Wed, 23 Dec 2020 08:12:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1608711179; bh=qS5odlNvKzcQ1NLPrL+sF+R4JtS6E6dFmblxMXhMlEI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SkQPMYnRUb4z6xQnbpCKDKenc6E58KB7clteA90Q9gUGF2XAZWw1e17yNcZEio9Oy khoIW0jbfw+D84scqPCSzdlBZYShgWWoWBrs8rYtF6BVE1qXlHWlRZk9w9q3xIwkqQ D0wUj8Sq9xkyoJfts6qR4ExNDBrKDQAVQQRKtOJy4CzWj2Y0vnugoZKQaBFBpNlRRh q7hIitMn8yMM3kTcQB+3ggVnXPkYD+91ZSQF//wpWXFOIufcEQLqycJIzNiLnIvtTy pkGjyUxdDrkYHQzbSVr9jRK1L1pAm6VPd3V63bPewbpemh+KlY8QW2qEz8I0fMf9P2 mqsBm/b4t5P9Q== From: Eric Biggers To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Herbert Xu , David Sterba , "Jason A . Donenfeld" , Paul Crowley Subject: [PATCH v3 13/14] crypto: blake2b - update file comment Date: Wed, 23 Dec 2020 00:10:02 -0800 Message-Id: <20201223081003.373663-14-ebiggers@kernel.org> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201223081003.373663-1-ebiggers@kernel.org> References: <20201223081003.373663-1-ebiggers@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org From: Eric Biggers The file comment for blake2b_generic.c makes it sound like it's the reference implementation of BLAKE2b with only minor changes. But it's actually been changed a lot. Update the comment to make this clearer. Reviewed-by: David Sterba Acked-by: Ard Biesheuvel Signed-off-by: Eric Biggers --- crypto/blake2b_generic.c | 23 ++++++++++------------- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/crypto/blake2b_generic.c b/crypto/blake2b_generic.c index 963f7fe0e4ea8..6704c03558896 100644 --- a/crypto/blake2b_generic.c +++ b/crypto/blake2b_generic.c @@ -1,21 +1,18 @@ // SPDX-License-Identifier: (GPL-2.0-only OR Apache-2.0) /* - * BLAKE2b reference source code package - reference C implementations + * Generic implementation of the BLAKE2b digest algorithm. Based on the BLAKE2b + * reference implementation, but it has been heavily modified for use in the + * kernel. The reference implementation was: * - * Copyright 2012, Samuel Neves . You may use this under the - * terms of the CC0, the OpenSSL Licence, or the Apache Public License 2.0, at - * your option. The terms of these licenses can be found at: + * Copyright 2012, Samuel Neves . You may use this under + * the terms of the CC0, the OpenSSL Licence, or the Apache Public License + * 2.0, at your option. The terms of these licenses can be found at: * - * - CC0 1.0 Universal : http://creativecommons.org/publicdomain/zero/1.0 - * - OpenSSL license : https://www.openssl.org/source/license.html - * - Apache 2.0 : https://www.apache.org/licenses/LICENSE-2.0 + * - CC0 1.0 Universal : http://creativecommons.org/publicdomain/zero/1.0 + * - OpenSSL license : https://www.openssl.org/source/license.html + * - Apache 2.0 : https://www.apache.org/licenses/LICENSE-2.0 * - * More information about the BLAKE2 hash function can be found at - * https://blake2.net. - * - * Note: the original sources have been modified for inclusion in linux kernel - * in terms of coding style, using generic helpers and simplifications of error - * handling. + * More information about BLAKE2 can be found at https://blake2.net. */ #include