From patchwork Fri Oct 18 07:53:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 836846 Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E6CFB18C33E for ; Fri, 18 Oct 2024 07:53:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729238039; cv=none; b=BjSFSJlmYZYE8+V8M0GTWKdAuO5TZf+oKef5RCersl9Gd1dgaZuLPfGM3Ag6eQ3AxvmxHZrzetMBuIoHYgTSJNW5i6jNghHhG1BVQKoEAMVu/498dtql5iMKGhDQeWsc08KeYepcCIrNTG4LIIjtTzePAyWpKJ9yT8cLKBXCXpo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729238039; c=relaxed/simple; bh=oIlFT7Jxrt62NzF5vNKKgkbIj4vzF5UYXMHHr7146nM=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Q/Ek/HcWfa5IoQS0r9XX9XCLl6/d0abQkY53AjPBgV3gvQgz0Bj/yf3GhVtDl/u6KZOjBFjWrrh5h+Yv6TmgcFzU/uUbEwRszx2pqq3zjP7PdaNwdp2ky8NjzkJYMhnboTmH6HU9kt+QAmkgZgC8fajoymGWRlkJ5skGLCt2CEQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=tqnscNkx; arc=none smtp.client-ip=209.85.128.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="tqnscNkx" Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-6e0082c1dd0so40850107b3.3 for ; Fri, 18 Oct 2024 00:53:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1729238037; x=1729842837; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=kezPyuM4InCRolJl6LAlFN5HCzOB9cU6j7/gSS+7X54=; b=tqnscNkxoKI7pHEp+tN7HOUo9G6UP9ddAN4dJasXtG7wjAx+32H0chBZrmv9oVGmZg HxtMZjSEVmg6fIliOu1wGhg7aMFM6fxP8q1xsBs+Lp6apSXjiynOfzX9NuzF7zys6N1F CaRD+yAov6BNymHHKRIzz/Bax6Ov3Aq/wkb14CMtWWSH4DDQzrYLhBXLdCEVPr2RiT6T SLUDzcdC2tcsgOA2mgvOZFHxqHSZsbdzFwtd7OrBGabdbqnPJ0k3VekmOc2DdlT6qJBf YVKwEGo+x9HKzsyF0C3ZCU2QMmSJznG1ZBBs/nuaZBHAnQ9q/Pc3UDH9dYMC4nN6fzCq z9Cw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729238037; x=1729842837; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=kezPyuM4InCRolJl6LAlFN5HCzOB9cU6j7/gSS+7X54=; b=pw75KNLXm3IQW4p17CgMUcFdk0/AjfPB+ygrVas5Q788QH0wmOk9QiMGKt5QDcuNoQ EzDY3MFeUNwuehR8UQv9wy0XjO9nNtZh/XP9aPZ870TryTtRofQ5kC41XI4gh9QTqT9Z JNpPKN7KV7HrZ+hm8yXo2zPsmdBX1vdOsxzQpFHxpOGZxd3txbYpg4S6G547Hyr0N+Ez Ac9ZX74qqDeRWrt0wnS83YByrdXLYkW3wRX+NrO/ml0EYTdXRaK39jgLzErgEuyRxABy B2xCjxmDWbyXKhn9moMuFWkJK0idAJIQgeOshtekJNJIrIjdoRMuUK6jHtsohiHcX9CI Pqew== X-Forwarded-Encrypted: i=1; AJvYcCX08Myuv2rb/J7POLWwzHfTkryfZvMkp/eSZjqCugebs9oBPEvBsScnon8YtFYdjVFqUJFA5oJD2Dw2VRY=@vger.kernel.org X-Gm-Message-State: AOJu0YzlEiFbz3trKWmpblCwg3LK+17sgn2Mp5aZbgWaMmGxbMu8+nXM d5s7b+N3lVP+g1P8ZBUCpECJGLq7Ur7BsI+wUMdLMwRUUGFrde8fXukUDWp80Ohc2sxfPA== X-Google-Smtp-Source: AGHT+IEcEu05+Y+9rilOMhpbogGpXWKCrEoZFUDBzcsp9tfYjFlk5BqaMhJ/zAVq1ZWuCm5bsQ87jtBt X-Received: from palermo.c.googlers.com ([fda3:e722:ac3:cc00:7b:198d:ac11:8138]) (user=ardb job=sendgmr) by 2002:a05:690c:4487:b0:6db:54ae:fd0f with SMTP id 00721157ae682-6e5bfd8eba6mr49127b3.7.1729238036892; Fri, 18 Oct 2024 00:53:56 -0700 (PDT) Date: Fri, 18 Oct 2024 09:53:49 +0200 In-Reply-To: <20241018075347.2821102-5-ardb+git@google.com> Precedence: bulk X-Mailing-List: linux-crypto@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241018075347.2821102-5-ardb+git@google.com> X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 X-Developer-Signature: v=1; a=openpgp-sha256; l=3254; i=ardb@kernel.org; h=from:subject; bh=BrBF0xPC0fiK7oW+r+6zBvbfuoNaGmKqDl154e11vl8=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIV1IhDeWv2XSDrWNN11YuBeoGyR84Pv0d3pduKiC3t01H vO3+i/vKGVhEONgkBVTZBGY/ffdztMTpWqdZ8nCzGFlAhnCwMUpABOpE2dkWHy3ObZ23s28S9pc meZeV/awTZi+Ib5kkueRWYmVHxYv6mdkuJqRzbFtsfe7S+lvAvZPaGpmf238pj+Q9d71x0zL/20 4zg0A X-Mailer: git-send-email 2.47.0.rc1.288.g06298d1525-goog Message-ID: <20241018075347.2821102-6-ardb+git@google.com> Subject: [PATCH v4 1/3] arm64/lib: Handle CRC-32 alternative in C code From: Ard Biesheuvel To: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au, will@kernel.org, catalin.marinas@arm.com, Ard Biesheuvel , Eric Biggers , Kees Cook , Eric Biggers From: Ard Biesheuvel In preparation for adding another code path for performing CRC-32, move the alternative patching for ARM64_HAS_CRC32 into C code. The logic for deciding whether to use this new code path will be implemented in C too. Reviewed-by: Eric Biggers Signed-off-by: Ard Biesheuvel --- arch/arm64/lib/Makefile | 2 +- arch/arm64/lib/crc32-glue.c | 34 ++++++++++++++++++++ arch/arm64/lib/crc32.S | 22 ++++--------- 3 files changed, 41 insertions(+), 17 deletions(-) diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile index 13e6a2829116..8e882f479d98 100644 --- a/arch/arm64/lib/Makefile +++ b/arch/arm64/lib/Makefile @@ -13,7 +13,7 @@ endif lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o -obj-$(CONFIG_CRC32) += crc32.o +obj-$(CONFIG_CRC32) += crc32.o crc32-glue.o obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o diff --git a/arch/arm64/lib/crc32-glue.c b/arch/arm64/lib/crc32-glue.c new file mode 100644 index 000000000000..0b51761d4b75 --- /dev/null +++ b/arch/arm64/lib/crc32-glue.c @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include + +#include + +asmlinkage u32 crc32_le_arm64(u32 crc, unsigned char const *p, size_t len); +asmlinkage u32 crc32c_le_arm64(u32 crc, unsigned char const *p, size_t len); +asmlinkage u32 crc32_be_arm64(u32 crc, unsigned char const *p, size_t len); + +u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len) +{ + if (!alternative_has_cap_likely(ARM64_HAS_CRC32)) + return crc32_le_base(crc, p, len); + + return crc32_le_arm64(crc, p, len); +} + +u32 __pure __crc32c_le(u32 crc, unsigned char const *p, size_t len) +{ + if (!alternative_has_cap_likely(ARM64_HAS_CRC32)) + return __crc32c_le_base(crc, p, len); + + return crc32c_le_arm64(crc, p, len); +} + +u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len) +{ + if (!alternative_has_cap_likely(ARM64_HAS_CRC32)) + return crc32_be_base(crc, p, len); + + return crc32_be_arm64(crc, p, len); +} diff --git a/arch/arm64/lib/crc32.S b/arch/arm64/lib/crc32.S index 8340dccff46f..22139691c7ae 100644 --- a/arch/arm64/lib/crc32.S +++ b/arch/arm64/lib/crc32.S @@ -6,7 +6,6 @@ */ #include -#include #include .arch armv8-a+crc @@ -136,25 +135,16 @@ CPU_BE( rev16 \reg, \reg ) .endm .align 5 -SYM_FUNC_START(crc32_le) -alternative_if_not ARM64_HAS_CRC32 - b crc32_le_base -alternative_else_nop_endif +SYM_FUNC_START(crc32_le_arm64) __crc32 -SYM_FUNC_END(crc32_le) +SYM_FUNC_END(crc32_le_arm64) .align 5 -SYM_FUNC_START(__crc32c_le) -alternative_if_not ARM64_HAS_CRC32 - b __crc32c_le_base -alternative_else_nop_endif +SYM_FUNC_START(crc32c_le_arm64) __crc32 c -SYM_FUNC_END(__crc32c_le) +SYM_FUNC_END(crc32c_le_arm64) .align 5 -SYM_FUNC_START(crc32_be) -alternative_if_not ARM64_HAS_CRC32 - b crc32_be_base -alternative_else_nop_endif +SYM_FUNC_START(crc32_be_arm64) __crc32 be=1 -SYM_FUNC_END(crc32_be) +SYM_FUNC_END(crc32_be_arm64) From patchwork Fri Oct 18 07:53:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 837219 Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2237918C92A for ; Fri, 18 Oct 2024 07:53:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729238041; cv=none; b=Y8VyRqpOUA2AiKpuJp3WOZofFiP+iTwiEEhvBUBKEkpa9SCazL/R3C3OUOMYMumqaqdkX488p2WohJbfmH+hzylRrCyKCVWphDPQhJzzqXYRNbv9oWu48J2JXU6gYQYcLAgkWfKKcp5IH0hJDf1c18olAuW952lEzkSeTd3/USY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729238041; c=relaxed/simple; bh=Jlg+6uPR7/7OaiOrAMoZ+JGT2+27o/oDT1dKUQsADAU=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=O5oTn3F8PNr2HYg8p1JDjt90CQ7yyT0soarMONUjF2MfWfYiImBianUN3XjCFgv+IsT2IqquB0eCYdxa9CNBU9aCcMk2ogINRbPiXADNMXWJgvUccAXYJP1/KPqV6xm6iFtRWg5DoPjKUR/ZVtaS3s8IUuncKtpVt2ZNKIlylCg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=B80xh7EL; arc=none smtp.client-ip=209.85.128.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="B80xh7EL" Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-6e3529d80e5so31142497b3.0 for ; Fri, 18 Oct 2024 00:53:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1729238039; x=1729842839; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=tNwrKgHpfeEPpiezt/Iqbr3KcELj8yx0Qy4sFreW/+w=; b=B80xh7EL+H5JqWvcEjOL/xTlJxqlhqFEze2WeWbMzugZvJTSR45KRLK0ardUcNc1AC dP7kQN1bB+z071Qcg43BSKExTiIEAQesIaR/G6BS1KZD+xaqX+BKL8AV8FhsHOjiyaCG aVEjhpU/BuEjWgSOgHIn56xfIPAi3xNiV1+MFkEBEaDtJ5j6wG2U3HIEWcqwqhPrHSSA e7LgGsO6IUbrNfutruE0en8S7PsaYoZhQmYCzMYeDUlN/iW9EwpTksFuYGz4fCcv/kGg 76pGfzD3CAp7fJUfalyaxLI+hzdkHtBPQYgee5jwTMg1A/Br1fuQC175JgAtb9zKNyeo X+Yg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729238039; x=1729842839; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=tNwrKgHpfeEPpiezt/Iqbr3KcELj8yx0Qy4sFreW/+w=; b=ZEdiXCJzLKweQYglbXOJxIsCiv8CkF9RC7N1ENJcA8w03pR13qZc/6FGVrVk7oHjZ3 V7R51RAD5snXwcwN6LAJj7YFokns17qr6uRpJe1ERw1EWCjyXifsFRqHDTbUey43ZSwb jMIfdhPtwphlNEPALztoJRiXb2Pp/Foi/FZtDxITrlo/nQER9beBWAG7f77lQnkhrpKB JwbDKrzhRE3UgFkclTCA7ybxQY2UTOe6iio9z6Xftcrox1CLcYDAKY3XzNvYvWtVQ8T0 NlXQLZ43K5/Q4lvH0Ww25d8VkwdAogjpdfetiEC1bUopoXadWveuCiNEcmoohUOpaCxo uenQ== X-Forwarded-Encrypted: i=1; AJvYcCVxg+vPrh9sjv59M//npnX1NE8BHv17dCzKcBmCox3WG/VL+WXcACHHIdfi8gM7Ff939xE8lUa/Q6OeNAk=@vger.kernel.org X-Gm-Message-State: AOJu0YxN9hSNn7YtZPG2h9ZgrcZKODwozTj8x97kQ3vMcxe1J+UPFqVG VZgPCRUGhG9Vmvfh+CWSpdb/Ai2G3aUmSidhky+gbNjhoPjhWarRWrytK/+fbb39b2ypyA== X-Google-Smtp-Source: AGHT+IEe29XOvyn9WDi9xAEfZbcgksmYHhNGnVdSc/NhMxBAycrneIeqDFrR9WDdNM7rvT4ebtva2h+o X-Received: from palermo.c.googlers.com ([fda3:e722:ac3:cc00:7b:198d:ac11:8138]) (user=ardb job=sendgmr) by 2002:a05:690c:368c:b0:6e2:1b8c:39bf with SMTP id 00721157ae682-6e5bfbe2e35mr374097b3.2.1729238039135; Fri, 18 Oct 2024 00:53:59 -0700 (PDT) Date: Fri, 18 Oct 2024 09:53:50 +0200 In-Reply-To: <20241018075347.2821102-5-ardb+git@google.com> Precedence: bulk X-Mailing-List: linux-crypto@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241018075347.2821102-5-ardb+git@google.com> X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 X-Developer-Signature: v=1; a=openpgp-sha256; l=3357; i=ardb@kernel.org; h=from:subject; bh=mLKqxqTEQhRSUiFWy6NyWFEr4AEp2njT3OggUKTWlcU=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIV1IhK+mUG3X1MInF46ZVaW39MzS7Dc+PettVdqLe0vYD kyfxqzZUcrCIMbBICumyCIw+++7nacnStU6z5KFmcPKBDKEgYtTACby6jXDP91/7D2zGxc7ilVI lTPPOHkmLbhXvmWVjXix0Tk7hmkNcxj+h/rmCu6d8Jl5zcKE+y/Uj9VPSmN6dH2N0Z5/wtzTT7m 2cQMA X-Mailer: git-send-email 2.47.0.rc1.288.g06298d1525-goog Message-ID: <20241018075347.2821102-7-ardb+git@google.com> Subject: [PATCH v4 2/3] arm64/crc32: Reorganize bit/byte ordering macros From: Ard Biesheuvel To: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au, will@kernel.org, catalin.marinas@arm.com, Ard Biesheuvel , Eric Biggers , Kees Cook From: Ard Biesheuvel In preparation for a new user, reorganize the bit/byte ordering macros that are used to parameterize the crc32 template code and instantiate CRC-32, CRC-32c and 'big endian' CRC-32. Signed-off-by: Ard Biesheuvel --- arch/arm64/lib/crc32.S | 91 +++++++++----------- 1 file changed, 39 insertions(+), 52 deletions(-) diff --git a/arch/arm64/lib/crc32.S b/arch/arm64/lib/crc32.S index 22139691c7ae..f9920492f135 100644 --- a/arch/arm64/lib/crc32.S +++ b/arch/arm64/lib/crc32.S @@ -10,44 +10,48 @@ .arch armv8-a+crc - .macro byteorder, reg, be - .if \be -CPU_LE( rev \reg, \reg ) - .else -CPU_BE( rev \reg, \reg ) - .endif + .macro bitle, reg .endm - .macro byteorder16, reg, be - .if \be -CPU_LE( rev16 \reg, \reg ) - .else -CPU_BE( rev16 \reg, \reg ) - .endif + .macro bitbe, reg + rbit \reg, \reg .endm - .macro bitorder, reg, be - .if \be - rbit \reg, \reg - .endif + .macro bytele, reg .endm - .macro bitorder16, reg, be - .if \be + .macro bytebe, reg rbit \reg, \reg - lsr \reg, \reg, #16 - .endif + lsr \reg, \reg, #24 + .endm + + .macro hwordle, reg +CPU_BE( rev16 \reg, \reg ) .endm - .macro bitorder8, reg, be - .if \be + .macro hwordbe, reg +CPU_LE( rev \reg, \reg ) rbit \reg, \reg - lsr \reg, \reg, #24 - .endif +CPU_BE( lsr \reg, \reg, #16 ) + .endm + + .macro le, regs:vararg + .irp r, \regs +CPU_BE( rev \r, \r ) + .endr + .endm + + .macro be, regs:vararg + .irp r, \regs +CPU_LE( rev \r, \r ) + .endr + .irp r, \regs + rbit \r, \r + .endr .endm - .macro __crc32, c, be=0 - bitorder w0, \be + .macro __crc32, c, order=le + bit\order w0 cmp x2, #16 b.lt 8f // less than 16 bytes @@ -60,14 +64,7 @@ CPU_BE( rev16 \reg, \reg ) add x8, x8, x1 add x1, x1, x7 ldp x5, x6, [x8] - byteorder x3, \be - byteorder x4, \be - byteorder x5, \be - byteorder x6, \be - bitorder x3, \be - bitorder x4, \be - bitorder x5, \be - bitorder x6, \be + \order x3, x4, x5, x6 tst x7, #8 crc32\c\()x w8, w0, x3 @@ -95,42 +92,32 @@ CPU_BE( rev16 \reg, \reg ) 32: ldp x3, x4, [x1], #32 sub x2, x2, #32 ldp x5, x6, [x1, #-16] - byteorder x3, \be - byteorder x4, \be - byteorder x5, \be - byteorder x6, \be - bitorder x3, \be - bitorder x4, \be - bitorder x5, \be - bitorder x6, \be + \order x3, x4, x5, x6 crc32\c\()x w0, w0, x3 crc32\c\()x w0, w0, x4 crc32\c\()x w0, w0, x5 crc32\c\()x w0, w0, x6 cbnz x2, 32b -0: bitorder w0, \be +0: bit\order w0 ret 8: tbz x2, #3, 4f ldr x3, [x1], #8 - byteorder x3, \be - bitorder x3, \be + \order x3 crc32\c\()x w0, w0, x3 4: tbz x2, #2, 2f ldr w3, [x1], #4 - byteorder w3, \be - bitorder w3, \be + \order w3 crc32\c\()w w0, w0, w3 2: tbz x2, #1, 1f ldrh w3, [x1], #2 - byteorder16 w3, \be - bitorder16 w3, \be + hword\order w3 crc32\c\()h w0, w0, w3 1: tbz x2, #0, 0f ldrb w3, [x1] - bitorder8 w3, \be + byte\order w3 crc32\c\()b w0, w0, w3 -0: bitorder w0, \be +0: bit\order w0 ret .endm @@ -146,5 +133,5 @@ SYM_FUNC_END(crc32c_le_arm64) .align 5 SYM_FUNC_START(crc32_be_arm64) - __crc32 be=1 + __crc32 order=be SYM_FUNC_END(crc32_be_arm64) From patchwork Fri Oct 18 07:53:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 836845 Received: from mail-wm1-f73.google.com (mail-wm1-f73.google.com [209.85.128.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 495BD18CBFE for ; Fri, 18 Oct 2024 07:54:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729238045; cv=none; b=MubunVZgTcv0epA5Gb74A8dd33ZIqwkeQ2XskD7metqgnG6kmavH/1V+KTJFe0sxTRWnOSsXudcy1pkx2gKE5QrOEa1V510rIMUikHecdgAQz6glAefpjTmPRX8qteDkyl4dvc0a1d4/3M46mua5OFFlVIE/3EaJYJsZp5wf3+k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729238045; c=relaxed/simple; bh=8zYKmJNjygwOMuAiCDaGQaWCy++42Kn8YoUjmioLzcE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=XN3IgrGlXbWwJr2N8RszI3g2bjmtx3glJU9KIz7cqfixwthMPNkzWfQAm+4y19ozl/JuarbfkAKvgFvwM1R/gsKZoroLWZf789RJo8+JHkxNXLwQ93IKa98yPQAQTotv2zGSJCps6Sn+lh2pu/AkDNMBhKmwbvqdINmqUT6XI/o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=3OpXelTP; arc=none smtp.client-ip=209.85.128.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="3OpXelTP" Received: by mail-wm1-f73.google.com with SMTP id 5b1f17b1804b1-43151a9ea95so13359005e9.1 for ; Fri, 18 Oct 2024 00:54:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1729238041; x=1729842841; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=p2DfE++i5UbLneVMcP8Emct3Ey/R8+tu/KecJg2ueJ8=; b=3OpXelTP2AnzKVDRqokWQUinVCyZe4q3jNrebnXcaTLaAMWkBmu3KT1be3qX0Pf8gs ubpwXwgBTA/3W9M7IMdbpedy7sC+sJu2qvaoGub+59P2Z19A1pzGQ6GXWz6x3QzgpGVH 5mjdsTVglEwVCMz6dCEAo2d8xe39dwvSg1RyDi/0wgguHOF7y5ccIBSnIZPhGfJyhgdG 4m/rZx9g2nslnG5NceulPU99Yq0Ow2WeCDOWY2dz7LLkqML8hgdsJTgR/ZxBUztH0hwk 3Rx+5w+wMasZKoIwFsqaO8a11ed9oH38BroCxjEJrIwS+2vipMOFS2VraOL07vNe64kQ YzKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729238041; x=1729842841; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=p2DfE++i5UbLneVMcP8Emct3Ey/R8+tu/KecJg2ueJ8=; b=JZWoFXKtnSRaXJsh3ZeGCCQJnc+sO2AOgwe72aCQvDx3JB+NpX27wHBx7K/+T2SXax nxe5Iu0tKXGr7clFU/5hcLbPirgMk7/CP1EN+n3nTwWAZLQmL596BtMAdxrXd0HYCKgM 4R7j9INEdg4XbruPm5r1hHvtP6VHqNBdqymcFN2RGN6JiLQ5nyt8QDZtwCqMroNJie3t zDEP3zXimjjTAZFolXnmyxVrxhODEssLI65K6BEaYRC6G+RzAi6g5gVS0cVAs7FSoiHg Bceq4E7ae9OzZOYzjXJf0b6UQNfdj6QMg9gAL266XB6mRzA4tB3vInR2FOEzsJkTnL/i hm+g== X-Forwarded-Encrypted: i=1; AJvYcCXiH2oMR40mqPbfzM2gNK4x0aqUkmpY9aUWiV8cSXZeEt/U+gMEv3QLqWbMzrb9tk4MrI/ZXso6UCtjTt8=@vger.kernel.org X-Gm-Message-State: AOJu0Yw1mYfXm2IFf9fbtA7skJvu4NqFfcGGBldMmVmH5DLLnuhhhbkk N7XUTJgqPXBhkznPE+x9R3s4Kx6ZKmv715b15JtnIuWghqb4793pCUCjbKH+txqCVAK6cg== X-Google-Smtp-Source: AGHT+IGprioily34Rw5wEVc8KJpJ/e2dL5gSfhlYAIlyD8U4HYi+og3DaBnLXkiTFJBK5a3Eq0Ly+UPi X-Received: from palermo.c.googlers.com ([fda3:e722:ac3:cc00:7b:198d:ac11:8138]) (user=ardb job=sendgmr) by 2002:adf:b609:0:b0:374:badf:9b16 with SMTP id ffacd0b85a97d-37ecee1197emr1427f8f.0.1729238041450; Fri, 18 Oct 2024 00:54:01 -0700 (PDT) Date: Fri, 18 Oct 2024 09:53:51 +0200 In-Reply-To: <20241018075347.2821102-5-ardb+git@google.com> Precedence: bulk X-Mailing-List: linux-crypto@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241018075347.2821102-5-ardb+git@google.com> X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 X-Developer-Signature: v=1; a=openpgp-sha256; l=12381; i=ardb@kernel.org; h=from:subject; bh=je4FqzSfL7MXWIoBlKaqML5+ykG0g5UEV0cb/756WYo=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIV1IhO8/0+1cX2PLA/OrWDiSC6tO6L02eyJc0y23KiTlw 9Hg6qUdpSwMYhwMsmKKLAKz/77beXqiVK3zLFmYOaxMIEMYuDgFYCL5fYwMKy9Pn8rh8tXcY+IG W0/bg1U6+htbjFOybsjFCq8odjjrzMjQeTcxzletlNPR58gKTQ1dWwt2dY7nzOqT3tcJ7nlkFsw BAA== X-Mailer: git-send-email 2.47.0.rc1.288.g06298d1525-goog Message-ID: <20241018075347.2821102-8-ardb+git@google.com> Subject: [PATCH v4 3/3] arm64/crc32: Implement 4-way interleave using PMULL From: Ard Biesheuvel To: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au, will@kernel.org, catalin.marinas@arm.com, Ard Biesheuvel , Eric Biggers , Kees Cook From: Ard Biesheuvel Now that kernel mode NEON no longer disables preemption, using FP/SIMD in library code which is not obviously part of the crypto subsystem is no longer problematic, as it will no longer incur unexpected latencies. So accelerate the CRC-32 library code on arm64 to use a 4-way interleave, using PMULL instructions to implement the folding. On Apple M2, this results in a speedup of 2 - 2.8x when using input sizes of 1k - 8k. For smaller sizes, the overhead of preserving and restoring the FP/SIMD register file may not be worth it, so 1k is used as a threshold for choosing this code path. The coefficient tables were generated using code provided by Eric. [0] [0] https://github.com/ebiggers/libdeflate/blob/master/scripts/gen_crc32_multipliers.c Cc: Eric Biggers Signed-off-by: Ard Biesheuvel --- arch/arm64/lib/crc32-glue.c | 48 ++++ arch/arm64/lib/crc32.S | 231 +++++++++++++++++++- 2 files changed, 276 insertions(+), 3 deletions(-) diff --git a/arch/arm64/lib/crc32-glue.c b/arch/arm64/lib/crc32-glue.c index 0b51761d4b75..295ae3e6b997 100644 --- a/arch/arm64/lib/crc32-glue.c +++ b/arch/arm64/lib/crc32-glue.c @@ -4,16 +4,40 @@ #include #include +#include +#include +#include + +#include + +// The minimum input length to consider the 4-way interleaved code path +static const size_t min_len = 1024; asmlinkage u32 crc32_le_arm64(u32 crc, unsigned char const *p, size_t len); asmlinkage u32 crc32c_le_arm64(u32 crc, unsigned char const *p, size_t len); asmlinkage u32 crc32_be_arm64(u32 crc, unsigned char const *p, size_t len); +asmlinkage u32 crc32_le_arm64_4way(u32 crc, unsigned char const *p, size_t len); +asmlinkage u32 crc32c_le_arm64_4way(u32 crc, unsigned char const *p, size_t len); +asmlinkage u32 crc32_be_arm64_4way(u32 crc, unsigned char const *p, size_t len); + u32 __pure crc32_le(u32 crc, unsigned char const *p, size_t len) { if (!alternative_has_cap_likely(ARM64_HAS_CRC32)) return crc32_le_base(crc, p, len); + if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) { + kernel_neon_begin(); + crc = crc32_le_arm64_4way(crc, p, len); + kernel_neon_end(); + + p += round_down(len, 64); + len %= 64; + + if (!len) + return crc; + } + return crc32_le_arm64(crc, p, len); } @@ -22,6 +46,18 @@ u32 __pure __crc32c_le(u32 crc, unsigned char const *p, size_t len) if (!alternative_has_cap_likely(ARM64_HAS_CRC32)) return __crc32c_le_base(crc, p, len); + if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) { + kernel_neon_begin(); + crc = crc32c_le_arm64_4way(crc, p, len); + kernel_neon_end(); + + p += round_down(len, 64); + len %= 64; + + if (!len) + return crc; + } + return crc32c_le_arm64(crc, p, len); } @@ -30,5 +66,17 @@ u32 __pure crc32_be(u32 crc, unsigned char const *p, size_t len) if (!alternative_has_cap_likely(ARM64_HAS_CRC32)) return crc32_be_base(crc, p, len); + if (len >= min_len && cpu_have_named_feature(PMULL) && crypto_simd_usable()) { + kernel_neon_begin(); + crc = crc32_be_arm64_4way(crc, p, len); + kernel_neon_end(); + + p += round_down(len, 64); + len %= 64; + + if (!len) + return crc; + } + return crc32_be_arm64(crc, p, len); } diff --git a/arch/arm64/lib/crc32.S b/arch/arm64/lib/crc32.S index f9920492f135..68825317460f 100644 --- a/arch/arm64/lib/crc32.S +++ b/arch/arm64/lib/crc32.S @@ -1,14 +1,17 @@ /* SPDX-License-Identifier: GPL-2.0-only */ /* - * Accelerated CRC32(C) using AArch64 CRC instructions + * Accelerated CRC32(C) using AArch64 CRC and PMULL instructions * - * Copyright (C) 2016 - 2018 Linaro Ltd + * Copyright (C) 2016 - 2018 Linaro Ltd. + * Copyright (C) 2024 Google LLC + * + * Author: Ard Biesheuvel */ #include #include - .arch armv8-a+crc + .cpu generic+crc+crypto .macro bitle, reg .endm @@ -135,3 +138,225 @@ SYM_FUNC_END(crc32c_le_arm64) SYM_FUNC_START(crc32_be_arm64) __crc32 order=be SYM_FUNC_END(crc32_be_arm64) + + in .req x1 + len .req x2 + + /* + * w0: input CRC at entry, output CRC at exit + * x1: pointer to input buffer + * x2: length of input in bytes + */ + .macro crc4way, insn, table, order=le + bit\order w0 + lsr len, len, #6 // len := # of 64-byte blocks + + /* Process up to 64 blocks of 64 bytes at a time */ +.La\@: mov x3, #64 + cmp len, #64 + csel x3, x3, len, hi // x3 := min(len, 64) + sub len, len, x3 + + /* Divide the input into 4 contiguous blocks */ + add x4, x3, x3, lsl #1 // x4 := 3 * x3 + add x7, in, x3, lsl #4 // x7 := in + 16 * x3 + add x8, in, x3, lsl #5 // x8 := in + 32 * x3 + add x9, in, x4, lsl #4 // x9 := in + 16 * x4 + + /* Load the folding coefficients from the lookup table */ + adr_l x5, \table - 12 // entry 0 omitted + add x5, x5, x4, lsl #2 // x5 += 12 * x3 + ldp s0, s1, [x5] + ldr s2, [x5, #8] + + /* Zero init partial CRCs for this iteration */ + mov w4, wzr + mov w5, wzr + mov w6, wzr + mov x17, xzr + +.Lb\@: sub x3, x3, #1 + \insn w6, w6, x17 + ldp x10, x11, [in], #16 + ldp x12, x13, [x7], #16 + ldp x14, x15, [x8], #16 + ldp x16, x17, [x9], #16 + + \order x10, x11, x12, x13, x14, x15, x16, x17 + + /* Apply the CRC transform to 4 16-byte blocks in parallel */ + \insn w0, w0, x10 + \insn w4, w4, x12 + \insn w5, w5, x14 + \insn w6, w6, x16 + \insn w0, w0, x11 + \insn w4, w4, x13 + \insn w5, w5, x15 + cbnz x3, .Lb\@ + + /* Combine the 4 partial results into w0 */ + mov v3.d[0], x0 + mov v4.d[0], x4 + mov v5.d[0], x5 + pmull v0.1q, v0.1d, v3.1d + pmull v1.1q, v1.1d, v4.1d + pmull v2.1q, v2.1d, v5.1d + eor v0.8b, v0.8b, v1.8b + eor v0.8b, v0.8b, v2.8b + mov x5, v0.d[0] + eor x5, x5, x17 + \insn w0, w6, x5 + + mov in, x9 + cbnz len, .La\@ + + bit\order w0 + ret + .endm + + .align 5 +SYM_FUNC_START(crc32c_le_arm64_4way) + crc4way crc32cx, .L0 +SYM_FUNC_END(crc32c_le_arm64_4way) + + .align 5 +SYM_FUNC_START(crc32_le_arm64_4way) + crc4way crc32x, .L1 +SYM_FUNC_END(crc32_le_arm64_4way) + + .align 5 +SYM_FUNC_START(crc32_be_arm64_4way) + crc4way crc32x, .L1, be +SYM_FUNC_END(crc32_be_arm64_4way) + + .section .rodata, "a", %progbits + .align 6 +.L0: .long 0xddc0152b, 0xba4fc28e, 0x493c7d27 + .long 0x0715ce53, 0x9e4addf8, 0xba4fc28e + .long 0xc96cfdc0, 0x0715ce53, 0xddc0152b + .long 0xab7aff2a, 0x0d3b6092, 0x9e4addf8 + .long 0x299847d5, 0x878a92a7, 0x39d3b296 + .long 0xb6dd949b, 0xab7aff2a, 0x0715ce53 + .long 0xa60ce07b, 0x83348832, 0x47db8317 + .long 0xd270f1a2, 0xb9e02b86, 0x0d3b6092 + .long 0x65863b64, 0xb6dd949b, 0xc96cfdc0 + .long 0xb3e32c28, 0xbac2fd7b, 0x878a92a7 + .long 0xf285651c, 0xce7f39f4, 0xdaece73e + .long 0x271d9844, 0xd270f1a2, 0xab7aff2a + .long 0x6cb08e5c, 0x2b3cac5d, 0x2162d385 + .long 0xcec3662e, 0x1b03397f, 0x83348832 + .long 0x8227bb8a, 0xb3e32c28, 0x299847d5 + .long 0xd7a4825c, 0xdd7e3b0c, 0xb9e02b86 + .long 0xf6076544, 0x10746f3c, 0x18b33a4e + .long 0x98d8d9cb, 0x271d9844, 0xb6dd949b + .long 0x57a3d037, 0x93a5f730, 0x78d9ccb7 + .long 0x3771e98f, 0x6b749fb2, 0xbac2fd7b + .long 0xe0ac139e, 0xcec3662e, 0xa60ce07b + .long 0x6f345e45, 0xe6fc4e6a, 0xce7f39f4 + .long 0xa2b73df1, 0xb0cd4768, 0x61d82e56 + .long 0x86d8e4d2, 0xd7a4825c, 0xd270f1a2 + .long 0xa90fd27a, 0x0167d312, 0xc619809d + .long 0xca6ef3ac, 0x26f6a60a, 0x2b3cac5d + .long 0x4597456a, 0x98d8d9cb, 0x65863b64 + .long 0xc9c8b782, 0x68bce87a, 0x1b03397f + .long 0x62ec6c6d, 0x6956fc3b, 0xebb883bd + .long 0x2342001e, 0x3771e98f, 0xb3e32c28 + .long 0xe8b6368b, 0x2178513a, 0x064f7f26 + .long 0x9ef68d35, 0x170076fa, 0xdd7e3b0c + .long 0x0b0bf8ca, 0x6f345e45, 0xf285651c + .long 0x02ee03b2, 0xff0dba97, 0x10746f3c + .long 0x135c83fd, 0xf872e54c, 0xc7a68855 + .long 0x00bcf5f6, 0x86d8e4d2, 0x271d9844 + .long 0x58ca5f00, 0x5bb8f1bc, 0x8e766a0c + .long 0xded288f8, 0xb3af077a, 0x93a5f730 + .long 0x37170390, 0xca6ef3ac, 0x6cb08e5c + .long 0xf48642e9, 0xdd66cbbb, 0x6b749fb2 + .long 0xb25b29f2, 0xe9e28eb4, 0x1393e203 + .long 0x45cddf4e, 0xc9c8b782, 0xcec3662e + .long 0xdfd94fb2, 0x93e106a4, 0x96c515bb + .long 0x021ac5ef, 0xd813b325, 0xe6fc4e6a + .long 0x8e1450f7, 0x2342001e, 0x8227bb8a + .long 0xe0cdcf86, 0x6d9a4957, 0xb0cd4768 + .long 0x613eee91, 0xd2c3ed1a, 0x39c7ff35 + .long 0xbedc6ba1, 0x9ef68d35, 0xd7a4825c + .long 0x0cd1526a, 0xf2271e60, 0x0ab3844b + .long 0xd6c3a807, 0x2664fd8b, 0x0167d312 + .long 0x1d31175f, 0x02ee03b2, 0xf6076544 + .long 0x4be7fd90, 0x363bd6b3, 0x26f6a60a + .long 0x6eeed1c9, 0x5fabe670, 0xa741c1bf + .long 0xb3a6da94, 0x00bcf5f6, 0x98d8d9cb + .long 0x2e7d11a7, 0x17f27698, 0x49c3cc9c + .long 0x889774e1, 0xaa7c7ad5, 0x68bce87a + .long 0x8a074012, 0xded288f8, 0x57a3d037 + .long 0xbd0bb25f, 0x6d390dec, 0x6956fc3b + .long 0x3be3c09b, 0x6353c1cc, 0x42d98888 + .long 0x465a4eee, 0xf48642e9, 0x3771e98f + .long 0x2e5f3c8c, 0xdd35bc8d, 0xb42ae3d9 + .long 0xa52f58ec, 0x9a5ede41, 0x2178513a + .long 0x47972100, 0x45cddf4e, 0xe0ac139e + .long 0x359674f7, 0xa51b6135, 0x170076fa + +.L1: .long 0xaf449247, 0x81256527, 0xccaa009e + .long 0x57c54819, 0x1d9513d7, 0x81256527 + .long 0x3f41287a, 0x57c54819, 0xaf449247 + .long 0xf5e48c85, 0x910eeec1, 0x1d9513d7 + .long 0x1f0c2cdd, 0x9026d5b1, 0xae0b5394 + .long 0x71d54a59, 0xf5e48c85, 0x57c54819 + .long 0x1c63267b, 0xfe807bbd, 0x0cbec0ed + .long 0xd31343ea, 0xe95c1271, 0x910eeec1 + .long 0xf9d9c7ee, 0x71d54a59, 0x3f41287a + .long 0x9ee62949, 0xcec97417, 0x9026d5b1 + .long 0xa55d1514, 0xf183c71b, 0xd1df2327 + .long 0x21aa2b26, 0xd31343ea, 0xf5e48c85 + .long 0x9d842b80, 0xeea395c4, 0x3c656ced + .long 0xd8110ff1, 0xcd669a40, 0xfe807bbd + .long 0x3f9e9356, 0x9ee62949, 0x1f0c2cdd + .long 0x1d6708a0, 0x0c30f51d, 0xe95c1271 + .long 0xef82aa68, 0xdb3935ea, 0xb918a347 + .long 0xd14bcc9b, 0x21aa2b26, 0x71d54a59 + .long 0x99cce860, 0x356d209f, 0xff6f2fc2 + .long 0xd8af8e46, 0xc352f6de, 0xcec97417 + .long 0xf1996890, 0xd8110ff1, 0x1c63267b + .long 0x631bc508, 0xe95c7216, 0xf183c71b + .long 0x8511c306, 0x8e031a19, 0x9b9bdbd0 + .long 0xdb3839f3, 0x1d6708a0, 0xd31343ea + .long 0x7a92fffb, 0xf7003835, 0x4470ac44 + .long 0x6ce68f2a, 0x00eba0c8, 0xeea395c4 + .long 0x4caaa263, 0xd14bcc9b, 0xf9d9c7ee + .long 0xb46f7cff, 0x9a1b53c8, 0xcd669a40 + .long 0x60290934, 0x81b6f443, 0x6d40f445 + .long 0x8e976a7d, 0xd8af8e46, 0x9ee62949 + .long 0xdcf5088a, 0x9dbdc100, 0x145575d5 + .long 0x1753ab84, 0xbbf2f6d6, 0x0c30f51d + .long 0x255b139e, 0x631bc508, 0xa55d1514 + .long 0xd784eaa8, 0xce26786c, 0xdb3935ea + .long 0x6d2c864a, 0x8068c345, 0x2586d334 + .long 0x02072e24, 0xdb3839f3, 0x21aa2b26 + .long 0x06689b0a, 0x5efd72f5, 0xe0575528 + .long 0x1e52f5ea, 0x4117915b, 0x356d209f + .long 0x1d3d1db6, 0x6ce68f2a, 0x9d842b80 + .long 0x3796455c, 0xb8e0e4a8, 0xc352f6de + .long 0xdf3a4eb3, 0xc55a2330, 0xb84ffa9c + .long 0x28ae0976, 0xb46f7cff, 0xd8110ff1 + .long 0x9764bc8d, 0xd7e7a22c, 0x712510f0 + .long 0x13a13e18, 0x3e9a43cd, 0xe95c7216 + .long 0xb8ee242e, 0x8e976a7d, 0x3f9e9356 + .long 0x0c540e7b, 0x753c81ff, 0x8e031a19 + .long 0x9924c781, 0xb9220208, 0x3edcde65 + .long 0x3954de39, 0x1753ab84, 0x1d6708a0 + .long 0xf32238b5, 0xbec81497, 0x9e70b943 + .long 0xbbd2cd2c, 0x0925d861, 0xf7003835 + .long 0xcc401304, 0xd784eaa8, 0xef82aa68 + .long 0x4987e684, 0x6044fbb0, 0x00eba0c8 + .long 0x3aa11427, 0x18fe3b4a, 0x87441142 + .long 0x297aad60, 0x02072e24, 0xd14bcc9b + .long 0xf60c5e51, 0x6ef6f487, 0x5b7fdd0a + .long 0x632d78c5, 0x3fc33de4, 0x9a1b53c8 + .long 0x25b8822a, 0x1e52f5ea, 0x99cce860 + .long 0xd4fc84bc, 0x1af62fb8, 0x81b6f443 + .long 0x5690aa32, 0xa91fdefb, 0x688a110e + .long 0x1357a093, 0x3796455c, 0xd8af8e46 + .long 0x798fdd33, 0xaaa18a37, 0x357b9517 + .long 0xc2815395, 0x54d42691, 0x9dbdc100 + .long 0x21cfc0f7, 0x28ae0976, 0xf1996890 + .long 0xa0decef3, 0x7b4aa8b7, 0xbbf2f6d6