From patchwork Sat Feb 1 16:39:40 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 861284
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH v2 37/69] target/arm: Handle FPCR.AH in negation steps in SVE FCADD
Date: Sat, 1 Feb 2025 16:39:40 +0000
Message-Id: <20250201164012.1660228-38-peter.maydell@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250201164012.1660228-1-peter.maydell@linaro.org>
References: <20250201164012.1660228-1-peter.maydell@linaro.org>

The negation steps in FCADD must honour FPCR.AH's "don't change the
sign of a NaN" semantics.  Implement this in the same way we did for
the base ASIMD FCADD, by encoding FPCR.AH into the SIMD data field
passed to the helper and using that to decide whether to negate the
values.

The construction of neg_imag and neg_real was done to make it easy
to apply both in parallel with two simple logical operations.  This
changed with FPCR.AH, which is more complex than that.
Switch to an approach that follows the pseudocode more closely, by
extracting the 'rot=1' parameter from the SIMD data field and changing
the sign of the appropriate input value.

Note that there was a naming issue with neg_imag and neg_real.
They were named backward, with neg_imag being non-zero for rot=1,
and vice versa.  This was combined with reversed usage within the
loop, so that the negation in the end turned out correct.

Signed-off-by: Peter Maydell
---
v2: Squashed in changes from RTH's patchset
---
 target/arm/tcg/vec_internal.h  | 17 ++++++++++++++
 target/arm/tcg/sve_helper.c    | 42 ++++++++++++++++++++++++----------
 target/arm/tcg/translate-sve.c |  2 +-
 3 files changed, 48 insertions(+), 13 deletions(-)

diff --git a/target/arm/tcg/vec_internal.h b/target/arm/tcg/vec_internal.h
index 094f5c169ca..826791523a6 100644
--- a/target/arm/tcg/vec_internal.h
+++ b/target/arm/tcg/vec_internal.h
@@ -20,6 +20,8 @@
 #ifndef TARGET_ARM_VEC_INTERNAL_H
 #define TARGET_ARM_VEC_INTERNAL_H
 
+#include "fpu/softfloat.h"
+
 /*
  * Note that vector data is stored in host-endian 64-bit chunks,
  * so addressing units smaller than that needs a host-endian fixup.
@@ -265,4 +267,19 @@ float32 bfdotadd_ebf(float32 sum, uint32_t e1, uint32_t e2,
  */
 bool is_ebf(CPUARMState *env, float_status *statusp, float_status *oddstatusp);
 
+static inline float16 float16_maybe_ah_chs(float16 a, bool fpcr_ah)
+{
+    return fpcr_ah && float16_is_any_nan(a) ? a : float16_chs(a);
+}
+
+static inline float32 float32_maybe_ah_chs(float32 a, bool fpcr_ah)
+{
+    return fpcr_ah && float32_is_any_nan(a) ? a : float32_chs(a);
+}
+
+static inline float64 float64_maybe_ah_chs(float64 a, bool fpcr_ah)
+{
+    return fpcr_ah && float64_is_any_nan(a) ? a : float64_chs(a);
+}
+
 #endif /* TARGET_ARM_VEC_INTERNAL_H */
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index 8527a7495a6..770945a2c6a 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -5131,8 +5131,8 @@ void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg,
 {
     intptr_t j, i = simd_oprsz(desc);
     uint64_t *g = vg;
-    float16 neg_imag = float16_set_sign(0, simd_data(desc));
-    float16 neg_real = float16_chs(neg_imag);
+    bool rot = extract32(desc, SIMD_DATA_SHIFT, 1);
+    bool fpcr_ah = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 
     do {
         uint64_t pg = g[(i - 1) >> 6];
@@ -5144,9 +5144,15 @@ void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg,
             i -= 2 * sizeof(float16);
 
             e0 = *(float16 *)(vn + H1_2(i));
-            e1 = *(float16 *)(vm + H1_2(j)) ^ neg_real;
+            e1 = *(float16 *)(vm + H1_2(j));
             e2 = *(float16 *)(vn + H1_2(j));
-            e3 = *(float16 *)(vm + H1_2(i)) ^ neg_imag;
+            e3 = *(float16 *)(vm + H1_2(i));
+
+            if (rot) {
+                e3 = float16_maybe_ah_chs(e3, fpcr_ah);
+            } else {
+                e1 = float16_maybe_ah_chs(e1, fpcr_ah);
+            }
 
             if (likely((pg >> (i & 63)) & 1)) {
                 *(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, s);
@@ -5163,8 +5169,8 @@ void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg,
 {
     intptr_t j, i = simd_oprsz(desc);
     uint64_t *g = vg;
-    float32 neg_imag = float32_set_sign(0, simd_data(desc));
-    float32 neg_real = float32_chs(neg_imag);
+    bool rot = extract32(desc, SIMD_DATA_SHIFT, 1);
+    bool fpcr_ah = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 
     do {
         uint64_t pg = g[(i - 1) >> 6];
@@ -5176,9 +5182,15 @@ void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg,
             i -= 2 * sizeof(float32);
 
             e0 = *(float32 *)(vn + H1_2(i));
-            e1 = *(float32 *)(vm + H1_2(j)) ^ neg_real;
+            e1 = *(float32 *)(vm + H1_2(j));
             e2 = *(float32 *)(vn + H1_2(j));
-            e3 = *(float32 *)(vm + H1_2(i)) ^ neg_imag;
+            e3 = *(float32 *)(vm + H1_2(i));
+
+            if (rot) {
+                e3 = float32_maybe_ah_chs(e3, fpcr_ah);
+            } else {
+                e1 = float32_maybe_ah_chs(e1, fpcr_ah);
+            }
 
             if (likely((pg >> (i & 63)) & 1)) {
                 *(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, s);
@@ -5195,8 +5207,8 @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
 {
     intptr_t j, i = simd_oprsz(desc);
     uint64_t *g = vg;
-    float64 neg_imag = float64_set_sign(0, simd_data(desc));
-    float64 neg_real = float64_chs(neg_imag);
+    bool rot = extract32(desc, SIMD_DATA_SHIFT, 1);
+    bool fpcr_ah = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
 
     do {
         uint64_t pg = g[(i - 1) >> 6];
@@ -5208,9 +5220,15 @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg,
             i -= 2 * sizeof(float64);
 
             e0 = *(float64 *)(vn + H1_2(i));
-            e1 = *(float64 *)(vm + H1_2(j)) ^ neg_real;
+            e1 = *(float64 *)(vm + H1_2(j));
             e2 = *(float64 *)(vn + H1_2(j));
-            e3 = *(float64 *)(vm + H1_2(i)) ^ neg_imag;
+            e3 = *(float64 *)(vm + H1_2(i));
+
+            if (rot) {
+                e3 = float64_maybe_ah_chs(e3, fpcr_ah);
+            } else {
+                e1 = float64_maybe_ah_chs(e1, fpcr_ah);
+            }
 
             if (likely((pg >> (i & 63)) & 1)) {
                 *(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, s);
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 0d8bd1a49c4..7816b5801af 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -3916,7 +3916,7 @@ static gen_helper_gvec_4_ptr * const fcadd_fns[] = {
     gen_helper_sve_fcadd_s, gen_helper_sve_fcadd_d,
 };
 TRANS_FEAT(FCADD, aa64_sve, gen_gvec_fpst_zzzp, fcadd_fns[a->esz],
-           a->rd, a->rn, a->rm, a->pg, a->rot,
+           a->rd, a->rn, a->rm, a->pg, a->rot | (s->fpcr_ah << 1),
            a->esz == MO_16 ? FPST_A64_F16 : FPST_A64)
 
 #define DO_FMLA(NAME, name) \
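
For readers who want to try the sign handling outside QEMU, the following
is a minimal standalone sketch of the behaviour this patch describes. It is
not the QEMU code: it works on raw IEEE 754 binary32 bit patterns instead of
the softfloat types, and every name in it (f32_bits, f32_maybe_ah_chs,
pack_data, fcadd_negate, fcadd_check_examples) is hypothetical. The only
things taken from the patch are the layout of the data field (bit 0 is rot,
bit 1 is FPCR.AH) and the rule that rot selects e3 for negation and
otherwise e1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a raw binary32 bit pattern stands in for QEMU's float32. */
typedef uint32_t f32_bits;

/* A NaN has an all-ones exponent and a non-zero fraction. */
static inline bool f32_is_any_nan(f32_bits a)
{
    return (a & 0x7fffffffu) > 0x7f800000u;
}

/* Unconditional sign flip. */
static inline f32_bits f32_chs(f32_bits a)
{
    return a ^ 0x80000000u;
}

/* Flip the sign, except that with FPCR.AH set a NaN is left untouched. */
static inline f32_bits f32_maybe_ah_chs(f32_bits a, bool fpcr_ah)
{
    return fpcr_ah && f32_is_any_nan(a) ? a : f32_chs(a);
}

/* Hypothetical packing that mirrors a->rot | (s->fpcr_ah << 1). */
static inline uint32_t pack_data(bool rot, bool fpcr_ah)
{
    return (uint32_t)rot | ((uint32_t)fpcr_ah << 1);
}

/* Unpack the two flags and negate the input that rot selects. */
static inline void fcadd_negate(f32_bits *e1, f32_bits *e3, uint32_t data)
{
    bool rot = data & 1;
    bool fpcr_ah = (data >> 1) & 1;

    if (rot) {
        *e3 = f32_maybe_ah_chs(*e3, fpcr_ah);
    } else {
        *e1 = f32_maybe_ah_chs(*e1, fpcr_ah);
    }
}

/* Walk through the three interesting combinations of rot and FPCR.AH. */
static inline void fcadd_check_examples(void)
{
    f32_bits e1 = 0x3f800000u;  /* +1.0 */
    f32_bits e3 = 0x7fc00000u;  /* quiet NaN, sign bit clear */

    /* rot=1, AH=1: e3 is selected, but it is a NaN, so nothing changes. */
    fcadd_negate(&e1, &e3, pack_data(true, true));
    assert(e1 == 0x3f800000u && e3 == 0x7fc00000u);

    /* rot=0, AH=1: e1 is selected and is an ordinary number, so it flips. */
    fcadd_negate(&e1, &e3, pack_data(false, true));
    assert(e1 == 0xbf800000u && e3 == 0x7fc00000u);

    /* rot=1, AH=0: the NaN's sign bit is flipped like any other value. */
    fcadd_negate(&e1, &e3, pack_data(true, false));
    assert(e1 == 0xbf800000u && e3 == 0xffc00000u);
}
```

Calling fcadd_check_examples() from a main() exercises the three cases.
Because the decision depends on the value being negated, a single
precomputed XOR mask per input can no longer express it, which is
presumably why the patch moves to an explicit per-element branch.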