[60/76] target/arm: Handle FPCR.AH in FMLSL

Message ID	20250124162836.2332150-61-peter.maydell@linaro.org
State	Superseded
Series	target/arm: Implement FEAT_AFP and FEAT_RPRES

Headers

Received-SPF: pass (google.com: domain of
 qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as
 permitted sender) client-ip=209.51.188.17;
From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-arm@nongnu.org,
	qemu-devel@nongnu.org
Subject: [PATCH 60/76] target/arm: Handle FPCR.AH in FMLSL
Date: Fri, 24 Jan 2025 16:28:20 +0000
Message-Id: <20250124162836.2332150-61-peter.maydell@linaro.org>
In-Reply-To: <20250124162836.2332150-1-peter.maydell@linaro.org>
References: <20250124162836.2332150-1-peter.maydell@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=2a00:1450:4864:20::335;
 envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x335.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001,
 T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org

Commit Message

Peter Maydell Jan. 24, 2025, 4:28 p.m. UTC

Honour the FPCR.AH "don't negate the sign of a NaN" semantics in
FMLSL. We pass in the value of FPCR.AH in the SIMD data field, and
use this to determine whether we should suppress the negation for
NaN inputs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/tcg/translate-a64.c |  4 ++--
 target/arm/tcg/vec_helper.c    | 28 ++++++++++++++++++++++++----
 2 files changed, 26 insertions(+), 6 deletions(-)
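
The NaN-preserving negation described above can be sketched outside QEMU roughly as follows. This is a minimal standalone illustration, not the patch's code: `f16_is_nan` and `neg4_f16_sketch` are hypothetical names, and the NaN test is written directly against the IEEE half-precision bit layout rather than calling QEMU's softfloat helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a softfloat NaN check: an IEEE half-precision
 * value is a NaN when its exponent field is all ones and its fraction
 * field is nonzero.
 */
static bool f16_is_nan(uint16_t h)
{
    return (h & 0x7c00) == 0x7c00 && (h & 0x03ff) != 0;
}

/*
 * Flip the sign bit of four half-precision values packed in a uint64_t.
 * When ah is true (modelling FPCR.AH == 1), lanes holding a NaN keep
 * their sign bit unchanged.
 */
static uint64_t neg4_f16_sketch(uint64_t v, bool ah)
{
    uint64_t mask = 0x8000800080008000ull;

    if (ah) {
        for (int i = 0; i < 4; i++) {
            if (f16_is_nan((uint16_t)(v >> (16 * i)))) {
                /* Drop this lane's sign bit from the XOR mask. */
                mask &= ~(0x8000ull << (16 * i));
            }
        }
    }
    return v ^ mask;
}
```

For an input whose lanes are 1.0, a quiet NaN, -1.0 and +infinity (0x7c00bc007e003c00), the sketch negates every lane when `ah` is false, and leaves only the NaN lane untouched when `ah` is true.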

Comments

Richard Henderson Jan. 26, 2025, 1:13 p.m. UTC | #1

On 1/24/25 08:28, Peter Maydell wrote:
> Honour the FPCR.AH "don't negate the sign of a NaN" semantics in
> FMLSL. We pass in the value of FPCR.AH in the SIMD data field, and
> use this to determine whether we should suppress the negation for
> NaN inputs.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>   target/arm/tcg/translate-a64.c |  4 ++--
>   target/arm/tcg/vec_helper.c    | 28 ++++++++++++++++++++++++----
>   2 files changed, 26 insertions(+), 6 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 0827dff16b2..e22c2a148ab 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5968,7 +5968,7 @@  TRANS(FMINNMP_v, do_fp3_vector, a, 0, f_vector_fminnmp)
 static bool do_fmlal(DisasContext *s, arg_qrrr_e *a, bool is_s, bool is_2)
 {
     if (fp_access_check(s)) {
-        int data = (is_2 << 1) | is_s;
+        int data = (s->fpcr_ah << 2) | (is_2 << 1) | is_s;
         tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
                            vec_full_reg_offset(s, a->rn),
                            vec_full_reg_offset(s, a->rm), tcg_env,
@@ -6738,7 +6738,7 @@  TRANS(FMLS_vi, do_fmla_vector_idx, a, true)
 static bool do_fmlal_idx(DisasContext *s, arg_qrrx_e *a, bool is_s, bool is_2)
 {
     if (fp_access_check(s)) {
-        int data = (a->idx << 2) | (is_2 << 1) | is_s;
+        int data = (s->fpcr_ah << 5) | (a->idx << 2) | (is_2 << 1) | is_s;
         tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
                            vec_full_reg_offset(s, a->rn),
                            vec_full_reg_offset(s, a->rm), tcg_env,
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 382b5da4a9c..aa42c50f9fe 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2083,6 +2083,26 @@  static uint64_t load4_f16(uint64_t *ptr, int is_q, int is_2)
     return ptr[is_q & is_2] >> ((is_2 & ~is_q) << 5);
 }
 
+static uint64_t neg4_f16(uint64_t v, bool fpcr_ah)
+{
+    /*
+     * Negate all inputs for FMLSL at once. This is slightly complicated
+     * by the need to avoid flipping the sign of a NaN when FPCR.AH == 1
+     */
+    uint64_t mask = 0x8000800080008000ull;
+    if (fpcr_ah) {
+        uint64_t tmp = v, signbit = 0x8000;
+        for (int i = 0; i < 4; i++) {
+            if (float16_is_any_nan(extract64(tmp, 0, 16))) {
+                mask ^= signbit;
+            }
+            tmp >>= 16;
+            signbit <<= 16;
+        }
+    }
+    return v ^ mask;
+}
+
 /*
  * Note that FMLAL requires oprsz == 8 or oprsz == 16,
  * as there is not yet SVE versions that might use blocking.
@@ -2094,6 +2114,7 @@  static void do_fmlal(float32 *d, void *vn, void *vm, float_status *fpst,
     intptr_t i, oprsz = simd_oprsz(desc);
     int is_s = extract32(desc, SIMD_DATA_SHIFT, 1);
     int is_2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
+    bool fpcr_ah = extract32(desc, SIMD_DATA_SHIFT + 2, 1);
     int is_q = oprsz == 16;
     uint64_t n_4, m_4;
 
@@ -2101,9 +2122,8 @@  static void do_fmlal(float32 *d, void *vn, void *vm, float_status *fpst,
     n_4 = load4_f16(vn, is_q, is_2);
     m_4 = load4_f16(vm, is_q, is_2);
 
-    /* Negate all inputs for FMLSL at once.  */
     if (is_s) {
-        n_4 ^= 0x8000800080008000ull;
+        n_4 = neg4_f16(n_4, fpcr_ah);
     }
 
     for (i = 0; i < oprsz / 4; i++) {
@@ -2155,6 +2175,7 @@  static void do_fmlal_idx(float32 *d, void *vn, void *vm, float_status *fpst,
     int is_s = extract32(desc, SIMD_DATA_SHIFT, 1);
     int is_2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
     int index = extract32(desc, SIMD_DATA_SHIFT + 2, 3);
+    bool fpcr_ah = extract32(desc, SIMD_DATA_SHIFT + 5, 1);
     int is_q = oprsz == 16;
     uint64_t n_4;
     float32 m_1;
@@ -2162,9 +2183,8 @@  static void do_fmlal_idx(float32 *d, void *vn, void *vm, float_status *fpst,
     /* Pre-load all of the f16 data, avoiding overlap issues.  */
     n_4 = load4_f16(vn, is_q, is_2);
 
-    /* Negate all inputs for FMLSL at once.  */
     if (is_s) {
-        n_4 ^= 0x8000800080008000ull;
+        n_4 = neg4_f16(n_4, fpcr_ah);
     }
 
     m_1 = float16_to_float32_by_bits(((float16 *)vm)[H2(index)], fz16);
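
The bit layout of the SIMD data field used by the two helpers can be illustrated with a small standalone sketch. The function names here (`bits`, `pack_fmlal`, `pack_fmlal_idx`) are hypothetical and exist only for illustration; in the patch the value is built inline in the translator and read back with `extract32()` at an offset of `SIMD_DATA_SHIFT`, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract `length` bits of `value` starting at `start`. */
static uint32_t bits(uint32_t value, int start, int length)
{
    return (value >> start) & ((1u << length) - 1);
}

/* Non-indexed form: bit 0 = is_s, bit 1 = is_2, bit 2 = FPCR.AH. */
static uint32_t pack_fmlal(bool ah, bool is_2, bool is_s)
{
    return ((uint32_t)ah << 2) | ((uint32_t)is_2 << 1) | (uint32_t)is_s;
}

/*
 * Indexed form: bits [4:2] already hold the element index,
 * so FPCR.AH goes in bit 5 instead.
 */
static uint32_t pack_fmlal_idx(bool ah, unsigned idx, bool is_2, bool is_s)
{
    return ((uint32_t)ah << 5) | ((idx & 7u) << 2) |
           ((uint32_t)is_2 << 1) | (uint32_t)is_s;
}
```

Because the indexed variant needs three bits for the index, the two forms cannot share one position for the AH flag, which is why the helpers read it from bit 2 and bit 5 respectively.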
