[PULL,67/72] target/hexagon: Expand GEN_XF_ROUND

Message ID	20241224200521.310066-68-richard.henderson@linaro.org
State	Accepted
Commit	795d6a2c4960325c514323147e13a22d5fe21ddf

Headers

From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: Brian Cain <brian.cain@oss.qualcomm.com>
Subject: [PULL 67/72] target/hexagon: Expand GEN_XF_ROUND
Date: Tue, 24 Dec 2024 12:05:16 -0800
Message-ID: <20241224200521.310066-68-richard.henderson@linaro.org>
In-Reply-To: <20241224200521.310066-1-richard.henderson@linaro.org>
References: <20241224200521.310066-1-richard.henderson@linaro.org>

Commit Message

Richard Henderson Dec. 24, 2024, 8:05 p.m. UTC

This massive macro is now only used once.
Expand it for use only by float64.

Reviewed-by: Brian Cain <brian.cain@oss.qualcomm.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/hexagon/fma_emu.c | 255 +++++++++++++++++++--------------------
 1 file changed, 127 insertions(+), 128 deletions(-)

diff --git a/target/hexagon/fma_emu.c b/target/hexagon/fma_emu.c
index 0c7c7f636c..0769de43de 100644
--- a/target/hexagon/fma_emu.c
+++ b/target/hexagon/fma_emu.c
@@ -354,136 +354,135 @@  float32 infinite_float32(uint8_t sign)
 }
 
 /* Return a maximum finite value with the requested sign */
-#define GEN_XF_ROUND(SUFFIX, MANTBITS, INF_EXP, INTERNAL_TYPE) \
-static SUFFIX accum_round_##SUFFIX(Accum a, float_status * fp_status) \
-{ \
-    if ((int128_gethi(a.mant) == 0) && (int128_getlo(a.mant) == 0) \
-        && ((a.guard | a.round | a.sticky) == 0)) { \
-        /* result zero */ \
-        switch (fp_status->float_rounding_mode) { \
-        case float_round_down: \
-            return zero_##SUFFIX(1); \
-        default: \
-            return zero_##SUFFIX(0); \
-        } \
-    } \
-    /* Normalize right */ \
-    /* We want MANTBITS bits of mantissa plus the leading one. */ \
-    /* That means that we want MANTBITS+1 bits, or 0x000000000000FF_FFFF */ \
-    /* So we need to normalize right while the high word is non-zero and \
-    * while the low word is nonzero when masked with 0xffe0_0000_0000_0000 */ \
-    while ((int128_gethi(a.mant) != 0) || \
-           ((int128_getlo(a.mant) >> (MANTBITS + 1)) != 0)) { \
-        a = accum_norm_right(a, 1); \
-    } \
-    /* \
-     * OK, now normalize left \
-     * We want to normalize left until we have a leading one in bit 24 \
-     * Theoretically, we only need to shift a maximum of one to the left if we \
-     * shifted out lots of bits from B, or if we had no shift / 1 shift sticky \
-     * should be 0  \
-     */ \
-    while ((int128_getlo(a.mant) & (1ULL << MANTBITS)) == 0) { \
-        a = accum_norm_left(a); \
-    } \
-    /* \
-     * OK, now we might need to denormalize because of potential underflow. \
-     * We need to do this before rounding, and rounding might make us normal \
-     * again \
-     */ \
-    while (a.exp <= 0) { \
-        a = accum_norm_right(a, 1 - a.exp); \
-        /* \
-         * Do we have underflow? \
-         * That's when we get an inexact answer because we ran out of bits \
-         * in a denormal. \
-         */ \
-        if (a.guard || a.round || a.sticky) { \
-            float_raise(float_flag_underflow, fp_status); \
-        } \
-    } \
-    /* OK, we're relatively canonical... now we need to round */ \
-    if (a.guard || a.round || a.sticky) { \
-        float_raise(float_flag_inexact, fp_status); \
-        switch (fp_status->float_rounding_mode) { \
-        case float_round_to_zero: \
-            /* Chop and we're done */ \
-            break; \
-        case float_round_up: \
-            if (a.sign == 0) { \
-                a.mant = int128_add(a.mant, int128_one()); \
-            } \
-            break; \
-        case float_round_down: \
-            if (a.sign != 0) { \
-                a.mant = int128_add(a.mant, int128_one()); \
-            } \
-            break; \
-        default: \
-            if (a.round || a.sticky) { \
-                /* round up if guard is 1, down if guard is zero */ \
-                a.mant = int128_add(a.mant, int128_make64(a.guard)); \
-            } else if (a.guard) { \
-                /* exactly .5, round up if odd */ \
-                a.mant = int128_add(a.mant, int128_and(a.mant, int128_one())); \
-            } \
-            break; \
-        } \
-    } \
-    /* \
-     * OK, now we might have carried all the way up. \
-     * So we might need to shr once \
-     * at least we know that the lsb should be zero if we rounded and \
-     * got a carry out... \
-     */ \
-    if ((int128_getlo(a.mant) >> (MANTBITS + 1)) != 0) { \
-        a = accum_norm_right(a, 1); \
-    } \
-    /* Overflow? */ \
-    if (a.exp >= INF_EXP) { \
-        /* Yep, inf result */ \
-        float_raise(float_flag_overflow, fp_status); \
-        float_raise(float_flag_inexact, fp_status); \
-        switch (fp_status->float_rounding_mode) { \
-        case float_round_to_zero: \
-            return maxfinite_##SUFFIX(a.sign); \
-        case float_round_up: \
-            if (a.sign == 0) { \
-                return infinite_##SUFFIX(a.sign); \
-            } else { \
-                return maxfinite_##SUFFIX(a.sign); \
-            } \
-        case float_round_down: \
-            if (a.sign != 0) { \
-                return infinite_##SUFFIX(a.sign); \
-            } else { \
-                return maxfinite_##SUFFIX(a.sign); \
-            } \
-        default: \
-            return infinite_##SUFFIX(a.sign); \
-        } \
-    } \
-    /* Underflow? */ \
-    if (int128_getlo(a.mant) & (1ULL << MANTBITS)) { \
-        /* Leading one means: No, we're normal. So, we should be done... */ \
-        INTERNAL_TYPE ret; \
-        ret.i = 0; \
-        ret.sign = a.sign; \
-        ret.exp = a.exp; \
-        ret.mant = int128_getlo(a.mant); \
-        return ret.i; \
-    } \
-    assert(a.exp == 1); \
-    INTERNAL_TYPE ret; \
-    ret.i = 0; \
-    ret.sign = a.sign; \
-    ret.exp = 0; \
-    ret.mant = int128_getlo(a.mant); \
-    return ret.i; \
+static float64 accum_round_float64(Accum a, float_status *fp_status)
+{
+    if ((int128_gethi(a.mant) == 0) && (int128_getlo(a.mant) == 0)
+        && ((a.guard | a.round | a.sticky) == 0)) {
+        /* result zero */
+        switch (fp_status->float_rounding_mode) {
+        case float_round_down:
+            return zero_float64(1);
+        default:
+            return zero_float64(0);
+        }
+    }
+    /*
+     * Normalize right
+     * We want DF_MANTBITS bits of mantissa plus the leading one.
+     * That means that we want DF_MANTBITS+1 bits, or 0x000000000000FF_FFFF
+     * So we need to normalize right while the high word is non-zero and
+     * while the low word is nonzero when masked with 0xffe0_0000_0000_0000
+     */
+    while ((int128_gethi(a.mant) != 0) ||
+           ((int128_getlo(a.mant) >> (DF_MANTBITS + 1)) != 0)) {
+        a = accum_norm_right(a, 1);
+    }
+    /*
+     * OK, now normalize left
+     * We want to normalize left until we have a leading one in bit 24
+     * Theoretically, we only need to shift a maximum of one to the left if we
+     * shifted out lots of bits from B, or if we had no shift / 1 shift sticky
+     * should be 0
+     */
+    while ((int128_getlo(a.mant) & (1ULL << DF_MANTBITS)) == 0) {
+        a = accum_norm_left(a);
+    }
+    /*
+     * OK, now we might need to denormalize because of potential underflow.
+     * We need to do this before rounding, and rounding might make us normal
+     * again
+     */
+    while (a.exp <= 0) {
+        a = accum_norm_right(a, 1 - a.exp);
+        /*
+         * Do we have underflow?
+         * That's when we get an inexact answer because we ran out of bits
+         * in a denormal.
+         */
+        if (a.guard || a.round || a.sticky) {
+            float_raise(float_flag_underflow, fp_status);
+        }
+    }
+    /* OK, we're relatively canonical... now we need to round */
+    if (a.guard || a.round || a.sticky) {
+        float_raise(float_flag_inexact, fp_status);
+        switch (fp_status->float_rounding_mode) {
+        case float_round_to_zero:
+            /* Chop and we're done */
+            break;
+        case float_round_up:
+            if (a.sign == 0) {
+                a.mant = int128_add(a.mant, int128_one());
+            }
+            break;
+        case float_round_down:
+            if (a.sign != 0) {
+                a.mant = int128_add(a.mant, int128_one());
+            }
+            break;
+        default:
+            if (a.round || a.sticky) {
+                /* round up if guard is 1, down if guard is zero */
+                a.mant = int128_add(a.mant, int128_make64(a.guard));
+            } else if (a.guard) {
+                /* exactly .5, round up if odd */
+                a.mant = int128_add(a.mant, int128_and(a.mant, int128_one()));
+            }
+            break;
+        }
+    }
+    /*
+     * OK, now we might have carried all the way up.
+     * So we might need to shr once
+     * at least we know that the lsb should be zero if we rounded and
+     * got a carry out...
+     */
+    if ((int128_getlo(a.mant) >> (DF_MANTBITS + 1)) != 0) {
+        a = accum_norm_right(a, 1);
+    }
+    /* Overflow? */
+    if (a.exp >= DF_INF_EXP) {
+        /* Yep, inf result */
+        float_raise(float_flag_overflow, fp_status);
+        float_raise(float_flag_inexact, fp_status);
+        switch (fp_status->float_rounding_mode) {
+        case float_round_to_zero:
+            return maxfinite_float64(a.sign);
+        case float_round_up:
+            if (a.sign == 0) {
+                return infinite_float64(a.sign);
+            } else {
+                return maxfinite_float64(a.sign);
+            }
+        case float_round_down:
+            if (a.sign != 0) {
+                return infinite_float64(a.sign);
+            } else {
+                return maxfinite_float64(a.sign);
+            }
+        default:
+            return infinite_float64(a.sign);
+        }
+    }
+    /* Underflow? */
+    if (int128_getlo(a.mant) & (1ULL << DF_MANTBITS)) {
+        /* Leading one means: No, we're normal. So, we should be done... */
+        Double ret;
+        ret.i = 0;
+        ret.sign = a.sign;
+        ret.exp = a.exp;
+        ret.mant = int128_getlo(a.mant);
+        return ret.i;
+    }
+    assert(a.exp == 1);
+    Double ret;
+    ret.i = 0;
+    ret.sign = a.sign;
+    ret.exp = 0;
+    ret.mant = int128_getlo(a.mant);
+    return ret.i;
 }
 
-GEN_XF_ROUND(float64, DF_MANTBITS, DF_INF_EXP, Double)
-
 float64 internal_mpyhh(float64 a, float64 b,
                       unsigned long long int accumulated,
                       float_status *fp_status)
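
As a reading aid for the rounding switch in accum_round_float64, here is a minimal standalone sketch of the same guard/round/sticky decision. It is not the QEMU code: MiniAccum, MiniRound and mini_round are hypothetical names, the mantissa is a plain uint64_t rather than an Int128, and the float_status flags are reduced to a single "inexact" output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's Accum; only the fields used here. */
typedef struct {
    uint64_t mant;   /* mantissa bits that are kept */
    bool sign;       /* true for a negative value */
    bool guard;      /* first bit shifted out below the mantissa */
    bool round;      /* second bit shifted out */
    bool sticky;     /* OR of every bit shifted out after that */
} MiniAccum;

/* Hypothetical rounding modes mirroring the four cases in the patch. */
typedef enum { RND_NEAREST_EVEN, RND_TO_ZERO, RND_UP, RND_DOWN } MiniRound;

/* Round a.mant by one step; *inexact reports whether any bits were lost. */
static uint64_t mini_round(MiniAccum a, MiniRound mode, bool *inexact)
{
    *inexact = a.guard || a.round || a.sticky;
    if (!*inexact) {
        return a.mant;              /* exact: nothing to round */
    }
    switch (mode) {
    case RND_TO_ZERO:
        break;                      /* chop */
    case RND_UP:
        if (!a.sign) {
            a.mant += 1;            /* toward +inf grows positive values */
        }
        break;
    case RND_DOWN:
        if (a.sign) {
            a.mant += 1;            /* toward -inf grows negative magnitudes */
        }
        break;
    default:
        if (a.round || a.sticky) {
            a.mant += a.guard;      /* not a tie: guard alone decides */
        } else if (a.guard) {
            a.mant += a.mant & 1;   /* exact tie: round up only if odd */
        }
        break;
    }
    return a.mant;
}

/* Small usage example: a tie on an odd mantissa rounds up to even. */
static void mini_round_selfcheck(void)
{
    bool inexact;
    MiniAccum tie_odd = { .mant = 5, .guard = true };
    assert(mini_round(tie_odd, RND_NEAREST_EVEN, &inexact) == 6);
    assert(inexact);
}
```

In the default (nearest-even) case a tie adds the mantissa's own low bit, so an odd mantissa moves up to the next even value and an even one is left alone. A carry out of this addition is what the later "shr once" step in the patch cleans up.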

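The tail of accum_round_float64 (overflow selection and final packing) can be sketched the same way. This is an illustration under stated assumptions, not the patch's code: it assumes DF_MANTBITS is 52 and DF_INF_EXP is 0x7ff, and that Double overlays sign/exp/mant bit-fields on a uint64_t in the usual IEEE 754 binary64 layout. SkRound, sk_pack and sk_overflow are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SK_MANTBITS 52     /* assumed value of DF_MANTBITS */
#define SK_INF_EXP  0x7ff  /* assumed value of DF_INF_EXP */

/* Hypothetical rounding modes mirroring the four cases in the patch. */
typedef enum { SK_NEAREST_EVEN, SK_TO_ZERO, SK_UP, SK_DOWN } SkRound;

/*
 * Pack sign, biased exponent and mantissa into binary64 bits.
 * The mask drops bit 52, so a normal value's leading one becomes implicit,
 * which is what assigning to a 52-bit mant bit-field would do.
 */
static uint64_t sk_pack(unsigned sign, unsigned exp, uint64_t mant)
{
    return ((uint64_t)(sign & 1) << 63)
         | ((uint64_t)(exp & 0x7ff) << SK_MANTBITS)
         | (mant & ((1ULL << SK_MANTBITS) - 1));
}

/* Result on overflow: infinity or the largest finite value, by mode. */
static uint64_t sk_overflow(unsigned sign, SkRound mode)
{
    uint64_t inf = sk_pack(sign, SK_INF_EXP, 0);
    uint64_t maxfinite = sk_pack(sign, SK_INF_EXP - 1, ~0ULL);

    switch (mode) {
    case SK_TO_ZERO:
        return maxfinite;                 /* never rounds away from zero */
    case SK_UP:
        return sign ? maxfinite : inf;    /* only +overflow reaches +inf */
    case SK_DOWN:
        return sign ? inf : maxfinite;    /* only -overflow reaches -inf */
    default:
        return inf;                       /* nearest-even goes to infinity */
    }
}

/* Small usage example: 1.0 is exponent 1023 with only the leading one set. */
static void sk_selfcheck(void)
{
    assert(sk_pack(0, 1023, 1ULL << SK_MANTBITS) == 0x3FF0000000000000ULL);
    assert(sk_overflow(0, SK_NEAREST_EVEN) == 0x7FF0000000000000ULL);
}
```

The denormal path in the patch corresponds to calling the packer with an exponent of 0 and a mantissa whose bit 52 is clear, which is why the function asserts a.exp == 1 before storing exp = 0.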