[46/72] softfloat: Move round_to_int_and_pack to softfloat-parts.c.inc

Message ID	20210508014802.892561-47-richard.henderson@linaro.org
State	Superseded
From	Richard Henderson <richard.henderson@linaro.org>
To	qemu-devel@nongnu.org
Cc	alex.bennee@linaro.org, david@redhat.com
Date	Fri, 7 May 2021 18:47:36 -0700
In-Reply-To	<20210508014802.892561-1-richard.henderson@linaro.org>
Series	Convert floatx80 and float128 to FloatParts

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index ce96ea753c..ac8e726935 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -828,6 +828,16 @@ static void parts128_round_to_int(FloatParts128 *a, FloatRoundMode r,
 #define parts_round_to_int(A, R, C, S, F) \
     PARTS_GENERIC_64_128(round_to_int, A)(A, R, C, S, F)
 
+static int64_t parts64_float_to_sint(FloatParts64 *p, FloatRoundMode rmode,
+                                     int scale, int64_t min, int64_t max,
+                                     float_status *s);
+static int64_t parts128_float_to_sint(FloatParts128 *p, FloatRoundMode rmode,
+                                     int scale, int64_t min, int64_t max,
+                                     float_status *s);
+
+#define parts_float_to_sint(P, R, Z, MN, MX, S) \
+    PARTS_GENERIC_64_128(float_to_sint, P)(P, R, Z, MN, MX, S)
+
 /*
  * Helper functions for softfloat-parts.c.inc, per-size operations.
  */
@@ -2351,69 +2361,8 @@ float128 float128_round_to_int(float128 a, float_status *s)
 }
 
 /*
- * Returns the result of converting the floating-point value `a' to
- * the two's complement integer format. The conversion is performed
- * according to the IEC/IEEE Standard for Binary Floating-Point
- * Arithmetic---which means in particular that the conversion is
- * rounded according to the current rounding mode. If `a' is a NaN,
- * the largest positive integer is returned. Otherwise, if the
- * conversion overflows, the largest integer with the same sign as `a'
- * is returned.
-*/
-
-static int64_t round_to_int_and_pack(FloatParts64 p, FloatRoundMode rmode,
-                                     int scale, int64_t min, int64_t max,
-                                     float_status *s)
-{
-    int flags = 0;
-    uint64_t r;
-
-    switch (p.cls) {
-    case float_class_snan:
-    case float_class_qnan:
-        flags = float_flag_invalid;
-        r = max;
-        break;
-
-    case float_class_inf:
-        flags = float_flag_invalid;
-        r = p.sign ? min : max;
-        break;
-
-    case float_class_zero:
-        return 0;
-
-    case float_class_normal:
-        /* TODO: 62 = N - 2, frac_size for rounding */
-        if (parts_round_to_int_normal(&p, rmode, scale, 62)) {
-            flags = float_flag_inexact;
-        }
-
-        if (p.exp <= DECOMPOSED_BINARY_POINT) {
-            r = p.frac >> (DECOMPOSED_BINARY_POINT - p.exp);
-        } else {
-            r = UINT64_MAX;
-        }
-        if (p.sign) {
-            if (r <= -(uint64_t)min) {
-                r = -r;
-            } else {
-                flags = float_flag_invalid;
-                r = min;
-            }
-        } else if (r > max) {
-            flags = float_flag_invalid;
-            r = max;
-        }
-        break;
-
-    default:
-        g_assert_not_reached();
-    }
-
-    float_raise(flags, s);
-    return r;
-}
+ * Floating-point to signed integer conversions
+ */
 
 int8_t float16_to_int8_scalbn(float16 a, FloatRoundMode rmode, int scale,
                               float_status *s)
@@ -2421,7 +2370,7 @@ int8_t float16_to_int8_scalbn(float16 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float16_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT8_MIN, INT8_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT8_MIN, INT8_MAX, s);
 }
 
 int16_t float16_to_int16_scalbn(float16 a, FloatRoundMode rmode, int scale,
@@ -2430,7 +2379,7 @@ int16_t float16_to_int16_scalbn(float16 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float16_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT16_MIN, INT16_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT16_MIN, INT16_MAX, s);
 }
 
 int32_t float16_to_int32_scalbn(float16 a, FloatRoundMode rmode, int scale,
@@ -2439,7 +2388,7 @@ int32_t float16_to_int32_scalbn(float16 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float16_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT32_MIN, INT32_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT32_MIN, INT32_MAX, s);
 }
 
 int64_t float16_to_int64_scalbn(float16 a, FloatRoundMode rmode, int scale,
@@ -2448,7 +2397,7 @@ int64_t float16_to_int64_scalbn(float16 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float16_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT64_MIN, INT64_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT64_MIN, INT64_MAX, s);
 }
 
 int16_t float32_to_int16_scalbn(float32 a, FloatRoundMode rmode, int scale,
@@ -2457,7 +2406,7 @@ int16_t float32_to_int16_scalbn(float32 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float32_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT16_MIN, INT16_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT16_MIN, INT16_MAX, s);
 }
 
 int32_t float32_to_int32_scalbn(float32 a, FloatRoundMode rmode, int scale,
@@ -2466,7 +2415,7 @@ int32_t float32_to_int32_scalbn(float32 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float32_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT32_MIN, INT32_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT32_MIN, INT32_MAX, s);
 }
 
 int64_t float32_to_int64_scalbn(float32 a, FloatRoundMode rmode, int scale,
@@ -2475,7 +2424,7 @@ int64_t float32_to_int64_scalbn(float32 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float32_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT64_MIN, INT64_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT64_MIN, INT64_MAX, s);
 }
 
 int16_t float64_to_int16_scalbn(float64 a, FloatRoundMode rmode, int scale,
@@ -2484,7 +2433,7 @@ int16_t float64_to_int16_scalbn(float64 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float64_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT16_MIN, INT16_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT16_MIN, INT16_MAX, s);
 }
 
 int32_t float64_to_int32_scalbn(float64 a, FloatRoundMode rmode, int scale,
@@ -2493,7 +2442,7 @@ int32_t float64_to_int32_scalbn(float64 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float64_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT32_MIN, INT32_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT32_MIN, INT32_MAX, s);
 }
 
 int64_t float64_to_int64_scalbn(float64 a, FloatRoundMode rmode, int scale,
@@ -2502,7 +2451,52 @@ int64_t float64_to_int64_scalbn(float64 a, FloatRoundMode rmode, int scale,
     FloatParts64 p;
 
     float64_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT64_MIN, INT64_MAX, s);
+    return parts_float_to_sint(&p, rmode, scale, INT64_MIN, INT64_MAX, s);
+}
+
+int16_t bfloat16_to_int16_scalbn(bfloat16 a, FloatRoundMode rmode, int scale,
+                                 float_status *s)
+{
+    FloatParts64 p;
+
+    bfloat16_unpack_canonical(&p, a, s);
+    return parts_float_to_sint(&p, rmode, scale, INT16_MIN, INT16_MAX, s);
+}
+
+int32_t bfloat16_to_int32_scalbn(bfloat16 a, FloatRoundMode rmode, int scale,
+                                 float_status *s)
+{
+    FloatParts64 p;
+
+    bfloat16_unpack_canonical(&p, a, s);
+    return parts_float_to_sint(&p, rmode, scale, INT32_MIN, INT32_MAX, s);
+}
+
+int64_t bfloat16_to_int64_scalbn(bfloat16 a, FloatRoundMode rmode, int scale,
+                                 float_status *s)
+{
+    FloatParts64 p;
+
+    bfloat16_unpack_canonical(&p, a, s);
+    return parts_float_to_sint(&p, rmode, scale, INT64_MIN, INT64_MAX, s);
+}
+
+static int32_t float128_to_int32_scalbn(float128 a, FloatRoundMode rmode,
+                                        int scale, float_status *s)
+{
+    FloatParts128 p;
+
+    float128_unpack_canonical(&p, a, s);
+    return parts_float_to_sint(&p, rmode, scale, INT32_MIN, INT32_MAX, s);
+}
+
+static int64_t float128_to_int64_scalbn(float128 a, FloatRoundMode rmode,
+                                        int scale, float_status *s)
+{
+    FloatParts128 p;
+
+    float128_unpack_canonical(&p, a, s);
+    return parts_float_to_sint(&p, rmode, scale, INT64_MIN, INT64_MAX, s);
 }
 
 int8_t float16_to_int8(float16 a, float_status *s)
@@ -2555,6 +2549,16 @@ int64_t float64_to_int64(float64 a, float_status *s)
     return float64_to_int64_scalbn(a, s->float_rounding_mode, 0, s);
 }
 
+int32_t float128_to_int32(float128 a, float_status *s)
+{
+    return float128_to_int32_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
+int64_t float128_to_int64(float128 a, float_status *s)
+{
+    return float128_to_int64_scalbn(a, s->float_rounding_mode, 0, s);
+}
+
 int16_t float16_to_int16_round_to_zero(float16 a, float_status *s)
 {
     return float16_to_int16_scalbn(a, float_round_to_zero, 0, s);
@@ -2600,36 +2604,14 @@ int64_t float64_to_int64_round_to_zero(float64 a, float_status *s)
     return float64_to_int64_scalbn(a, float_round_to_zero, 0, s);
 }
 
-/*
- * Returns the result of converting the floating-point value `a' to
- * the two's complement integer format.
- */
-
-int16_t bfloat16_to_int16_scalbn(bfloat16 a, FloatRoundMode rmode, int scale,
-                                 float_status *s)
+int32_t float128_to_int32_round_to_zero(float128 a, float_status *s)
 {
-    FloatParts64 p;
-
-    bfloat16_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT16_MIN, INT16_MAX, s);
+    return float128_to_int32_scalbn(a, float_round_to_zero, 0, s);
 }
 
-int32_t bfloat16_to_int32_scalbn(bfloat16 a, FloatRoundMode rmode, int scale,
-                                 float_status *s)
+int64_t float128_to_int64_round_to_zero(float128 a, float_status *s)
 {
-    FloatParts64 p;
-
-    bfloat16_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT32_MIN, INT32_MAX, s);
-}
-
-int64_t bfloat16_to_int64_scalbn(bfloat16 a, FloatRoundMode rmode, int scale,
-                                 float_status *s)
-{
-    FloatParts64 p;
-
-    bfloat16_unpack_canonical(&p, a, s);
-    return round_to_int_and_pack(p, rmode, scale, INT64_MIN, INT64_MAX, s);
+    return float128_to_int64_scalbn(a, float_round_to_zero, 0, s);
 }
 
 int16_t bfloat16_to_int16(bfloat16 a, float_status *s)
@@ -6553,191 +6535,6 @@ floatx80 floatx80_sqrt(floatx80 a, float_status *status)
                                 0, zExp, zSig0, zSig1, status);
 }
 
-/*----------------------------------------------------------------------------
-| Returns the result of converting the quadruple-precision floating-point
-| value `a' to the 32-bit two's complement integer format. The conversion
-| is performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic---which means in particular that the conversion is rounded
-| according to the current rounding mode. If `a' is a NaN, the largest
-| positive integer is returned. Otherwise, if the conversion overflows, the
-| largest integer with the same sign as `a' is returned.
-*----------------------------------------------------------------------------*/
-
-int32_t float128_to_int32(float128 a, float_status *status)
-{
-    bool aSign;
-    int32_t aExp, shiftCount;
-    uint64_t aSig0, aSig1;
-
-    aSig1 = extractFloat128Frac1( a );
-    aSig0 = extractFloat128Frac0( a );
-    aExp = extractFloat128Exp( a );
-    aSign = extractFloat128Sign( a );
-    if ( ( aExp == 0x7FFF ) && ( aSig0 | aSig1 ) ) aSign = 0;
-    if ( aExp ) aSig0 |= UINT64_C(0x0001000000000000);
-    aSig0 |= ( aSig1 != 0 );
-    shiftCount = 0x4028 - aExp;
-    if ( 0 < shiftCount ) shift64RightJamming( aSig0, shiftCount, &aSig0 );
-    return roundAndPackInt32(aSign, aSig0, status);
-
-}
-
-/*----------------------------------------------------------------------------
-| Returns the result of converting the quadruple-precision floating-point
-| value `a' to the 32-bit two's complement integer format. The conversion
-| is performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic, except that the conversion is always rounded toward zero. If
-| `a' is a NaN, the largest positive integer is returned. Otherwise, if the
-| conversion overflows, the largest integer with the same sign as `a' is
-| returned.
-*----------------------------------------------------------------------------*/
-
-int32_t float128_to_int32_round_to_zero(float128 a, float_status *status)
-{
-    bool aSign;
-    int32_t aExp, shiftCount;
-    uint64_t aSig0, aSig1, savedASig;
-    int32_t z;
-
-    aSig1 = extractFloat128Frac1( a );
-    aSig0 = extractFloat128Frac0( a );
-    aExp = extractFloat128Exp( a );
-    aSign = extractFloat128Sign( a );
-    aSig0 |= ( aSig1 != 0 );
-    if ( 0x401E < aExp ) {
-        if ( ( aExp == 0x7FFF ) && aSig0 ) aSign = 0;
-        goto invalid;
-    }
-    else if ( aExp < 0x3FFF ) {
-        if (aExp || aSig0) {
-            float_raise(float_flag_inexact, status);
-        }
-        return 0;
-    }
-    aSig0 |= UINT64_C(0x0001000000000000);
-    shiftCount = 0x402F - aExp;
-    savedASig = aSig0;
-    aSig0 >>= shiftCount;
-    z = aSig0;
-    if ( aSign ) z = - z;
-    if ( ( z < 0 ) ^ aSign ) {
- invalid:
-        float_raise(float_flag_invalid, status);
-        return aSign ? INT32_MIN : INT32_MAX;
-    }
-    if ( ( aSig0<<shiftCount ) != savedASig ) {
-        float_raise(float_flag_inexact, status);
-    }
-    return z;
-
-}
-
-/*----------------------------------------------------------------------------
-| Returns the result of converting the quadruple-precision floating-point
-| value `a' to the 64-bit two's complement integer format. The conversion
-| is performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic---which means in particular that the conversion is rounded
-| according to the current rounding mode. If `a' is a NaN, the largest
-| positive integer is returned. Otherwise, if the conversion overflows, the
-| largest integer with the same sign as `a' is returned.
-*----------------------------------------------------------------------------*/
-
-int64_t float128_to_int64(float128 a, float_status *status)
-{
-    bool aSign;
-    int32_t aExp, shiftCount;
-    uint64_t aSig0, aSig1;
-
-    aSig1 = extractFloat128Frac1( a );
-    aSig0 = extractFloat128Frac0( a );
-    aExp = extractFloat128Exp( a );
-    aSign = extractFloat128Sign( a );
-    if ( aExp ) aSig0 |= UINT64_C(0x0001000000000000);
-    shiftCount = 0x402F - aExp;
-    if ( shiftCount <= 0 ) {
-        if ( 0x403E < aExp ) {
-            float_raise(float_flag_invalid, status);
-            if ( ! aSign
-                 || ( ( aExp == 0x7FFF )
-                      && ( aSig1 || ( aSig0 != UINT64_C(0x0001000000000000) ) )
-                    )
-               ) {
-                return INT64_MAX;
-            }
-            return INT64_MIN;
-        }
-        shortShift128Left( aSig0, aSig1, - shiftCount, &aSig0, &aSig1 );
-    }
-    else {
-        shift64ExtraRightJamming( aSig0, aSig1, shiftCount, &aSig0, &aSig1 );
-    }
-    return roundAndPackInt64(aSign, aSig0, aSig1, status);
-
-}
-
-/*----------------------------------------------------------------------------
-| Returns the result of converting the quadruple-precision floating-point
-| value `a' to the 64-bit two's complement integer format. The conversion
-| is performed according to the IEC/IEEE Standard for Binary Floating-Point
-| Arithmetic, except that the conversion is always rounded toward zero.
-| If `a' is a NaN, the largest positive integer is returned. Otherwise, if
-| the conversion overflows, the largest integer with the same sign as `a' is
-| returned.
-*----------------------------------------------------------------------------*/
-
-int64_t float128_to_int64_round_to_zero(float128 a, float_status *status)
-{
-    bool aSign;
-    int32_t aExp, shiftCount;
-    uint64_t aSig0, aSig1;
-    int64_t z;
-
-    aSig1 = extractFloat128Frac1( a );
-    aSig0 = extractFloat128Frac0( a );
-    aExp = extractFloat128Exp( a );
-    aSign = extractFloat128Sign( a );
-    if ( aExp ) aSig0 |= UINT64_C(0x0001000000000000);
-    shiftCount = aExp - 0x402F;
-    if ( 0 < shiftCount ) {
-        if ( 0x403E <= aExp ) {
-            aSig0 &= UINT64_C(0x0000FFFFFFFFFFFF);
-            if ( ( a.high == UINT64_C(0xC03E000000000000) )
-                 && ( aSig1 < UINT64_C(0x0002000000000000) ) ) {
-                if (aSig1) {
-                    float_raise(float_flag_inexact, status);
-                }
-            }
-            else {
-                float_raise(float_flag_invalid, status);
-                if ( ! aSign || ( ( aExp == 0x7FFF ) && ( aSig0 | aSig1 ) ) ) {
-                    return INT64_MAX;
-                }
-            }
-            return INT64_MIN;
-        }
-        z = ( aSig0<<shiftCount ) | ( aSig1>>( ( - shiftCount ) & 63 ) );
-        if ( (uint64_t) ( aSig1<<shiftCount ) ) {
-            float_raise(float_flag_inexact, status);
-        }
-    }
-    else {
-        if ( aExp < 0x3FFF ) {
-            if ( aExp | aSig0 | aSig1 ) {
-                float_raise(float_flag_inexact, status);
-            }
-            return 0;
-        }
-        z = aSig0>>( - shiftCount );
-        if ( aSig1
-             || ( shiftCount && (uint64_t) ( aSig0<<( shiftCount & 63 ) ) ) ) {
-            float_raise(float_flag_inexact, status);
-        }
-    }
-    if ( aSign ) z = - z;
-    return z;
-
-}
-
 /*----------------------------------------------------------------------------
 | Returns the result of converting the quadruple-precision floating-point value
 | `a' to the 64-bit unsigned integer format. The conversion is
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index b2c4624d8c..a897a5a743 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -751,3 +751,67 @@ static void partsN(round_to_int)(FloatPartsN *a, FloatRoundMode rmode,
         g_assert_not_reached();
     }
 }
+
+/*
+ * Returns the result of converting the floating-point value `a' to
+ * the two's complement integer format. The conversion is performed
+ * according to the IEC/IEEE Standard for Binary Floating-Point
+ * Arithmetic---which means in particular that the conversion is
+ * rounded according to the current rounding mode. If `a' is a NaN,
+ * the largest positive integer is returned. Otherwise, if the
+ * conversion overflows, the largest integer with the same sign as `a'
+ * is returned.
+*/
+static int64_t partsN(float_to_sint)(FloatPartsN *p, FloatRoundMode rmode,
+                                     int scale, int64_t min, int64_t max,
+                                     float_status *s)
+{
+    int flags = 0;
+    uint64_t r;
+
+    switch (p->cls) {
+    case float_class_snan:
+    case float_class_qnan:
+        flags = float_flag_invalid;
+        r = max;
+        break;
+
+    case float_class_inf:
+        flags = float_flag_invalid;
+        r = p->sign ? min : max;
+        break;
+
+    case float_class_zero:
+        return 0;
+
+    case float_class_normal:
+        /* TODO: N - 2 is frac_size for rounding; could use input fmt. */
+        if (parts_round_to_int_normal(p, rmode, scale, N - 2)) {
+            flags = float_flag_inexact;
+        }
+
+        if (p->exp <= DECOMPOSED_BINARY_POINT) {
+            r = p->frac_hi >> (DECOMPOSED_BINARY_POINT - p->exp);
+        } else {
+            r = UINT64_MAX;
+        }
+        if (p->sign) {
+            if (r <= -(uint64_t)min) {
+                r = -r;
+            } else {
+                flags = float_flag_invalid;
+                r = min;
+            }
+        } else if (r > max) {
+            flags = float_flag_invalid;
+            r = max;
+        }
+        break;
+
+    default:
+        g_assert_not_reached();
+    }
+
+    float_raise(flags, s);
+    return r;
+}
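
The range check at the end of the float_class_normal case can be read in isolation. The following is a minimal standalone sketch of that step only, not the patch's code: sketch_saturate and the SKETCH_FLAG_* constants are hypothetical names introduced here, and the magnitude is assumed to have been rounded to an integer already.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values standing in for softfloat's float_flag_*. */
enum { SKETCH_FLAG_INVALID = 1, SKETCH_FLAG_INEXACT = 2 };

/*
 * Clamp a sign/magnitude pair into [min, max], mirroring the sign
 * handling in the patch. On overflow *flags is overwritten with
 * SKETCH_FLAG_INVALID, so an earlier inexact indication is dropped.
 */
static int64_t sketch_saturate(bool neg, uint64_t mag,
                               int64_t min, int64_t max, int *flags)
{
    if (neg) {
        /* -(uint64_t)min is |min| computed in unsigned arithmetic,
         * which stays well defined even for min == INT64_MIN. */
        if (mag <= -(uint64_t)min) {
            /* Conversion back to int64_t assumes two's complement,
             * as the original code does when returning r. */
            return (int64_t)-mag;
        }
        *flags = SKETCH_FLAG_INVALID;
        return min;
    }
    if (mag > (uint64_t)max) {
        *flags = SKETCH_FLAG_INVALID;
        return max;
    }
    return (int64_t)mag;
}
```

Passing min and max as parameters is what lets a single 64-bit routine serve the int8, int16, int32 and int64 wrappers: only the bounds differ between callers.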

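
The parts_float_to_sint macro selects parts64_float_to_sint or parts128_float_to_sint from the pointer type of its first argument. The definition of PARTS_GENERIC_64_128 is not part of this diff, so the sketch below only illustrates the idea with standard C11 _Generic, which may differ from QEMU's actual macro. Every Sketch*/sketch* name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-ins for FloatParts64 and FloatParts128. */
typedef struct { uint64_t frac; } SketchParts64;
typedef struct { uint64_t frac_hi, frac_lo; } SketchParts128;

/* Per-size implementations; here they just report their width. */
static int sketch64_width(SketchParts64 *p)   { (void)p; return 64; }
static int sketch128_width(SketchParts128 *p) { (void)p; return 128; }

/* Pick the per-size function from the static type of P (assumed
 * C11 _Generic; not necessarily how QEMU defines its macro). */
#define SKETCH_GENERIC_64_128(NAME, P) \
    _Generic((P), SketchParts64 *: sketch64_##NAME, \
                  SketchParts128 *: sketch128_##NAME)

/* Same shape as parts_float_to_sint: select, then call. */
#define sketch_width(P) SKETCH_GENERIC_64_128(width, P)(P)
```

With this shape a call such as sketch_width(&p) resolves at compile time, so the wrappers can share one spelling for both sizes without any runtime dispatch.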