From patchwork Wed Sep 28 17:27:32 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 4416 Return-Path: X-Original-To: patchwork@peony.canonical.com Delivered-To: patchwork@peony.canonical.com Received: from fiordland.canonical.com (fiordland.canonical.com [91.189.94.145]) by peony.canonical.com (Postfix) with ESMTP id 8DD1223EFA for ; Wed, 28 Sep 2011 17:27:41 +0000 (UTC) Received: from mail-fx0-f52.google.com (mail-fx0-f52.google.com [209.85.161.52]) by fiordland.canonical.com (Postfix) with ESMTP id 7B5DEA18708 for ; Wed, 28 Sep 2011 17:27:41 +0000 (UTC) Received: by mail-fx0-f52.google.com with SMTP id 23so1248729fxe.11 for ; Wed, 28 Sep 2011 10:27:41 -0700 (PDT) Received: by 10.223.63.8 with SMTP id z8mr2059665fah.84.1317230861238; Wed, 28 Sep 2011 10:27:41 -0700 (PDT) X-Forwarded-To: linaro-patchwork@canonical.com X-Forwarded-For: patch@linaro.org linaro-patchwork@canonical.com Delivered-To: patches@linaro.org Received: by 10.152.3.234 with SMTP id f10cs121748laf; Wed, 28 Sep 2011 10:27:40 -0700 (PDT) Received: by 10.68.31.199 with SMTP id c7mr7374344pbi.73.1317230858204; Wed, 28 Sep 2011 10:27:38 -0700 (PDT) Received: from mnementh.archaic.org.uk (mnementh.archaic.org.uk. [81.2.115.146]) by mx.google.com with ESMTPS id h3si2903670pbg.110.2011.09.28.10.27.36 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 28 Sep 2011 10:27:37 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of pm215@archaic.org.uk designates 81.2.115.146 as permitted sender) client-ip=81.2.115.146; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of pm215@archaic.org.uk designates 81.2.115.146 as permitted sender) smtp.mail=pm215@archaic.org.uk Received: from pm215 by mnementh.archaic.org.uk with local (Exim 4.72) (envelope-from ) id 1R8xuv-0006VJ-Eo; Wed, 28 Sep 2011 18:27:33 +0100 From: Peter Maydell To: qemu-devel@nongnu.org Cc: patches@linaro.org Subject: [PATCH 4/5] softfloat: Implement fused multiply-add Date: Wed, 28 Sep 2011 18:27:32 +0100 Message-Id: <1317230853-24970-5-git-send-email-peter.maydell@linaro.org> X-Mailer: git-send-email 1.7.2.5 In-Reply-To: <1317230853-24970-1-git-send-email-peter.maydell@linaro.org> References: <1317230853-24970-1-git-send-email-peter.maydell@linaro.org> Implement fused multiply-add as a softfloat primitive. This implements "a+b*c" as a single step without any intermediate rounding; it is specified in IEEE 754-2008 and implemented in a number of CPUs. Signed-off-by: Peter Maydell --- fpu/softfloat-specialize.h | 178 ++++++++++++++++++ fpu/softfloat.c | 428 ++++++++++++++++++++++++++++++++++++++++++++ fpu/softfloat.h | 14 ++ 3 files changed, 620 insertions(+), 0 deletions(-) diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h index c165205..c5e2dab 100644 --- a/fpu/softfloat-specialize.h +++ b/fpu/softfloat-specialize.h @@ -420,6 +420,82 @@ static int pickNaN(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN, #endif /*---------------------------------------------------------------------------- +| Select which NaN to propagate for a three-input operation. +| For the moment we assume that no CPU needs the 'larger significand' +| information. +| Return values : 0 : a; 1 : b; 2 : c; 3 : default-NaN +*----------------------------------------------------------------------------*/ +#if defined(TARGET_ARM) +static int pickNaNMulAdd(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN, + flag cIsQNaN, flag cIsSNaN, flag infzero STATUS_PARAM) +{ + /* For ARM, the (inf,zero,qnan) case sets InvalidOp and returns + * the default NaN + */ + if (infzero && cIsQNaN) { + float_raise(float_flag_invalid STATUS_VAR); + return 3; + } + + /* This looks different from the ARM ARM pseudocode, because the ARM ARM + * puts the operands to a fused mac operation (a*b)+c in the order c,a,b. + */ + if (cIsSNaN) { + return 2; + } else if (aIsSNaN) { + return 0; + } else if (bIsSNaN) { + return 1; + } else if (cIsQNaN) { + return 2; + } else if (aIsQNaN) { + return 0; + } else { + return 1; + } +} +#elif defined(TARGET_PPC) +static int pickNaNMulAdd(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN, + flag cIsQNaN, flag cIsSNaN, flag infzero STATUS_PARAM) +{ + /* For PPC, the (inf,zero,qnan) case sets InvalidOp, but we prefer + * to return an input NaN if we have one (ie c) rather than generating + * a default NaN + */ + if (infzero) { + float_raise(float_flag_invalid STATUS_VAR); + return 2; + } + + /* If fRA is a NaN return it; otherwise if fRB is a NaN return it; + * otherwise return fRC. Note that muladd on PPC is (fRA * fRC) + frB + */ + if (aIsSNaN || aIsQNaN) { + return 0; + } else if (cIsSNaN || cIsQNaN) { + return 2; + } else { + return 1; + } +} +#else +/* A default implementation: prefer a to b to c. + * This is unlikely to actually match any real implementation. + */ +static int pickNaNMulAdd(flag aIsQNaN, flag aIsSNaN, flag bIsQNaN, flag bIsSNaN, + flag cIsQNaN, flag cIsSNaN, flag infzero STATUS_PARAM) +{ + if (aIsSNaN || aIsQNaN) { + return 0; + } else if (bIsSNaN || bIsQNaN) { + return 1; + } else { + return 2; + } +} +#endif + +/*---------------------------------------------------------------------------- | Takes two single-precision floating-point values `a' and `b', one of which | is a NaN, and returns the appropriate NaN result. If either `a' or `b' is a | signaling NaN, the invalid exception is raised. @@ -460,6 +536,57 @@ static float32 propagateFloat32NaN( float32 a, float32 b STATUS_PARAM) } /*---------------------------------------------------------------------------- +| Takes three single-precision floating-point values `a', `b' and `c', one of +| which is a NaN, and returns the appropriate NaN result. If any of `a', +| `b' or `c' is a signaling NaN, the invalid exception is raised. +| The input infzero indicates whether a*b was 0*inf or inf*0 (in which case +| obviously c is a NaN, and whether to propagate c or some other NaN is +| implementation defined). +*----------------------------------------------------------------------------*/ + +static float32 propagateFloat32MulAddNaN(float32 a, float32 b, + float32 c, flag infzero STATUS_PARAM) +{ + flag aIsQuietNaN, aIsSignalingNaN, bIsQuietNaN, bIsSignalingNaN, + cIsQuietNaN, cIsSignalingNaN; + int which; + + aIsQuietNaN = float32_is_quiet_nan(a); + aIsSignalingNaN = float32_is_signaling_nan(a); + bIsQuietNaN = float32_is_quiet_nan(b); + bIsSignalingNaN = float32_is_signaling_nan(b); + cIsQuietNaN = float32_is_quiet_nan(c); + cIsSignalingNaN = float32_is_signaling_nan(c); + + if (aIsSignalingNaN | bIsSignalingNaN | cIsSignalingNaN) { + float_raise(float_flag_invalid STATUS_VAR); + } + + which = pickNaNMulAdd(aIsQuietNaN, aIsSignalingNaN, + bIsQuietNaN, bIsSignalingNaN, + cIsQuietNaN, cIsSignalingNaN, infzero STATUS_VAR); + + if (STATUS(default_nan_mode)) { + /* Note that this check is after pickNaNMulAdd so that function + * has an opportunity to set the Invalid flag. + */ + return float32_default_nan; + } + + switch (which) { + case 0: + return float32_maybe_silence_nan(a); + case 1: + return float32_maybe_silence_nan(b); + case 2: + return float32_maybe_silence_nan(c); + case 3: + default: + return float32_default_nan; + } +} + +/*---------------------------------------------------------------------------- | Returns 1 if the double-precision floating-point value `a' is a quiet | NaN; otherwise returns 0. *----------------------------------------------------------------------------*/ @@ -596,6 +723,57 @@ static float64 propagateFloat64NaN( float64 a, float64 b STATUS_PARAM) } /*---------------------------------------------------------------------------- +| Takes three double-precision floating-point values `a', `b' and `c', one of +| which is a NaN, and returns the appropriate NaN result. If any of `a', +| `b' or `c' is a signaling NaN, the invalid exception is raised. +| The input infzero indicates whether a*b was 0*inf or inf*0 (in which case +| obviously c is a NaN, and whether to propagate c or some other NaN is +| implementation defined). +*----------------------------------------------------------------------------*/ + +static float64 propagateFloat64MulAddNaN(float64 a, float64 b, + float64 c, flag infzero STATUS_PARAM) +{ + flag aIsQuietNaN, aIsSignalingNaN, bIsQuietNaN, bIsSignalingNaN, + cIsQuietNaN, cIsSignalingNaN; + int which; + + aIsQuietNaN = float64_is_quiet_nan(a); + aIsSignalingNaN = float64_is_signaling_nan(a); + bIsQuietNaN = float64_is_quiet_nan(b); + bIsSignalingNaN = float64_is_signaling_nan(b); + cIsQuietNaN = float64_is_quiet_nan(c); + cIsSignalingNaN = float64_is_signaling_nan(c); + + if (aIsSignalingNaN | bIsSignalingNaN | cIsSignalingNaN) { + float_raise(float_flag_invalid STATUS_VAR); + } + + which = pickNaNMulAdd(aIsQuietNaN, aIsSignalingNaN, + bIsQuietNaN, bIsSignalingNaN, + cIsQuietNaN, cIsSignalingNaN, infzero STATUS_VAR); + + if (STATUS(default_nan_mode)) { + /* Note that this check is after pickNaNMulAdd so that function + * has an opportunity to set the Invalid flag. + */ + return float64_default_nan; + } + + switch (which) { + case 0: + return float64_maybe_silence_nan(a); + case 1: + return float64_maybe_silence_nan(b); + case 2: + return float64_maybe_silence_nan(c); + case 3: + default: + return float64_default_nan; + } +} + +/*---------------------------------------------------------------------------- | Returns 1 if the extended double-precision floating-point value `a' is a | quiet NaN; otherwise returns 0. This slightly differs from the same | function for other types as floatx80 has an explicit bit. diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 3aafa81..42edbc7 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -2118,6 +2118,217 @@ float32 float32_rem( float32 a, float32 b STATUS_PARAM ) } /*---------------------------------------------------------------------------- +| Returns the result of multiplying the single-precision floating-point values +| `a' and `b' then adding 'c', with no intermediate rounding step after the +| multiplication. The operation is performed according to the IEC/IEEE +| Standard for Binary Floating-Point Arithmetic 754-2008. +| The flags argument allows the caller to select negation of the +| addend, the intermediate product, or the final result. (The difference +| between this and having the caller do a separate negation is that negating +| externally will flip the sign bit on NaNs.) +*----------------------------------------------------------------------------*/ + +float32 float32_muladd(float32 a, float32 b, float32 c, int flags STATUS_PARAM) +{ + flag aSign, bSign, cSign, zSign; + int aExp, bExp, cExp, pExp, zExp, expDiff; + uint32_t aSig, bSig, cSig; + flag pInf, pZero, pSign; + uint64_t pSig64, cSig64, zSig64; + uint32_t pSig; + int shiftcount; + flag signflip, infzero; + + a = float32_squash_input_denormal(a STATUS_VAR); + b = float32_squash_input_denormal(b STATUS_VAR); + c = float32_squash_input_denormal(c STATUS_VAR); + aSig = extractFloat32Frac(a); + aExp = extractFloat32Exp(a); + aSign = extractFloat32Sign(a); + bSig = extractFloat32Frac(b); + bExp = extractFloat32Exp(b); + bSign = extractFloat32Sign(b); + cSig = extractFloat32Frac(c); + cExp = extractFloat32Exp(c); + cSign = extractFloat32Sign(c); + + infzero = ((aExp == 0 && aSig == 0 && bExp == 0xff && bSig == 0) || + (aExp == 0xff && aSig == 0 && bExp == 0 && bSig == 0)); + + /* It is implementation-defined whether the cases of (0,inf,qnan) + * and (inf,0,qnan) raise InvalidOperation or not (and what QNaN + * they return if they do), so we have to hand this information + * off to the target-specific pick-a-NaN routine. + */ + if (((aExp == 0xff) && aSig) || + ((bExp == 0xff) && bSig) || + ((cExp == 0xff) && cSig)) { + return propagateFloat32MulAddNaN(a, b, c, infzero STATUS_VAR); + } + + if (infzero) { + float_raise(float_flag_invalid STATUS_VAR); + return float32_default_nan; + } + + if (flags & float_muladd_negate_c) { + cSign ^= 1; + } + + signflip = (flags & float_muladd_negate_result) ? 1 : 0; + + /* Work out the sign and type of the product */ + pSign = aSign ^ bSign; + if (flags & float_muladd_negate_product) { + pSign ^= 1; + } + pInf = (aExp == 0xff) || (bExp == 0xff); + pZero = ((aExp | aSig) == 0) || ((bExp | bSig) == 0); + + if (cExp == 0xff) { + if (pInf && (pSign ^ cSign)) { + /* addition of opposite-signed infinities => InvalidOperation */ + float_raise(float_flag_invalid STATUS_VAR); + return float32_default_nan; + } + /* Otherwise generate an infinity of the same sign */ + return packFloat32(cSign ^ signflip, 0xff, 0); + } + + if (pInf) { + return packFloat32(pSign ^ signflip, 0xff, 0); + } + + if (pZero) { + if (cExp == 0) { + if (cSig == 0) { + /* Adding two exact zeroes */ + if (pSign == cSign) { + zSign = pSign; + } else if (STATUS(float_rounding_mode) == float_round_down) { + zSign = 1; + } else { + zSign = 0; + } + return packFloat32(zSign ^ signflip, 0, 0); + } + /* Exact zero plus a denorm */ + if (STATUS(flush_to_zero)) { + float_raise(float_flag_output_denormal STATUS_VAR); + return packFloat32(cSign ^ signflip, 0, 0); + } + } + /* Zero plus something non-zero : just return the something */ + return c ^ (signflip << 31); + } + + if (aExp == 0) { + normalizeFloat32Subnormal(aSig, &aExp, &aSig); + } + if (bExp == 0) { + normalizeFloat32Subnormal(bSig, &bExp, &bSig); + } + + /* Calculate the actual result a * b + c */ + + /* Multiply first; this is easy. */ + /* NB: we subtract 0x7e where float32_mul() subtracts 0x7f + * because we want the true exponent, not the "one-less-than" + * flavour that roundAndPackFloat32() takes. + */ + pExp = aExp + bExp - 0x7e; + aSig = (aSig | 0x00800000) << 7; + bSig = (bSig | 0x00800000) << 8; + pSig64 = (uint64_t)aSig * bSig; + if ((int64_t)(pSig64 << 1) >= 0) { + pSig64 <<= 1; + pExp--; + } + + zSign = pSign ^ signflip; + + /* Now pSig64 is the significand of the multiply, with the explicit bit in + * position 62. + */ + if (cExp == 0) { + if (!cSig) { + /* Throw out the special case of c being an exact zero now */ + shift64RightJamming(pSig64, 32, &pSig64); + pSig = pSig64; + return roundAndPackFloat32(zSign, pExp - 1, + pSig STATUS_VAR); + } + normalizeFloat32Subnormal(cSig, &cExp, &cSig); + } + + cSig64 = (uint64_t)cSig << 39; + cSig64 |= LIT64(0x4000000000000000); + expDiff = pExp - cExp; + + if (pSign == cSign) { + /* Addition */ + if (expDiff > 0) { + /* scale c to match p */ + shift64RightJamming(cSig64, expDiff, &cSig64); + zExp = pExp; + } else if (expDiff < 0) { + /* scale p to match c */ + shift64RightJamming(pSig64, -expDiff, &pSig64); + zExp = cExp; + } else { + /* no scaling needed */ + zExp = cExp; + } + /* Add significands and make sure explicit bit ends up in posn 62 */ + zSig64 = pSig64 + cSig64; + if ((int64_t)zSig64 < 0) { + shift64RightJamming(zSig64, 1, &zSig64); + } else { + zExp--; + } + shift64RightJamming(zSig64, 32, &zSig64); + return roundAndPackFloat32(zSign, zExp, zSig64 STATUS_VAR); + } else { + /* Subtraction */ + if (expDiff > 0) { + shift64RightJamming(cSig64, expDiff, &cSig64); + zSig64 = pSig64 - cSig64; + zExp = pExp; + } else if (expDiff < 0) { + shift64RightJamming(pSig64, -expDiff, &pSig64); + zSig64 = cSig64 - pSig64; + zExp = cExp; + zSign ^= 1; + } else { + zExp = pExp; + if (cSig64 < pSig64) { + zSig64 = pSig64 - cSig64; + } else if (pSig64 < cSig64) { + zSig64 = cSig64 - pSig64; + zSign ^= 1; + } else { + /* Exact zero */ + zSign = signflip; + if (STATUS(float_rounding_mode) == float_round_down) { + zSign ^= 1; + } + return packFloat32(zSign, 0, 0); + } + } + --zExp; + /* Do the equivalent of normalizeRoundAndPackFloat32() but + * starting with the significand in a uint64_t. + */ + shiftcount = countLeadingZeros64(zSig64) - 1; + zSig64 <<= shiftcount; + zExp -= shiftcount; + shift64RightJamming(zSig64, 32, &zSig64); + return roundAndPackFloat32(zSign, zExp, zSig64 STATUS_VAR); + } +} + + +/*---------------------------------------------------------------------------- | Returns the square root of the single-precision floating-point value `a'. | The operation is performed according to the IEC/IEEE Standard for Binary | Floating-Point Arithmetic. @@ -3465,6 +3676,223 @@ float64 float64_rem( float64 a, float64 b STATUS_PARAM ) } /*---------------------------------------------------------------------------- +| Returns the result of multiplying the double-precision floating-point values +| `a' and `b' then adding 'c', with no intermediate rounding step after the +| multiplication. The operation is performed according to the IEC/IEEE +| Standard for Binary Floating-Point Arithmetic 754-2008. +| The flags argument allows the caller to select negation of the +| addend, the intermediate product, or the final result. (The difference +| between this and having the caller do a separate negation is that negating +| externally will flip the sign bit on NaNs.) +*----------------------------------------------------------------------------*/ + +float64 float64_muladd(float64 a, float64 b, float64 c, int flags STATUS_PARAM) +{ + flag aSign, bSign, cSign, zSign; + int aExp, bExp, cExp, pExp, zExp, expDiff; + uint64_t aSig, bSig, cSig; + flag pInf, pZero, pSign; + uint64_t pSig0, pSig1, cSig0, cSig1, zSig0, zSig1; + int shiftcount; + flag signflip, infzero; + + a = float64_squash_input_denormal(a STATUS_VAR); + b = float64_squash_input_denormal(b STATUS_VAR); + c = float64_squash_input_denormal(c STATUS_VAR); + aSig = extractFloat64Frac(a); + aExp = extractFloat64Exp(a); + aSign = extractFloat64Sign(a); + bSig = extractFloat64Frac(b); + bExp = extractFloat64Exp(b); + bSign = extractFloat64Sign(b); + cSig = extractFloat64Frac(c); + cExp = extractFloat64Exp(c); + cSign = extractFloat64Sign(c); + + infzero = ((aExp == 0 && aSig == 0 && bExp == 0x7ff && bSig == 0) || + (aExp == 0x7ff && aSig == 0 && bExp == 0 && bSig == 0)); + + /* It is implementation-defined whether the cases of (0,inf,qnan) + * and (inf,0,qnan) raise InvalidOperation or not (and what QNaN + * they return if they do), so we have to hand this information + * off to the target-specific pick-a-NaN routine. + */ + if (((aExp == 0x7ff) && aSig) || + ((bExp == 0x7ff) && bSig) || + ((cExp == 0x7ff) && cSig)) { + return propagateFloat64MulAddNaN(a, b, c, infzero STATUS_VAR); + } + + if (infzero) { + float_raise(float_flag_invalid STATUS_VAR); + return float64_default_nan; + } + + if (flags & float_muladd_negate_c) { + cSign ^= 1; + } + + signflip = (flags & float_muladd_negate_result) ? 1 : 0; + + /* Work out the sign and type of the product */ + pSign = aSign ^ bSign; + if (flags & float_muladd_negate_product) { + pSign ^= 1; + } + pInf = (aExp == 0x7ff) || (bExp == 0x7ff); + pZero = ((aExp | aSig) == 0) || ((bExp | bSig) == 0); + + if (cExp == 0x7ff) { + if (pInf && (pSign ^ cSign)) { + /* addition of opposite-signed infinities => InvalidOperation */ + float_raise(float_flag_invalid STATUS_VAR); + return float64_default_nan; + } + /* Otherwise generate an infinity of the same sign */ + return packFloat64(cSign ^ signflip, 0x7ff, 0); + } + + if (pInf) { + return packFloat64(pSign ^ signflip, 0x7ff, 0); + } + + if (pZero) { + if (cExp == 0) { + if (cSig == 0) { + /* Adding two exact zeroes */ + if (pSign == cSign) { + zSign = pSign; + } else if (STATUS(float_rounding_mode) == float_round_down) { + zSign = 1; + } else { + zSign = 0; + } + return packFloat64(zSign ^ signflip, 0, 0); + } + /* Exact zero plus a denorm */ + if (STATUS(flush_to_zero)) { + float_raise(float_flag_output_denormal STATUS_VAR); + return packFloat64(cSign ^ signflip, 0, 0); + } + } + /* Zero plus something non-zero : just return the something */ + return c ^ ((uint64_t)signflip << 63); + } + + if (aExp == 0) { + normalizeFloat64Subnormal(aSig, &aExp, &aSig); + } + if (bExp == 0) { + normalizeFloat64Subnormal(bSig, &bExp, &bSig); + } + + /* Calculate the actual result a * b + c */ + + /* Multiply first; this is easy. */ + /* NB: we subtract 0x3fe where float64_mul() subtracts 0x3ff + * because we want the true exponent, not the "one-less-than" + * flavour that roundAndPackFloat64() takes. + */ + pExp = aExp + bExp - 0x3fe; + aSig = (aSig | LIT64(0x0010000000000000))<<10; + bSig = (bSig | LIT64(0x0010000000000000))<<11; + mul64To128(aSig, bSig, &pSig0, &pSig1); + if ((int64_t)(pSig0 << 1) >= 0) { + shortShift128Left(pSig0, pSig1, 1, &pSig0, &pSig1); + pExp--; + } + + zSign = pSign ^ signflip; + + /* Now [pSig0:pSig1] is the significand of the multiply, with the explicit + * bit in position 126. + */ + if (cExp == 0) { + if (!cSig) { + /* Throw out the special case of c being an exact zero now */ + shift128RightJamming(pSig0, pSig1, 64, &pSig0, &pSig1); + return roundAndPackFloat64(zSign, pExp - 1, + pSig1 STATUS_VAR); + } + normalizeFloat64Subnormal(cSig, &cExp, &cSig); + } + + cSig0 = cSig << 10; + cSig1 = 0; + cSig0 |= LIT64(0x4000000000000000); + expDiff = pExp - cExp; + + if (pSign == cSign) { + /* Addition */ + if (expDiff > 0) { + /* scale c to match p */ + shift128RightJamming(cSig0, cSig1, expDiff, &cSig0, &cSig1); + zExp = pExp; + } else if (expDiff < 0) { + /* scale p to match c */ + shift128RightJamming(pSig0, pSig1, -expDiff, &pSig0, &pSig1); + zExp = cExp; + } else { + /* no scaling needed */ + zExp = cExp; + } + /* Add significands and make sure explicit bit ends up in posn 126 */ + add128(pSig0, pSig1, cSig0, cSig1, &zSig0, &zSig1); + if ((int64_t)zSig0 < 0) { + shift128RightJamming(zSig0, zSig1, 1, &zSig0, &zSig1); + } else { + zExp--; + } + shift128RightJamming(zSig0, zSig1, 64, &zSig0, &zSig1); + return roundAndPackFloat64(zSign, zExp, zSig1 STATUS_VAR); + } else { + /* Subtraction */ + if (expDiff > 0) { + shift128RightJamming(cSig0, cSig1, expDiff, &cSig0, &cSig1); + sub128(pSig0, pSig1, cSig0, cSig1, &zSig0, &zSig1); + zExp = pExp; + } else if (expDiff < 0) { + shift128RightJamming(pSig0, pSig1, -expDiff, &pSig0, &pSig1); + sub128(cSig0, cSig1, pSig0, pSig1, &zSig0, &zSig1); + zExp = cExp; + zSign ^= 1; + } else { + zExp = pExp; + if (lt128(cSig0, cSig1, pSig0, pSig1)) { + sub128(pSig0, pSig1, cSig0, cSig1, &zSig0, &zSig1); + } else if (lt128(pSig0, pSig1, cSig0, cSig1)) { + sub128(cSig0, cSig1, pSig0, pSig1, &zSig0, &zSig1); + zSign ^= 1; + } else { + /* Exact zero */ + zSign = signflip; + if (STATUS(float_rounding_mode) == float_round_down) { + zSign ^= 1; + } + return packFloat64(zSign, 0, 0); + } + } + --zExp; + /* Do the equivalent of normalizeRoundAndPackFloat64() but + * starting with the significand in a pair of uint64_t. + */ + if (zSig0) { + shiftcount = countLeadingZeros64(zSig0) - 1; + shortShift128Left(zSig0, zSig1, shiftcount, &zSig0, &zSig1); + if (zSig1) { + zSig0 |= 1; + } + zExp -= shiftcount; + } else { + shiftcount = countLeadingZeros64(zSig1) - 1; + zSig0 = zSig1 << shiftcount; + zExp -= (shiftcount + 64); + } + return roundAndPackFloat64(zSign, zExp, zSig0 STATUS_VAR); + } +} + +/*---------------------------------------------------------------------------- | Returns the square root of the double-precision floating-point value `a'. | The operation is performed according to the IEC/IEEE Standard for Binary | Floating-Point Arithmetic. diff --git a/fpu/softfloat.h b/fpu/softfloat.h index 618ddee..07c2929 100644 --- a/fpu/softfloat.h +++ b/fpu/softfloat.h @@ -212,6 +212,18 @@ void set_floatx80_rounding_precision(int val STATUS_PARAM); void float_raise( int8 flags STATUS_PARAM); /*---------------------------------------------------------------------------- +| Options to indicate which negations to perform in float*_muladd() +| Using these differs from negating an input or output before calling +| the muladd function in that this means that a NaN doesn't have its +| sign bit inverted before it is propagated. +*----------------------------------------------------------------------------*/ +enum { + float_muladd_negate_c = 1, + float_muladd_negate_product = 2, + float_muladd_negate_result = 3, +}; + +/*---------------------------------------------------------------------------- | Software IEC/IEEE integer-to-floating-point conversion routines. *----------------------------------------------------------------------------*/ float32 int32_to_float32( int32 STATUS_PARAM ); @@ -269,6 +281,7 @@ float32 float32_sub( float32, float32 STATUS_PARAM ); float32 float32_mul( float32, float32 STATUS_PARAM ); float32 float32_div( float32, float32 STATUS_PARAM ); float32 float32_rem( float32, float32 STATUS_PARAM ); +float32 float32_muladd(float32, float32, float32, int STATUS_PARAM); float32 float32_sqrt( float32 STATUS_PARAM ); float32 float32_exp2( float32 STATUS_PARAM ); float32 float32_log2( float32 STATUS_PARAM ); @@ -375,6 +388,7 @@ float64 float64_sub( float64, float64 STATUS_PARAM ); float64 float64_mul( float64, float64 STATUS_PARAM ); float64 float64_div( float64, float64 STATUS_PARAM ); float64 float64_rem( float64, float64 STATUS_PARAM ); +float64 float64_muladd(float64, float64, float64, int STATUS_PARAM); float64 float64_sqrt( float64 STATUS_PARAM ); float64 float64_log2( float64 STATUS_PARAM ); int float64_eq( float64, float64 STATUS_PARAM );