From patchwork Fri Jun 5 18:59:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joseph Myers X-Patchwork-Id: 281241 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95C47C433DF for ; Fri, 5 Jun 2020 19:02:05 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6ACF32077D for ; Fri, 5 Jun 2020 19:02:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6ACF32077D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46540 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jhHbQ-0000xi-F6 for qemu-devel@archiver.kernel.org; Fri, 05 Jun 2020 15:02:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58188) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jhHZG-00078r-1j for qemu-devel@nongnu.org; Fri, 05 Jun 2020 14:59:50 -0400 Received: from esa4.mentor.iphmx.com ([68.232.137.252]:33932) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jhHZE-0006dV-4I for qemu-devel@nongnu.org; Fri, 05 Jun 2020 14:59:49 -0400 IronPort-SDR: PNCzEkExifI0E5ZgJJk48ARcyoDe38aMDKvp0yI0GhzOK2iB862q7a58/ZSzGkz1uuYPzItN4j 7VtWehj5cY5cZlkj6LTEetfsFVZt0rIgKrDt4db2Tv86NUIdi8yaLywzf2eqbgI8VyLTDmO2B4 5KeU/+H+KI6btnjVQtEojfEUkbD5Zy/nD0XIQT5JOkyV/4G1eMaoNkN30B8iF/0Hv+63zX5+po zcCeCbXkF/v/ELs5nk6NCpyVOtT7zuxCG+IkuFIFPYuIAOF3hkuP4tOXvm6n9FHp8VJ+Y0LfVJ 1Nw= X-IronPort-AV: E=Sophos;i="5.73,477,1583222400"; d="scan'208";a="49633014" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa4.mentor.iphmx.com with ESMTP; 05 Jun 2020 10:59:45 -0800 IronPort-SDR: 0CKygn2+BMp+qFux50QRvAri70GpQKg/2zWcYNRoOeRy7DMP6kjMXwrkiZUWXPHDs5enix2j1I g2IuRyfmuvJWRKF/ToJeQ/qFTr7XuZ3PRnnwBF5sTlBNtTRVdx5GBPMIo3U3PONCN5AbwCeH1j VebSrLivS76sWu8I8bx2OTgUmjNr5x61NWBo0fF889vrCEHJabEOhuz2CR5irIdJrmJ+PfwzUc i13VbZ8lY9+BzafgYit+VO+bLgGB/cJ/lq5u+9joxZ/e0czhp/8f9ToDSrbTB06QYa2PHCskuN YbQ= Date: Fri, 5 Jun 2020 18:59:39 +0000 From: Joseph Myers X-X-Sender: jsm28@digraph.polyomino.org.uk To: , , , , , , , Subject: [PATCH 1/7] softfloat: merge floatx80_mod and floatx80_rem In-Reply-To: Message-ID: References: User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: SVR-IES-MBX-07.mgc.mentorg.com (139.181.222.7) To svr-ies-mbx-02.mgc.mentorg.com (139.181.222.2) Received-SPF: pass client-ip=68.232.137.252; envelope-from=joseph_myers@mentor.com; helo=esa4.mentor.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/05 14:59:46 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -39 X-Spam_score: -4.0 X-Spam_bar: ---- X-Spam_report: (-4.0 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The m68k-specific softfloat code includes a function floatx80_mod that is extremely similar to floatx80_rem, but computing the remainder based on truncating the quotient toward zero rather than rounding it to nearest integer. This is also useful for emulating the x87 fprem and fprem1 instructions. Change the floatx80_rem implementation into floatx80_modrem that can perform either operation, with both floatx80_rem and floatx80_mod as thin wrappers available for all targets. There does not appear to be any use for the _mod operation for other floating-point formats in QEMU (the only other architectures using _rem at all are linux-user/arm/nwfpe, for FPA emulation, and openrisc, for instructions that have been removed in the latest version of the architecture), so no change is made to the code for other formats. Signed-off-by: Joseph Myers --- fpu/softfloat.c | 49 ++++++++++++++++++------ include/fpu/softfloat.h | 2 + target/m68k/softfloat.c | 83 ----------------------------------------- target/m68k/softfloat.h | 1 - 4 files changed, 40 insertions(+), 95 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 6c8f2d597a..7b1ce7664f 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -5682,10 +5682,13 @@ floatx80 floatx80_div(floatx80 a, floatx80 b, float_status *status) /*---------------------------------------------------------------------------- | Returns the remainder of the extended double-precision floating-point value | `a' with respect to the corresponding value `b'. The operation is performed -| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic. +| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic, +| if 'mod' is false; if 'mod' is true, return the remainder based on truncating +| the quotient toward zero instead. *----------------------------------------------------------------------------*/ -floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status) +floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, + float_status *status) { bool aSign, zSign; int32_t aExp, bExp, expDiff; @@ -5731,7 +5734,7 @@ floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status) expDiff = aExp - bExp; aSig1 = 0; if ( expDiff < 0 ) { - if ( expDiff < -1 ) return a; + if ( mod || expDiff < -1 ) return a; shift128Right( aSig0, 0, 1, &aSig0, &aSig1 ); expDiff = 0; } @@ -5763,14 +5766,16 @@ floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status) term1 = 0; term0 = bSig; } - sub128( term0, term1, aSig0, aSig1, &alternateASig0, &alternateASig1 ); - if ( lt128( alternateASig0, alternateASig1, aSig0, aSig1 ) - || ( eq128( alternateASig0, alternateASig1, aSig0, aSig1 ) - && ( q & 1 ) ) - ) { - aSig0 = alternateASig0; - aSig1 = alternateASig1; - zSign = ! zSign; + if (!mod) { + sub128( term0, term1, aSig0, aSig1, &alternateASig0, &alternateASig1 ); + if ( lt128( alternateASig0, alternateASig1, aSig0, aSig1 ) + || ( eq128( alternateASig0, alternateASig1, aSig0, aSig1 ) + && ( q & 1 ) ) + ) { + aSig0 = alternateASig0; + aSig1 = alternateASig1; + zSign = ! zSign; + } } return normalizeRoundAndPackFloatx80( @@ -5778,6 +5783,28 @@ floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status) } +/*---------------------------------------------------------------------------- +| Returns the remainder of the extended double-precision floating-point value +| `a' with respect to the corresponding value `b'. The operation is performed +| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic. +*----------------------------------------------------------------------------*/ + +floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status) +{ + return floatx80_modrem(a, b, false, status); +} + +/*---------------------------------------------------------------------------- +| Returns the remainder of the extended double-precision floating-point value +| `a' with respect to the corresponding value `b', with the quotient truncated +| toward zero. +*----------------------------------------------------------------------------*/ + +floatx80 floatx80_mod(floatx80 a, floatx80 b, float_status *status) +{ + return floatx80_modrem(a, b, true, status); +} + /*---------------------------------------------------------------------------- | Returns the square root of the extended double-precision floating-point | value `a'. The operation is performed according to the IEC/IEEE Standard diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index 16ca697a73..bff6934d09 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -687,6 +687,8 @@ floatx80 floatx80_add(floatx80, floatx80, float_status *status); floatx80 floatx80_sub(floatx80, floatx80, float_status *status); floatx80 floatx80_mul(floatx80, floatx80, float_status *status); floatx80 floatx80_div(floatx80, floatx80, float_status *status); +floatx80 floatx80_modrem(floatx80, floatx80, bool, float_status *status); +floatx80 floatx80_mod(floatx80, floatx80, float_status *status); floatx80 floatx80_rem(floatx80, floatx80, float_status *status); floatx80 floatx80_sqrt(floatx80, float_status *status); FloatRelation floatx80_compare(floatx80, floatx80, float_status *status); diff --git a/target/m68k/softfloat.c b/target/m68k/softfloat.c index 9f120cf15e..b6d0ed7acf 100644 --- a/target/m68k/softfloat.c +++ b/target/m68k/softfloat.c @@ -42,89 +42,6 @@ static floatx80 propagateFloatx80NaNOneArg(floatx80 a, float_status *status) return a; } -/* - * Returns the modulo remainder of the extended double-precision floating-point - * value `a' with respect to the corresponding value `b'. - */ - -floatx80 floatx80_mod(floatx80 a, floatx80 b, float_status *status) -{ - bool aSign, zSign; - int32_t aExp, bExp, expDiff; - uint64_t aSig0, aSig1, bSig; - uint64_t qTemp, term0, term1; - - aSig0 = extractFloatx80Frac(a); - aExp = extractFloatx80Exp(a); - aSign = extractFloatx80Sign(a); - bSig = extractFloatx80Frac(b); - bExp = extractFloatx80Exp(b); - - if (aExp == 0x7FFF) { - if ((uint64_t) (aSig0 << 1) - || ((bExp == 0x7FFF) && (uint64_t) (bSig << 1))) { - return propagateFloatx80NaN(a, b, status); - } - goto invalid; - } - if (bExp == 0x7FFF) { - if ((uint64_t) (bSig << 1)) { - return propagateFloatx80NaN(a, b, status); - } - return a; - } - if (bExp == 0) { - if (bSig == 0) { - invalid: - float_raise(float_flag_invalid, status); - return floatx80_default_nan(status); - } - normalizeFloatx80Subnormal(bSig, &bExp, &bSig); - } - if (aExp == 0) { - if ((uint64_t) (aSig0 << 1) == 0) { - return a; - } - normalizeFloatx80Subnormal(aSig0, &aExp, &aSig0); - } - bSig |= UINT64_C(0x8000000000000000); - zSign = aSign; - expDiff = aExp - bExp; - aSig1 = 0; - if (expDiff < 0) { - return a; - } - qTemp = (bSig <= aSig0); - if (qTemp) { - aSig0 -= bSig; - } - expDiff -= 64; - while (0 < expDiff) { - qTemp = estimateDiv128To64(aSig0, aSig1, bSig); - qTemp = (2 < qTemp) ? qTemp - 2 : 0; - mul64To128(bSig, qTemp, &term0, &term1); - sub128(aSig0, aSig1, term0, term1, &aSig0, &aSig1); - shortShift128Left(aSig0, aSig1, 62, &aSig0, &aSig1); - expDiff -= 62; - } - expDiff += 64; - if (0 < expDiff) { - qTemp = estimateDiv128To64(aSig0, aSig1, bSig); - qTemp = (2 < qTemp) ? qTemp - 2 : 0; - qTemp >>= 64 - expDiff; - mul64To128(bSig, qTemp << (64 - expDiff), &term0, &term1); - sub128(aSig0, aSig1, term0, term1, &aSig0, &aSig1); - shortShift128Left(0, bSig, 64 - expDiff, &term0, &term1); - while (le128(term0, term1, aSig0, aSig1)) { - ++qTemp; - sub128(aSig0, aSig1, term0, term1, &aSig0, &aSig1); - } - } - return - normalizeRoundAndPackFloatx80( - 80, zSign, bExp + expDiff, aSig0, aSig1, status); -} - /* * Returns the mantissa of the extended double-precision floating-point * value `a'. diff --git a/target/m68k/softfloat.h b/target/m68k/softfloat.h index 365ef6ac7a..4bb9567134 100644 --- a/target/m68k/softfloat.h +++ b/target/m68k/softfloat.h @@ -23,7 +23,6 @@ #define TARGET_M68K_SOFTFLOAT_H #include "fpu/softfloat.h" -floatx80 floatx80_mod(floatx80 a, floatx80 b, float_status *status); floatx80 floatx80_getman(floatx80 a, float_status *status); floatx80 floatx80_getexp(floatx80 a, float_status *status); floatx80 floatx80_scale(floatx80 a, floatx80 b, float_status *status); From patchwork Fri Jun 5 19:01:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joseph Myers X-Patchwork-Id: 281239 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 715B7C433E0 for ; Fri, 5 Jun 2020 19:04:33 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 09D682077D for ; Fri, 5 Jun 2020 19:04:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 09D682077D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:54758 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jhHdo-0004Lz-22 for qemu-devel@archiver.kernel.org; Fri, 05 Jun 2020 15:04:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58458) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jhHbH-00019D-E8 for qemu-devel@nongnu.org; Fri, 05 Jun 2020 15:01:55 -0400 Received: from esa2.mentor.iphmx.com ([68.232.141.98]:23580) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jhHbG-00076X-63 for qemu-devel@nongnu.org; Fri, 05 Jun 2020 15:01:54 -0400 IronPort-SDR: GESuwUlyosFx7RetCaBg7rzrfteYNguZE7SsASwaltADL/OLqJ8DyzdtgdqzMLBJ1Jnx50e+uf fYPMOrDWKfPZAbBBYWlSoomB3Y3ZJleHWXAMwEtC2/oBsBw0+tQTIfY/LvUnwstYXtW5JgblOH sSYwmPCu1yZKHgNWjkdid5wLit7nMMxk3NVYUztIOvyF/7A68rnJ0E1MTBQHIewxFsrqnuBetW /LHTvqcYVPfQXvUGIGw9HK+ByfGSeXPRNX3hdJvydHfpL6gU3Hx4ZNN2fHeHGZzbIJo1vRUhcU Hws= X-IronPort-AV: E=Sophos;i="5.73,477,1583222400"; d="scan'208";a="49520928" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa2.mentor.iphmx.com with ESMTP; 05 Jun 2020 11:01:52 -0800 IronPort-SDR: 0MqKEzmhVaRvFZrDZgkWBBV3SjyS1fZKaR7r85cI6dXFqt6lo92ZKt90CA7Qtsl96Br6VOGtFH X942gZLfkhce5SfNdlV/60t4tgu60xCD8P/CzHRssfy1bnouNkuvYjJza8e6Pwf0GjY7fcaR+Q SmiUwgOaQqcIjbjjV5U7NNxyhuEsI5R+dLDhBVshB0aC27X0k81PNi9i+Y0YR8wI9bnYy0wolz rEXbeK2lg2oidusI/tdKmWH+xYJVLIE3BwM4EGO4NKfgHVe7O5aPBx5t6Ao3G2pADPfgRmIV8g woM= Date: Fri, 5 Jun 2020 19:01:46 +0000 From: Joseph Myers X-X-Sender: jsm28@digraph.polyomino.org.uk To: , , , , , , , Subject: [PATCH 5/7] softfloat: return low bits of quotient from floatx80_modrem In-Reply-To: Message-ID: References: User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-02.mgc.mentorg.com (139.181.222.2) To svr-ies-mbx-02.mgc.mentorg.com (139.181.222.2) Received-SPF: pass client-ip=68.232.141.98; envelope-from=joseph_myers@mentor.com; helo=esa2.mentor.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/05 14:58:48 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -39 X-Spam_score: -4.0 X-Spam_bar: ---- X-Spam_report: (-4.0 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Both x87 and m68k need the low parts of the quotient for their remainder operations. Arrange for floatx80_modrem to track those bits and return them via a pointer. The architectures using float32_rem and float64_rem do not appear to need this information, so the *_rem interface is left unchanged and the information returned only from floatx80_modrem. The logic used to determine the low 7 bits of the quotient for m68k (target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it looks at the result of converting the remainder to integer, the quotient having been discarded by that point); this patch does not change that, but the m68k maintainers may wish to do so. Signed-off-by: Joseph Myers --- fpu/softfloat.c | 23 ++++++++++++++++++----- include/fpu/softfloat.h | 3 ++- 2 files changed, 20 insertions(+), 6 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 423a815196..c3c3f382af 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -5684,10 +5684,11 @@ floatx80 floatx80_div(floatx80 a, floatx80 b, float_status *status) | `a' with respect to the corresponding value `b'. The operation is performed | according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic, | if 'mod' is false; if 'mod' is true, return the remainder based on truncating -| the quotient toward zero instead. +| the quotient toward zero instead. '*quotient' is set to the low 64 bits of +| the absolute value of the integer quotient. *----------------------------------------------------------------------------*/ -floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, +floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, uint64_t *quotient, float_status *status) { bool aSign, zSign; @@ -5695,6 +5696,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, uint64_t aSig0, aSig1, bSig; uint64_t q, term0, term1, alternateASig0, alternateASig1; + *quotient = 0; if (floatx80_invalid_encoding(a) || floatx80_invalid_encoding(b)) { float_raise(float_flag_invalid, status); return floatx80_default_nan(status); @@ -5749,7 +5751,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, shift128Right( aSig0, 0, 1, &aSig0, &aSig1 ); expDiff = 0; } - q = ( bSig <= aSig0 ); + *quotient = q = ( bSig <= aSig0 ); if ( q ) aSig0 -= bSig; expDiff -= 64; while ( 0 < expDiff ) { @@ -5759,6 +5761,8 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, sub128( aSig0, aSig1, term0, term1, &aSig0, &aSig1 ); shortShift128Left( aSig0, aSig1, 62, &aSig0, &aSig1 ); expDiff -= 62; + *quotient <<= 62; + *quotient += q; } expDiff += 64; if ( 0 < expDiff ) { @@ -5772,6 +5776,12 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, ++q; sub128( aSig0, aSig1, term0, term1, &aSig0, &aSig1 ); } + if (expDiff < 64) { + *quotient <<= expDiff; + } else { + *quotient = 0; + } + *quotient += q; } else { term1 = 0; @@ -5786,6 +5796,7 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, aSig0 = alternateASig0; aSig1 = alternateASig1; zSign = ! zSign; + ++*quotient; } } return @@ -5802,7 +5813,8 @@ floatx80 floatx80_modrem(floatx80 a, floatx80 b, bool mod, floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status) { - return floatx80_modrem(a, b, false, status); + uint64_t quotient; + return floatx80_modrem(a, b, false, "ient, status); } /*---------------------------------------------------------------------------- @@ -5813,7 +5825,8 @@ floatx80 floatx80_rem(floatx80 a, floatx80 b, float_status *status) floatx80 floatx80_mod(floatx80 a, floatx80 b, float_status *status) { - return floatx80_modrem(a, b, true, status); + uint64_t quotient; + return floatx80_modrem(a, b, true, "ient, status); } /*---------------------------------------------------------------------------- diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index bff6934d09..ff4e2605b1 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -687,7 +687,8 @@ floatx80 floatx80_add(floatx80, floatx80, float_status *status); floatx80 floatx80_sub(floatx80, floatx80, float_status *status); floatx80 floatx80_mul(floatx80, floatx80, float_status *status); floatx80 floatx80_div(floatx80, floatx80, float_status *status); -floatx80 floatx80_modrem(floatx80, floatx80, bool, float_status *status); +floatx80 floatx80_modrem(floatx80, floatx80, bool, uint64_t *, + float_status *status); floatx80 floatx80_mod(floatx80, floatx80, float_status *status); floatx80 floatx80_rem(floatx80, floatx80, float_status *status); floatx80 floatx80_sqrt(floatx80, float_status *status); From patchwork Fri Jun 5 19:02:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joseph Myers X-Patchwork-Id: 281240 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5150C433DF for ; Fri, 5 Jun 2020 19:03:48 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 746412077D for ; Fri, 5 Jun 2020 19:03:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 746412077D Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52498 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jhHd5-0003RT-LC for qemu-devel@archiver.kernel.org; Fri, 05 Jun 2020 15:03:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58494) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jhHbg-0001m8-A9 for qemu-devel@nongnu.org; Fri, 05 Jun 2020 15:02:20 -0400 Received: from esa4.mentor.iphmx.com ([68.232.137.252]:34075) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jhHbf-00079E-8L for qemu-devel@nongnu.org; Fri, 05 Jun 2020 15:02:19 -0400 IronPort-SDR: lBm2zgaXa52gU9BgZFS7yOl7g/uy0ivLid77BkBYvU88+5elh4WW/6m7m7ALhTS/Bhr87ogz7w NYOaOMNHSVTxCVYVH56Qs0onVwGWBFL8mx0wgjXHDwaIXK46mvWeAsY5Z+yyK7A8q+rRWLeL1d CUnkkgf1w+/sifzxxh94b0xsfV2X1ApbWkUmx1LPF4fBg5bNH70tflZSPH6gxxxb64n3AbCEsO UwltTmJb/8hFLbXu2xZQgqtzPD03+Q2OSmifSjyb1qE74Bzlm8uHO4LKcVnUkliKSKZwzozUna xUI= X-IronPort-AV: E=Sophos;i="5.73,477,1583222400"; d="scan'208";a="49633101" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa4.mentor.iphmx.com with ESMTP; 05 Jun 2020 11:02:17 -0800 IronPort-SDR: 0raJVmDGsvqfEhq/MUoYe1P/HVwTIRmNY+sOSXA3ZtN7xAuhX0OlR6i3QyZL2N7sGGgV/bz7i2 jtnyDj5jHUOsFHVSzA9ofsYtrsHPqyX0d3GA9dJ/VMVnLfEC2QvJMzuGyK7Je9h0U460pGxqxG TxMAGFcfsceo8EzdPio7c98rM7s9bb8+6ZaFQG1nsz2992NzaNPZctzdD51VJNYAkE/TDegFqH R6jGror8uPGWcxGPge33M284X3Fp0A3YK/TkBp+2X0TnjBvlGZroK9yQndn4Kp6jtKdd1pjz/u eF0= Date: Fri, 5 Jun 2020 19:02:11 +0000 From: Joseph Myers X-X-Sender: jsm28@digraph.polyomino.org.uk To: , , , , , , , Subject: [PATCH 6/7] target/i386: reimplement fprem1 using floatx80 operations In-Reply-To: Message-ID: References: User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) To svr-ies-mbx-02.mgc.mentorg.com (139.181.222.2) Received-SPF: pass client-ip=68.232.137.252; envelope-from=joseph_myers@mentor.com; helo=esa4.mentor.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/06/05 14:59:46 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -39 X-Spam_score: -4.0 X-Spam_bar: ---- X-Spam_report: (-4.0 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The x87 fprem1 emulation is currently based around conversion to double, which is inherently unsuitable for a good emulation of any floatx80 operation. Reimplement using the soft-float floatx80 remainder operations. Signed-off-by: Joseph Myers --- target/i386/fpu_helper.c | 96 +++++++++++++++++++--------------------- 1 file changed, 45 insertions(+), 51 deletions(-) diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c index 8ef5b463ea..bab35e00a0 100644 --- a/target/i386/fpu_helper.c +++ b/target/i386/fpu_helper.c @@ -934,63 +934,57 @@ void helper_fxtract(CPUX86State *env) merge_exception_flags(env, old_flags); } -void helper_fprem1(CPUX86State *env) +static void helper_fprem_common(CPUX86State *env, bool mod) { - double st0, st1, dblq, fpsrcop, fptemp; - CPU_LDoubleU fpsrcop1, fptemp1; - int expdif; - signed long long int q; - - st0 = floatx80_to_double(env, ST0); - st1 = floatx80_to_double(env, ST1); - - if (isinf(st0) || isnan(st0) || isnan(st1) || (st1 == 0.0)) { - ST0 = double_to_floatx80(env, 0.0 / 0.0); /* NaN */ - env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */ - return; - } - - fpsrcop = st0; - fptemp = st1; - fpsrcop1.d = ST0; - fptemp1.d = ST1; - expdif = EXPD(fpsrcop1) - EXPD(fptemp1); - - if (expdif < 0) { - /* optimisation? taken from the AMD docs */ - env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */ - /* ST0 is unchanged */ - return; - } + uint8_t old_flags = save_exception_flags(env); + uint64_t quotient; + CPU_LDoubleU temp0, temp1; + int exp0, exp1, expdiff; - if (expdif < 53) { - dblq = fpsrcop / fptemp; - /* round dblq towards nearest integer */ - dblq = rint(dblq); - st0 = fpsrcop - fptemp * dblq; + temp0.d = ST0; + temp1.d = ST1; + exp0 = EXPD(temp0); + exp1 = EXPD(temp1); - /* convert dblq to q by truncating towards zero */ - if (dblq < 0.0) { - q = (signed long long int)(-dblq); + env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */ + if (floatx80_is_zero(ST0) || floatx80_is_zero(ST1) || + exp0 == 0x7fff || exp1 == 0x7fff || + floatx80_invalid_encoding(ST0) || floatx80_invalid_encoding(ST1)) { + ST0 = floatx80_modrem(ST0, ST1, mod, "ient, &env->fp_status); + } else { + if (exp0 == 0) { + exp0 = 1 - clz64(temp0.l.lower); + } + if (exp1 == 0) { + exp1 = 1 - clz64(temp1.l.lower); + } + expdiff = exp0 - exp1; + if (expdiff < 64) { + ST0 = floatx80_modrem(ST0, ST1, mod, "ient, &env->fp_status); + env->fpus |= (quotient & 0x4) << (8 - 2); /* (C0) <-- q2 */ + env->fpus |= (quotient & 0x2) << (14 - 1); /* (C3) <-- q1 */ + env->fpus |= (quotient & 0x1) << (9 - 0); /* (C1) <-- q0 */ } else { - q = (signed long long int)dblq; + /* Partial remainder. This choice of how many bits to + * process at once is specified in AMD instruction set + * manuals, and empirically is followed by Intel + * processors as well; it ensures that the final remainder + * operation in a loop does produce the correct low three + * bits of the quotient. AMD manuals specify that the + * flags other than C2 are cleared, and empirically Intel + * processors clear them as well. */ + int n = 32 + (expdiff % 32); + temp1.d = floatx80_scalbn(temp1.d, expdiff - n, &env->fp_status); + ST0 = floatx80_mod(ST0, temp1.d, &env->fp_status); + env->fpus |= 0x400; /* C2 <-- 1 */ } - - env->fpus &= ~0x4700; /* (C3,C2,C1,C0) <-- 0000 */ - /* (C0,C3,C1) <-- (q2,q1,q0) */ - env->fpus |= (q & 0x4) << (8 - 2); /* (C0) <-- q2 */ - env->fpus |= (q & 0x2) << (14 - 1); /* (C3) <-- q1 */ - env->fpus |= (q & 0x1) << (9 - 0); /* (C1) <-- q0 */ - } else { - env->fpus |= 0x400; /* C2 <-- 1 */ - fptemp = pow(2.0, expdif - 50); - fpsrcop = (st0 / st1) / fptemp; - /* fpsrcop = integer obtained by chopping */ - fpsrcop = (fpsrcop < 0.0) ? - -(floor(fabs(fpsrcop))) : floor(fpsrcop); - st0 -= (st1 * fpsrcop * fptemp); } - ST0 = double_to_floatx80(env, st0); + merge_exception_flags(env, old_flags); +} + +void helper_fprem1(CPUX86State *env) +{ + helper_fprem_common(env, false); } void helper_fprem(CPUX86State *env)