From patchwork Fri Jan 31 19:17:14 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 860996
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: DJ Delorie, Joseph Myers, Paul Zimmermann, Alexei Sibidanov
Subject: [PATCH 10/15] math: Use atan2pif from CORE-MATH
Date: Fri, 31 Jan 2025 16:17:14 -0300
Message-ID: <20250131191844.2582716-11-adhemerval.zanella@linaro.org>
In-Reply-To: <20250131191844.2582716-1-adhemerval.zanella@linaro.org>
References: <20250131191844.2582716-1-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
List-Id: Libc-alpha mailing list

The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better
performance than the generic atan2pif.

The code was adapted to glibc style and to use the definitions from
math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

latency                      master        patched   improvement
x86_64                      79.4006        70.8726        10.74%
x86_64v2                    77.5136        69.1424        10.80%
x86_64v3                    71.8050        68.1637         5.07%
aarch64 (Neoverse)          27.8363        24.7700        11.02%
power8                      39.3893        17.2929        56.10%
power10                     19.7200        16.8187        14.71%

reciprocal-throughput        master        patched   improvement
x86_64                      38.3457        30.9471        19.29%
x86_64v2                    37.4023        30.3112        18.96%
x86_64v3                    33.0713        24.4891        25.95%
aarch64 (Neoverse)          19.3683        15.3259        20.87%
power8                      19.5507        8.27165        57.69%
power10                     9.05331        7.63775        15.64%
---
 SHARED-FILES                                  |   4 +
 sysdeps/aarch64/libm-test-ulps                |   4 -
 sysdeps/arc/fpu/libm-test-ulps                |   4 -
 sysdeps/arc/nofpu/libm-test-ulps              |   1 -
 sysdeps/arm/libm-test-ulps                    |   4 -
 sysdeps/hppa/fpu/libm-test-ulps               |   4 -
 sysdeps/i386/fpu/libm-test-ulps               |   4 -
 .../i386/i686/fpu/multiarch/libm-test-ulps    |   4 -
 sysdeps/ieee754/flt-32/s_atan2pif.c           | 238 ++++++++++++++++++
 sysdeps/loongarch/lp64/libm-test-ulps         |   4 -
 sysdeps/mips/mips64/libm-test-ulps            |   4 -
 sysdeps/or1k/fpu/libm-test-ulps               |   4 -
 sysdeps/or1k/nofpu/libm-test-ulps             |   1 -
 sysdeps/powerpc/fpu/libm-test-ulps            |   4 -
 sysdeps/riscv/nofpu/libm-test-ulps            |   1 -
 sysdeps/riscv/rvd/libm-test-ulps              |   4 -
 sysdeps/s390/fpu/libm-test-ulps               |   4 -
 sysdeps/sparc/fpu/libm-test-ulps              |   4 -
 sysdeps/x86_64/fpu/libm-test-ulps             |   4 -
 19 files changed, 242 insertions(+), 59 deletions(-)
 create mode 100644 sysdeps/ieee754/flt-32/s_atan2pif.c

diff --git a/SHARED-FILES b/SHARED-FILES
index e700f4b155..b403a2a6f0 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -342,3 +342,7 @@ sysdeps/ieee754/flt-32/s_asinpif.c:
   (src/binary32/asinpi/asinpif.c in CORE-MATH)
   - the code was adapted to use glibc code style and internal
     functions to handle errno, overflow, and underflow.
+sysdeps/ieee754/flt-32/s_atan2pif.c:
+  (src/binary32/atan2pi/atan2pif.c in CORE-MATH)
+  - the code was adapted to use glibc code style and internal
+    functions to handle errno, overflow, and underflow.
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index abb0611ee5..be29b37721 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -158,22 +158,18 @@ ldouble: 2
 
 Function: "atan2pi":
 double: 1
-float: 1
 ldouble: 3
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan_advsimd":
diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps
index 35aebba38a..1383c88b95 100644
--- a/sysdeps/arc/fpu/libm-test-ulps
+++ b/sysdeps/arc/fpu/libm-test-ulps
@@ -90,19 +90,15 @@ double: 8
 
 Function: "atan2pi":
 double: 1
-float: 1
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 
 Function: "atan_downward":
 double: 1
diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps
index 325546e582..9028f5cbe7 100644
--- a/sysdeps/arc/nofpu/libm-test-ulps
+++ b/sysdeps/arc/nofpu/libm-test-ulps
@@ -24,7 +24,6 @@ double: 1
 
 Function: "atan2pi":
 double: 1
-float: 1
 
 Function: "atanh":
 double: 2
diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps
index 0927fdb980..e1c538f79f 100644
--- a/sysdeps/arm/libm-test-ulps
+++ b/sysdeps/arm/libm-test-ulps
@@ -87,19 +87,15 @@ double: 1
 
 Function: "atan2pi":
 double: 1
-float: 1
 
 Function: "atan2pi_downward":
 double: 1
-float: 3
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 3
 
 Function: "atan_downward":
 double: 1
diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps
index 02cc3b5ddc..796da7b5ab 100644
--- a/sysdeps/hppa/fpu/libm-test-ulps
+++ b/sysdeps/hppa/fpu/libm-test-ulps
@@ -87,19 +87,15 @@ double: 1
 
 Function: "atan2pi":
 double: 1
-float: 1
 
 Function: "atan2pi_downward":
 double: 1
-float: 3
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 3
 
 Function: "atan_downward":
 double: 1
diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps
index 69d0eb1eec..4f687c762b 100644
--- a/sysdeps/i386/fpu/libm-test-ulps
+++ b/sysdeps/i386/fpu/libm-test-ulps
@@ -146,25 +146,21 @@ ldouble: 1
 
 Function: "atan2pi":
 double: 1
-float: 1
 float128: 3
 ldouble: 1
 
 Function: "atan2pi_downward":
 double: 2
-float: 2
 float128: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 float128: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 2
-float: 2
 float128: 2
 ldouble: 2
 
diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
index 392d7d252c..f24c87b302 100644
--- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
+++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
@@ -146,25 +146,21 @@ ldouble: 1
 
 Function: "atan2pi":
 double: 1
-float: 1
 float128: 3
 ldouble: 2
 
 Function: "atan2pi_downward":
 double: 2
-float: 2
 float128: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 float128: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 2
-float: 2
 float128: 2
 ldouble: 2
 
diff --git a/sysdeps/ieee754/flt-32/s_atan2pif.c b/sysdeps/ieee754/flt-32/s_atan2pif.c
new file mode 100644
index 0000000000..8c9cbc1373
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/s_atan2pif.c
@@ -0,0 +1,238 @@
+/* Correctly-rounded half revolution arctangent function of two binary32 values.
+
+Copyright (c) 2022-2025 Alexei Sibidanov.
+
+The original version of this file was copied from the CORE-MATH
+project (file src/binary32/atan2pi/atan2pif.c, revision dbebee1).
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
+*/
+
+#include
+#include
+#include
+#include
+#include "math_config.h"
+
+static inline double
+muldd (double xh, double xl, double ch, double cl, double *l)
+{
+  double ahlh = ch * xl;
+  double alhh = cl * xh;
+  double ahhh = ch * xh;
+  double ahhl = fma (ch, xh, -ahhh);
+  ahhl += alhh + ahlh;
+  ch = ahhh + ahhl;
+  *l = (ahhh - ch) + ahhl;
+  return ch;
+}
+
+static double
+polydd (double xh, double xl, int n, const double c[][2], double *l)
+{
+  int i = n - 1;
+  double ch = c[i][0], cl = c[i][1];
+  while (--i >= 0)
+    {
+      ch = muldd (xh, xl, ch, cl, &cl);
+      double th = ch + c[i][0], tl = (c[i][0] - th) + ch;
+      ch = th;
+      cl += tl + c[i][1];
+    }
+  *l = cl;
+  return ch;
+}
+
+float
+__atan2pif (float y, float x)
+{
+  static const double cn[] =
+    {
+      0x1.45f306dc9c883p-2, 0x1.988d83a142adap-1, 0x1.747bebf492057p-1,
+      0x1.2cc5645094ff3p-2, 0x1.a0521c711ab66p-5, 0x1.881b8058b9a0dp-9,
+      0x1.b16ff514a0afp-16
+    };
+  static const double cd[] =
+    {
+      0x1p+0, 0x1.6b8b143a3f6dap+1, 0x1.8421201d18ed5p+1,
+      0x1.8221d086914ebp+0, 0x1.670657e3a07bap-2, 0x1.0f4951fd1e72dp-5,
+      0x1.b3874b8798286p-11
+    };
+  static const double m[] = { 0, 1 };
+  static const double off[]
+    = { 0.0f, 0.5f, 1.0f, 0.5f, -0.0f, -0.5f, -1.0f, -0.5f };
+  static const float sgnf[] = { 1, -1 };
+  static const double sgn[] = { 1, -1 };
+  uint32_t ux = asuint (x);
+  uint32_t uy = asuint (y);
+  uint32_t ax = ux & (~0u >> 1);
+  uint32_t ay = uy & (~0u >> 1);
+  if (__glibc_unlikely (ay >= (0xff << 23) || ax >= (0xff << 23)))
+    {
+      if (ay > (0xff << 23))
+        return x + y; /* nan */
+      if (ax > (0xff << 23))
+        return x + y; /* nan */
+      uint32_t yinf = ay == (0xff << 23);
+      uint32_t xinf = ax == (0xff << 23);
+      if (yinf & xinf)
+        {
+          if (ux >> 31)
+            return 0.75f * sgnf[uy >> 31];
+          else
+            return 0.25f * sgnf[uy >> 31];
+        }
+      if (xinf)
+        {
+          if (ux >> 31)
+            return sgnf[uy >> 31];
+          else
+            return 0.0f * sgnf[uy >> 31];
+        }
+      if (yinf)
+        return 0.5f * sgnf[uy >> 31];
+    }
+  if (__glibc_unlikely (ay == 0))
+    {
+      if (__glibc_unlikely (!(ay | ax)))
+        {
+          uint32_t i = (uy >> 31) * 4 + (ux >> 31) * 2;
+          return off[i];
+        }
+      if (!(ux >> 31))
+        return 0.0f * sgnf[uy >> 31];
+    }
+  if (__glibc_unlikely (ax == ay))
+    {
+      static const float s[] = { 0.25, 0.75, -0.25, -0.75 };
+      uint32_t i = (uy >> 31) * 2 + (ux >> 31);
+      return s[i];
+    }
+  uint32_t gt = ay > ax, i = (uy >> 31) * 4 + (ux >> 31) * 2 + gt;
+
+  double zx = x, zy = y;
+  double z = (m[gt] * zx + m[1 - gt] * zy) / (m[gt] * zy + m[1 - gt] * zx);
+  double r = cn[0], z2 = z*z;
+  z *= sgn[gt];
+  /* avoid spurious underflow in the polynomial evaluation excluding extremely
+     small arguments */
+  if (__glibc_likely (z2 > 0x1p-54))
+    {
+      double z4 = z2*z2, z8 = z4*z4;
+      double cn0 = r + z2*cn[1];
+      double cn2 = cn[2] + z2*cn[3];
+      double cn4 = cn[4] + z2*cn[5];
+      double cn6 = cn[6];
+      cn0 += z4*cn2;
+      cn4 += z4*cn6;
+      cn0 += z8*cn4;
+      double cd0 = cd[0] + z2*cd[1];
+      double cd2 = cd[2] + z2*cd[3];
+      double cd4 = cd[4] + z2*cd[5];
+      double cd6 = cd[6];
+      cd0 += z4*cd2;
+      cd4 += z4*cd6;
+      cd0 += z8*cd4;
+      r = cn0/cd0;
+    }
+  r = z * r + off[i];
+  uint64_t res = asuint64 (r);
+  if (__glibc_unlikely ((res << 1) > 0x6d40000000000000
+                        && ((res + 8) & 0xfffffff) <= 16))
+    {
+      if (ax == ay)
+        {
+          static const double off2[] = { 0.25, 0.75, -0.25, -0.75 };
+          r = off2[(uy >> 31) * 2 + (ux >> 31)];
+        }
+      else
+        {
+          double zh, zl;
+          if (!gt)
+            {
+              zh = zy / zx;
+              zl = fma (zh, -zx, zy) / zx;
+            }
+          else
+            {
+              zh = zx / zy;
+              zl = fma (zh, -zy, zx) / zy;
+            }
+          double z2l, z2h = muldd (zh, zl, zh, zl, &z2l);
+          static const double c[][2] =
+            {
+              { 0x1.45f306dc9c883p-2, -0x1.6b01ec5513324p-56 },
+              { -0x1.b2995e7b7b604p-4, 0x1.e402b0c13eedcp-58 },
+              { 0x1.04c26be3b06cfp-4, -0x1.571d178a53efp-60 },
+              { -0x1.7483758e69c03p-5, 0x1.819a6ed7aaf38p-63 },
+              { 0x1.21bb9452523ffp-5, -0x1.234d866fb9807p-60 },
+              { -0x1.da1bace3cc54ep-6, -0x1.c84f6ada49294p-64 },
+              { 0x1.912b1c23345ddp-6, -0x1.534890fbc165p-60 },
+              { -0x1.5bade52f5f52ap-6, 0x1.f783bafc832f6p-60 },
+              { 0x1.32c69d084c5cp-6, 0x1.042d155953025p-60 },
+              { -0x1.127bcfb3e8c7dp-6, -0x1.85aae199a7b6bp-60 },
+              { 0x1.f0af43b11a731p-7, 0x1.8f0356356663p-61 },
+              { -0x1.c57e86801029ep-7, 0x1.dcdf3e3b38eb4p-61 },
+              { 0x1.a136408617ea1p-7, 0x1.a71affb36c6c4p-63 },
+              { -0x1.824ac7814ba37p-7, 0x1.8928b295c0898p-61 },
+              { 0x1.6794e32ea5471p-7, 0x1.0b4334fb41e63p-61 },
+              { -0x1.501d57f643d97p-7, 0x1.516785bf1376ep-61 },
+              { 0x1.3adf02ff2400ap-7, -0x1.b0e30bb8c8076p-62 },
+              { -0x1.267702f94faap-7, -0x1.7a4d3a1850cc6p-62 },
+              { 0x1.10dce97099686p-7, 0x1.fcc208eee2571p-61 },
+              { -0x1.eee49cdad8002p-8, -0x1.9109b3f1bab82p-64 },
+              { 0x1.af93bc191a929p-8, 0x1.069fd3b47d7bp-62 },
+              { -0x1.6240751b54675p-8, -0x1.72dc8cfd03b6fp-62 },
+              { 0x1.0b61e84080884p-8, 0x1.825824c80941bp-63 },
+              { -0x1.6a72a8a74e3a5p-9, 0x1.8786a82fd117ep-63 },
+              { 0x1.aede3217d939dp-10, -0x1.93b626982e1fep-68 },
+              { -0x1.b66568f09ebeep-11, -0x1.704a39121d0a5p-66 },
+              { 0x1.73af3977fa973p-12, -0x1.aa050e2244ea3p-68 },
+              { -0x1.fc69d85ed28c9p-14, 0x1.867f17b764cap-68 },
+              { 0x1.0c883a9270162p-15, -0x1.6842833896dd9p-70 },
+              { -0x1.9a0b27b6dfe15p-18, 0x1.427fc2f4e1327p-73 },
+              { 0x1.91e15e7ab5bdcp-21, -0x1.730dbc6279d0dp-77 },
+              { -0x1.7b1119c1ff867p-25, 0x1.145f9980759c4p-79 }
+            };
+          double pl, ph = polydd (z2h, z2l, 32, c, &pl);
+          zh *= sgn[gt];
+          zl *= sgn[gt];
+          ph = muldd (zh, zl, ph, pl, &pl);
+          double sh = ph + off[i], sl = ((off[i] - sh) + ph) + pl;
+          float rf = sh;
+          double th = rf, dh = sh - th, tm = dh + sl;
+          r = th + tm;
+          double d = r - th;
+          if (!(asuint64 (d) << 12))
+            {
+              double ad = fabs (d), am = fabs (tm);
+              if (ad > am)
+                r -= d * 0x1p-10;
+              if (ad < am)
+                r += d * 0x1p-10;
+            }
+        }
+    }
+  float rf = r;
+  if (__glibc_unlikely (rf == 0.0f && y != 0.0f))
+    __set_errno (ERANGE);
+  return rf;
+}
+libm_alias_float (__atan2pi, atan2pi)
diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps
index 33dd6718ba..d5adc119cf 100644
--- a/sysdeps/loongarch/lp64/libm-test-ulps
+++ b/sysdeps/loongarch/lp64/libm-test-ulps
@@ -118,22 +118,18 @@ ldouble: 2
 
 Function: "atan2pi":
 double: 1
-float: 1
 ldouble: 3
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan_downward":
diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps
index 869ceff928..c901b00f20 100644
--- a/sysdeps/mips/mips64/libm-test-ulps
+++ b/sysdeps/mips/mips64/libm-test-ulps
@@ -118,22 +118,18 @@ ldouble: 2
 
 Function: "atan2pi":
 double: 1
-float: 1
 ldouble: 3
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan_downward":
diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps
index 75db236e09..9934382bde 100644
--- a/sysdeps/or1k/fpu/libm-test-ulps
+++ b/sysdeps/or1k/fpu/libm-test-ulps
@@ -87,19 +87,15 @@ double: 8
 
 Function: "atan2pi":
 double: 1
-float: 1
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 
 Function: "atan_downward":
 double: 1
diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps
index a1f7c80097..7ff5ee4425 100644
--- a/sysdeps/or1k/nofpu/libm-test-ulps
+++ b/sysdeps/or1k/nofpu/libm-test-ulps
@@ -69,7 +69,6 @@ double: 8
 
 Function: "atan2pi":
 double: 1
-float: 1
 
 Function: "atan_downward":
 double: 1
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index fa3cf2e844..b1c01b4d94 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -151,25 +151,21 @@ ldouble: 3
 
 Function: "atan2pi":
 double: 1
-float: 1
 float128: 3
 ldouble: 3
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 float128: 2
 ldouble: 4
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 float128: 2
 ldouble: 5
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 float128: 2
 ldouble: 4
 
diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps
index a5184ecad9..f55df65c6a 100644
--- a/sysdeps/riscv/nofpu/libm-test-ulps
+++ b/sysdeps/riscv/nofpu/libm-test-ulps
@@ -94,7 +94,6 @@ ldouble: 2
 
 Function: "atan2pi":
 double: 1
-float: 1
 ldouble: 3
 
 Function: "atan_downward":
diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps
index 3bfc9668d5..879f5c5669 100644
--- a/sysdeps/riscv/rvd/libm-test-ulps
+++ b/sysdeps/riscv/rvd/libm-test-ulps
@@ -118,22 +118,18 @@ ldouble: 2
 
 Function: "atan2pi":
 double: 1
-float: 1
 ldouble: 3
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan_downward":
diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
index 7d61bf1cef..c4a27b96ad 100644
--- a/sysdeps/s390/fpu/libm-test-ulps
+++ b/sysdeps/s390/fpu/libm-test-ulps
@@ -118,22 +118,18 @@ ldouble: 2
 
 Function: "atan2pi":
 double: 1
-float: 1
 ldouble: 3
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan_downward":
diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
index 426f45893e..fbf1507bd9 100644
--- a/sysdeps/sparc/fpu/libm-test-ulps
+++ b/sysdeps/sparc/fpu/libm-test-ulps
@@ -118,22 +118,18 @@ ldouble: 2
 
 Function: "atan2pi":
 double: 1
-float: 1
 ldouble: 3
 
 Function: "atan2pi_downward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 2
 ldouble: 2
 
 Function: "atan_downward":
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index d4c4bfa42b..a340df6243 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -244,25 +244,21 @@ float: 2
 
 Function: "atan2pi":
 double: 1
-float: 1
 float128: 3
 ldouble: 2
 
 Function: "atan2pi_downward":
 double: 1
-float: 3
 float128: 2
 ldouble: 2
 
 Function: "atan2pi_towardzero":
 double: 1
-float: 2
 float128: 2
 ldouble: 2
 
 Function: "atan2pi_upward":
 double: 1
-float: 3
 float128: 2
 ldouble: 2
 
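
Note for readers of the fast path in s_atan2pif.c: the index
i = (uy >> 31) * 4 + (ux >> 31) * 2 + gt selects an octant offset from
off[], and the reduced ratio z (y/x when |y| <= |x|, otherwise x/y with its
sign flipped through sgn[gt]) is added on top of it. The sketch below
isolates only that selection step so it can be checked outside glibc. It is
an illustration under stated assumptions, not part of the patch:
float_bits is a hypothetical stand-in for the internal asuint helper, the
names octant_offset and check_octant_offsets are invented here, and the
inputs are assumed finite, nonzero, and with |x| != |y| (the patch returns
early for the other cases).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for glibc's internal asuint () from math_config.h:
   reinterpret the bits of a float as a 32-bit unsigned integer.  */
static inline uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Return the additive offset, in half revolutions, chosen by the same
   index computation as the off[i] lookup in the patch.  Assumes x and y
   are finite, nonzero, and |x| != |y|.  */
static double
octant_offset (float y, float x)
{
  static const double off[]
    = { 0.0, 0.5, 1.0, 0.5, -0.0, -0.5, -1.0, -0.5 };
  uint32_t ux = float_bits (x), uy = float_bits (y);
  uint32_t ax = ux & (~0u >> 1);  /* Bits of |x|; ordered like |x|.  */
  uint32_t ay = uy & (~0u >> 1);  /* Bits of |y|; ordered like |y|.  */
  uint32_t gt = ay > ax;          /* 1 when |y| > |x|.  */
  uint32_t i = (uy >> 31) * 4 + (ux >> 31) * 2 + gt;
  return off[i];
}

/* Spot checks, one per sign/magnitude combination exercised here.  */
static void
check_octant_offsets (void)
{
  assert (octant_offset (1.0f, 2.0f) == 0.0);    /* y > 0, x > 0, |y| < |x| */
  assert (octant_offset (2.0f, 1.0f) == 0.5);    /* y > 0, x > 0, |y| > |x| */
  assert (octant_offset (1.0f, -2.0f) == 1.0);   /* y > 0, x < 0, |y| < |x| */
  assert (octant_offset (-1.0f, -2.0f) == -1.0); /* y < 0, x < 0, |y| < |x| */
  assert (octant_offset (-2.0f, 1.0f) == -0.5);  /* y < 0, x > 0, |y| > |x| */
}
```

With gt == 0 the function value is off[i] + atan (y/x) / pi, and with
gt == 1 it is off[i] - atan (x/y) / pi, so |z| never exceeds 1 and a single
approximation on that range covers every octant.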