From patchwork Mon Nov 11 13:45:44 2024
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 842458
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Alexei Sibidanov, Paul Zimmermann
Subject: [PATCH 06/11] math: Use cbrtf from CORE-MATH
Date: Mon, 11 Nov 2024 10:45:44 -0300
Message-ID: <20241111134740.1410635-7-adhemerval.zanella@linaro.org>
In-Reply-To: <20241111134740.1410635-1-adhemerval.zanella@linaro.org>
References: <20241111134740.1410635-1-adhemerval.zanella@linaro.org>

The CORE-MATH implementation is correctly rounded (for any rounding
mode) and shows better performance than the generic cbrtf.

The code was adapted to glibc style and to use the definitions from
math_config.h.
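For readers unfamiliar with how a correctly rounded result can be obtained cheaply, here is a minimal sketch of a rounding test, similar in shape to the `ub == lb` check in the new `__cbrtf`. The helper name `rounding_test`, its parameters, and the assumption that the exact value lies in `[r - err, r]` are illustrative only and are not taken from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the patch).  R is a double
   approximation and ERR an assumed absolute error bound such that the
   exact value lies in [R - ERR, R].  Rounding is monotonic, so if both
   ends of that interval convert to the same float, that float is the
   correctly rounded result for the rounding mode currently in effect.
   Returns 1 in that case and 0 when a more accurate step is needed.  */
static int
rounding_test (double r, double err, float *result)
{
  assert (err >= 0.0);
  float ub = (float) r;
  float lb = (float) (r - err);
  *result = ub;
  return ub == lb;
}

#ifdef ROUNDING_TEST_DEMO
int
main (void)
{
  float y;
  /* Far from a rounding boundary: both ends round to 3.0f.  */
  printf ("%d\n", rounding_test (3.000000000001, 1e-9, &y));
  /* Just above the midpoint of 1.0f and its successor: the two ends
     round to different floats under round-to-nearest.  */
  printf ("%d\n", rounding_test (1.0 + 0x1p-24 + 1e-12, 1e-9, &y));
  return 0;
}
#endif
```

Under a directed rounding mode the same comparison fails near representable floats rather than near midpoints. That is one reason a test of this kind can serve every rounding mode without mode-specific code.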
Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1, gcc 13.2.1),
and powerpc (POWER10, gcc 13.2.1):

latency                   master    patched   improvement
x86_64                   68.6348    36.8908        46.25%
x86_64v2                 67.3418    36.6968        45.51%
x86_64v3                 63.4981    32.7859        48.37%
aarch64                  29.3172    12.1496        58.56%
power10                  18.0845     8.8893        50.85%
powerpc                  18.0859    8.79527        51.37%

reciprocal-throughput     master    patched   improvement
x86_64                   36.4369    13.3565        63.34%
x86_64v2                 37.3611    13.1149        64.90%
x86_64v3                 31.6024    11.2102        64.53%
aarch64                  18.6866     7.3474        60.68%
power10                   9.4758     3.6329        61.66%
powerpc                  9.58896    3.90439        59.28%

Signed-off-by: Alexei Sibidanov
Signed-off-by: Paul Zimmermann
Signed-off-by: Adhemerval Zanella
---
 SHARED-FILES                           |   4 +
 sysdeps/aarch64/libm-test-ulps         |   4 -
 sysdeps/alpha/fpu/libm-test-ulps       |   4 -
 sysdeps/arc/fpu/libm-test-ulps         |   4 -
 sysdeps/arc/nofpu/libm-test-ulps       |   1 -
 sysdeps/arm/libm-test-ulps             |   4 -
 sysdeps/csky/fpu/libm-test-ulps        |   4 -
 sysdeps/csky/nofpu/libm-test-ulps      |   4 -
 sysdeps/hppa/fpu/libm-test-ulps        |   4 -
 sysdeps/ieee754/flt-32/s_cbrtf.c       | 136 ++++++++++++++++---------
 sysdeps/loongarch/lp64/libm-test-ulps  |   4 -
 sysdeps/m68k/m680x0/fpu/libm-test-ulps |   4 -
 sysdeps/microblaze/libm-test-ulps      |   1 -
 sysdeps/mips/mips32/libm-test-ulps     |   4 -
 sysdeps/mips/mips64/libm-test-ulps     |   4 -
 sysdeps/nios2/libm-test-ulps           |   1 -
 sysdeps/or1k/fpu/libm-test-ulps        |   4 -
 sysdeps/or1k/nofpu/libm-test-ulps      |   4 -
 sysdeps/powerpc/fpu/libm-test-ulps     |   4 -
 sysdeps/powerpc/nofpu/libm-test-ulps   |   4 -
 sysdeps/riscv/nofpu/libm-test-ulps     |   4 -
 sysdeps/riscv/rvd/libm-test-ulps       |   4 -
 sysdeps/s390/fpu/libm-test-ulps        |   4 -
 sysdeps/sh/libm-test-ulps              |   2 -
 sysdeps/sparc/fpu/libm-test-ulps       |   4 -
 sysdeps/x86_64/fpu/libm-test-ulps      |   4 -
 26 files changed, 91 insertions(+), 134 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES
index 228f415dfd..d367f4b62f 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -268,3 +268,7 @@ sysdeps/ieee754/flt-32/s_log10p1f.c
   (file src/binary32/log10p1/log10p1f.c in CORE-MATH)
   - The code was adapted to use glibc code style and internal
     functions to handle errno, overflow, and underflow.
+sysdeps/ieee754/flt-32/s_cbrtf.c
+  (file src/binary32/cbrt/cbrtf.c in CORE-MATH)
+  - The code was adapted to use glibc code style and internal
+    functions to handle errno, overflow, and underflow.
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index c523d45802..4979769b58 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -474,7 +474,6 @@ ldouble: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_advsimd":
@@ -483,7 +482,6 @@ float: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_sve":
@@ -492,12 +490,10 @@ float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/alpha/fpu/libm-test-ulps b/sysdeps/alpha/fpu/libm-test-ulps
index 212c52c8cc..a2b5404f9d 100644
--- a/sysdeps/alpha/fpu/libm-test-ulps
+++ b/sysdeps/alpha/fpu/libm-test-ulps
@@ -417,22 +417,18 @@ ldouble: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps
index 7812a11b5b..c6f3646797 100644
--- a/sysdeps/arc/fpu/libm-test-ulps
+++ b/sysdeps/arc/fpu/libm-test-ulps
@@ -337,19 +337,15 @@ float: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 
 Function: Real part of "ccos":
 double: 3
diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps
index d0cfa46c3d..6319012db5 100644
--- a/sysdeps/arc/nofpu/libm-test-ulps
+++ b/sysdeps/arc/nofpu/libm-test-ulps
@@ -84,7 +84,6 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps
index 6cdd3d53d6..d9317046a9 100644
--- a/sysdeps/arm/libm-test-ulps
+++ b/sysdeps/arm/libm-test-ulps
@@ -333,19 +333,15 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/csky/fpu/libm-test-ulps b/sysdeps/csky/fpu/libm-test-ulps
index a7b2bec17e..c3a3db9bcb 100644
--- a/sysdeps/csky/fpu/libm-test-ulps
+++ b/sysdeps/csky/fpu/libm-test-ulps
@@ -330,19 +330,15 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/csky/nofpu/libm-test-ulps b/sysdeps/csky/nofpu/libm-test-ulps
index 4e4451a5d2..68a74bf1d0 100644
--- a/sysdeps/csky/nofpu/libm-test-ulps
+++ b/sysdeps/csky/nofpu/libm-test-ulps
@@ -328,19 +328,15 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps
index 021a2a482c..a54737db2e 100644
--- a/sysdeps/hppa/fpu/libm-test-ulps
+++ b/sysdeps/hppa/fpu/libm-test-ulps
@@ -338,20 +338,16 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/ieee754/flt-32/s_cbrtf.c b/sysdeps/ieee754/flt-32/s_cbrtf.c
index 68b8b0ec37..5a7a9a952d 100644
--- a/sysdeps/ieee754/flt-32/s_cbrtf.c
+++ b/sysdeps/ieee754/flt-32/s_cbrtf.c
@@ -1,61 +1,99 @@
-/* Compute cubic root of float value.
-   Copyright (C) 1997-2024 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
+/* Correctly-rounded cubic root of binary32 value.
 
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
+Copyright (c) 2023, 2024 Alexei Sibidanov.
 
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
+The original version of this file was copied from the CORE-MATH
+project (file src/binary32/cbrt/cbrtf.c, revision bc385c2).
 
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   .  */
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
 
-#include
-#include
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
 
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+*/
 
-#define CBRT2 1.2599210498948731648 /* 2^(1/3) */
-#define SQR_CBRT2 1.5874010519681994748 /* 2^(2/3) */
-
-static const double factor[5] =
-{
-  1.0 / SQR_CBRT2,
-  1.0 / CBRT2,
-  1.0,
-  CBRT2,
-  SQR_CBRT2
-};
-
+#include
+#include
+#include
+#include
+#include "math_config.h"
 
 float
 __cbrtf (float x)
 {
-  float xm, ym, u, t2;
-  int xe;
-
-  /* Reduce X.  XM now is an range 1.0 to 0.5.  */
-  xm = __frexpf (fabsf (x), &xe);
-
-  /* If X is not finite or is null return it (with raising exceptions
-     if necessary.
-     Note: *Our* version of `frexp' sets XE to zero if the argument is
-     Inf or NaN.  This is not portable but faster.  */
-  if (xe == 0 && fpclassify (x) <= FP_ZERO)
-    return x + x;
-
-  u = (0.492659620528969547 + (0.697570460207922770
-                               - 0.191502161678719066 * xm) * xm);
-
-  t2 = u * u * u;
-
-  ym = u * (t2 + 2.0 * xm) / (2.0 * t2 + xm) * factor[2 + xe % 3];
-
-  return __ldexpf (x > 0.0 ? ym : -ym, xe / 3);
+  static const union
+  {
+    double d;
+    uint64_t u;
+  } escale[3] =
+    {
+      { .d = 1.0 },
+      { .d = 0x1.428a2f98d728bp+0 }, /* 2^(1/3) */
+      { .d = 0x1.965fea53d6e3dp+0 }, /* 2^(2/3) */
+    };
+  uint32_t u = asuint (x);
+  uint32_t au = u << 1;
+  uint32_t sgn = u >> 31;
+  uint32_t e = au >> 24;
+  if (__glibc_unlikely (au < 1u << 24 || au >= 0xffu << 24))
+    {
+      if (au >= 0xffu << 24)
+        return x + x; /* inf, nan */
+      if (au == 0)
+        return x; /* +-0 */
+      int nz = __builtin_clz (au) - 7; /* subnormal */
+      au <<= nz;
+      e -= nz - 1;
+    }
+  uint32_t mant = au & 0xffffff;
+  e += 899;
+  uint32_t et = e / 3, it = e % 3;
+  uint64_t isc = escale[it].u;
+  isc += (int64_t) (et - 342) << 52;
+  isc |= (int64_t) sgn << 63;
+  double cvt2 = asdouble (isc);
+  static const double c[] =
+    {
+      0x1.2319d352ea5d5p-1, 0x1.67ad8ee258d1ap-1, -0x1.9342edf9cbad9p-2,
+      0x1.b6388fc510a75p-3, -0x1.6002455599e2fp-4, 0x1.7b096936192c4p-6,
+      -0x1.e5577187e8bf8p-9, 0x1.169ef81d6c34ep-12
+    };
+  double z = asdouble ((uint64_t) mant << 28 | UINT64_C(0x3ff) << 52);
+  double r0 = -0x1.9931c6c2d19d1p-6 / z;
+  double z2 = z * z;
+  double z4 = z2 * z2;
+  double f = ((c[0] + z * c[1]) + z2 * (c[2] + z * c[3]))
+             + z4 * ((c[4] + z * c[5]) + z2 * (c[6] + z * c[7])) + r0;
+  double r = f * cvt2;
+  float ub = r;
+  float lb = r - cvt2 * 1.4182e-9;
+  if (__glibc_likely (ub == lb))
+    return ub;
+  const double u0 = -0x1.ab16ec65d138fp+3;
+  double h = f * f * f - z;
+  f -= (f * r0 * u0) * h;
+  r = f * cvt2;
+  uint64_t cvt1 = asuint64 (r);
+  ub = r;
+  int64_t m0 = cvt1 << 19;
+  int64_t m1 = m0 >> 63;
+  if (__glibc_unlikely ((m0 ^ m1) < (UINT64_C(1) << 31)))
+    {
+      cvt1 = (cvt1 + (UINT64_C(1) << 31)) & UINT64_C(0xffffffff00000000);
+      ub = asdouble (cvt1);
+    }
+  return ub;
 }
 libm_alias_float (__cbrt, cbrt)
diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps
index ecd9cc5873..ba070f8224 100644
--- a/sysdeps/loongarch/lp64/libm-test-ulps
+++ b/sysdeps/loongarch/lp64/libm-test-ulps
@@ -417,22 +417,18 @@ ldouble: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/m68k/m680x0/fpu/libm-test-ulps b/sysdeps/m68k/m680x0/fpu/libm-test-ulps
index 3964b83b81..8456a59010 100644
--- a/sysdeps/m68k/m680x0/fpu/libm-test-ulps
+++ b/sysdeps/m68k/m680x0/fpu/libm-test-ulps
@@ -379,22 +379,18 @@ ldouble: 1
 
 Function: "cbrt":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 1
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/microblaze/libm-test-ulps b/sysdeps/microblaze/libm-test-ulps
index 328e31582b..c89096defd 100644
--- a/sysdeps/microblaze/libm-test-ulps
+++ b/sysdeps/microblaze/libm-test-ulps
@@ -81,7 +81,6 @@ float: 1
 
 Function: "cbrt":
 double: 3
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/mips/mips32/libm-test-ulps b/sysdeps/mips/mips32/libm-test-ulps
index c319e0642c..cef264d649 100644
--- a/sysdeps/mips/mips32/libm-test-ulps
+++ b/sysdeps/mips/mips32/libm-test-ulps
@@ -333,19 +333,15 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps
index 365b860c54..724249d3ad 100644
--- a/sysdeps/mips/mips64/libm-test-ulps
+++ b/sysdeps/mips/mips64/libm-test-ulps
@@ -417,22 +417,18 @@ ldouble: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/nios2/libm-test-ulps b/sysdeps/nios2/libm-test-ulps
index 5240767c0e..dbccba13cb 100644
--- a/sysdeps/nios2/libm-test-ulps
+++ b/sysdeps/nios2/libm-test-ulps
@@ -84,7 +84,6 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps
index 9ced4b0052..df2b69ac75 100644
--- a/sysdeps/or1k/fpu/libm-test-ulps
+++ b/sysdeps/or1k/fpu/libm-test-ulps
@@ -333,19 +333,15 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps
index c7ae0f002b..2263f3f0b7 100644
--- a/sysdeps/or1k/nofpu/libm-test-ulps
+++ b/sysdeps/or1k/nofpu/libm-test-ulps
@@ -333,19 +333,15 @@ float: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 8d0c18eed1..36fa54d97e 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -506,25 +506,21 @@ ldouble: 6
 
 Function: "cbrt":
 double: 4
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 float128: 1
 ldouble: 5
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 float128: 1
 ldouble: 3
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 float128: 2
 ldouble: 2
 
diff --git a/sysdeps/powerpc/nofpu/libm-test-ulps b/sysdeps/powerpc/nofpu/libm-test-ulps
index 20036c779c..c32c8017b4 100644
--- a/sysdeps/powerpc/nofpu/libm-test-ulps
+++ b/sysdeps/powerpc/nofpu/libm-test-ulps
@@ -421,22 +421,18 @@ ldouble: 6
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 5
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 2
 
 Function: Real part of "ccos":
diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps
index cccc864a7a..79927c2bd9 100644
--- a/sysdeps/riscv/nofpu/libm-test-ulps
+++ b/sysdeps/riscv/nofpu/libm-test-ulps
@@ -417,22 +417,18 @@ ldouble: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps
index 14fc7633af..fbd5b8fed7 100644
--- a/sysdeps/riscv/rvd/libm-test-ulps
+++ b/sysdeps/riscv/rvd/libm-test-ulps
@@ -417,22 +417,18 @@ ldouble: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
index a25bb505b3..ade5a39db4 100644
--- a/sysdeps/s390/fpu/libm-test-ulps
+++ b/sysdeps/s390/fpu/libm-test-ulps
@@ -417,22 +417,18 @@ ldouble: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/sh/libm-test-ulps b/sysdeps/sh/libm-test-ulps
index 8562796de8..b0040d7218 100644
--- a/sysdeps/sh/libm-test-ulps
+++ b/sysdeps/sh/libm-test-ulps
@@ -164,11 +164,9 @@ float: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 
 Function: Real part of "ccos":
 double: 1
diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
index 6ea02058e9..d78b46b97b 100644
--- a/sysdeps/sparc/fpu/libm-test-ulps
+++ b/sysdeps/sparc/fpu/libm-test-ulps
@@ -417,22 +417,18 @@ ldouble: 2
 
 Function: "cbrt":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 ldouble: 1
 
 Function: Real part of "ccos":
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index e3c811549c..327937929d 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -638,25 +638,21 @@ ldouble: 1
 
 Function: "cbrt":
 double: 4
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "cbrt_downward":
 double: 4
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "cbrt_towardzero":
 double: 3
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "cbrt_upward":
 double: 5
-float: 1
 float128: 1
 ldouble: 1
 
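Reviewer note on the exponent handling in the new `__cbrtf`: the constants 899 and 342 appear to be related by 899 = 3 * 342 - 127, so that `e + 899` is the unbiased exponent plus a multiple of 3 large enough to keep the dividend non-negative. The sketch below illustrates that reading. The helper names `split_exponent` and `cbrt_scale`, and the `memcpy`-based bit casts standing in for glibc's internal `asuint64`/`asdouble`, are assumptions made for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins (assumed) for glibc's internal asdouble/asuint64.  */
static double
bits_to_double (uint64_t u)
{
  double d;
  memcpy (&d, &u, sizeof d);
  return d;
}

static uint64_t
double_to_bits (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}

/* Split an unbiased binary32 exponent E (-149..127) into Q and R with
   E == 3 * Q + R and 0 <= R < 3.  Adding 3 * 342 first keeps the
   dividend non-negative, so C's truncating / and % act like floor.  */
static void
split_exponent (int e_unbiased, int *q, int *r)
{
  assert (e_unbiased >= -149 && e_unbiased <= 127);
  unsigned int shifted = (unsigned int) (e_unbiased + 3 * 342);
  *q = (int) (shifted / 3) - 342;
  *r = (int) (shifted % 3);
}

/* Return 2^(E/3) as a double: take 2^(R/3) from a three-entry table and
   add Q directly to the exponent field of its bit pattern.  */
static double
cbrt_scale (int e_unbiased)
{
  static const double escale[3] =
    {
      1.0,
      0x1.428a2f98d728bp+0, /* 2^(1/3) */
      0x1.965fea53d6e3dp+0, /* 2^(2/3) */
    };
  int q, r;
  split_exponent (e_unbiased, &q, &r);
  uint64_t bits = double_to_bits (escale[r]);
  bits += (uint64_t) (int64_t) q << 52; /* wraps correctly for q < 0 */
  return bits_to_double (bits);
}

#ifdef CBRT_SCALE_DEMO
int
main (void)
{
  int q, r;
  split_exponent (-5, &q, &r);
  printf ("E=-5 -> q=%d r=%d scale=%a\n", q, r, cbrt_scale (-5));
  return 0;
}
#endif
```

With this split, the cube root of `z * 2^E` for `z` in [1, 2) reduces to `cbrt (z) * cbrt_scale (E)`, so the polynomial only has to cover one binade.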