From patchwork Fri Jan 31 19:17:04 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 860992
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: DJ Delorie, Joseph Myers, Paul Zimmermann, Alexei Sibidanov
Subject: [PATCH 00/15] Add c23 CORE-MATH binary32 implementations to libm
Date: Fri, 31 Jan 2025 16:17:04 -0300
Message-ID: <20250131191844.2582716-1-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
List-Id: Libc-alpha mailing list

This patchset adds the optimized and correctly rounded acospif, asinpif,
atan2pif, atanpif, cospif, sinpif, and tanpif from CORE-MATH [1].

Each implementation has a benchmark to evaluate the performance
improvements. All implementations show performance improvements in all
but one case: asinpif on x86_64/x86_64-v2.
This is due to the use of an FMA operation in the fast path. Only
x86_64-v3 provides it without a function call, and this is mitigated
with an ifunc variant for x86_64-v3.

[1] https://gitlab.inria.fr/core-math/core-math

Adhemerval Zanella (15):
  benchtests: Add acospif
  benchtests: Add asinpif
  benchtests: Add atan2pif
  benchtests: Add atanpif
  benchtests: Add cospif
  benchtests: Add sinpif
  benchtests: Add tanpif
  math: Use acospif from CORE-MATH
  math: Use asinpif from CORE-MATH
  math: Use atan2pif from CORE-MATH
  math: Use atanpif from CORE-MATH
  math: Use cospif from CORE-MATH
  math: Use sinpif from CORE-MATH
  math: Use tanpif from CORE-MATH
  x86_64: Add asinpif with FMA

 SHARED-FILES                                 |   28 +
 benchtests/Makefile                          |    7 +
 benchtests/acospif-inputs                    | 2710 +++++++++++++++++
 benchtests/asinpif-inputs                    | 2710 +++++++++++++++++
 benchtests/atan2pif-inputs                   | 2005 ++++++++++++
 benchtests/atanpif-inputs                    | 2005 ++++++++++++
 benchtests/cospif-inputs                     | 2409 +++++++++++++++
 benchtests/sinpif-inputs                     | 2409 +++++++++++++++
 benchtests/tanpif-inputs                     | 2409 +++++++++++++++
 sysdeps/aarch64/libm-test-ulps               |   28 -
 sysdeps/arc/fpu/libm-test-ulps               |   28 -
 sysdeps/arc/nofpu/libm-test-ulps             |    7 -
 sysdeps/arm/libm-test-ulps                   |   28 -
 sysdeps/hppa/fpu/libm-test-ulps              |   28 -
 sysdeps/i386/fpu/libm-test-ulps              |   28 -
 .../i386/i686/fpu/multiarch/libm-test-ulps   |   28 -
 sysdeps/ieee754/flt-32/math_config.h         |   25 +
 sysdeps/ieee754/flt-32/s_acospif.c           |  137 +
 sysdeps/ieee754/flt-32/s_asinpif.c           |  138 +
 sysdeps/ieee754/flt-32/s_atan2pif.c          |  238 ++
 sysdeps/ieee754/flt-32/s_atanpif.c           |  109 +
 sysdeps/ieee754/flt-32/s_cospif.c            |  136 +
 sysdeps/ieee754/flt-32/s_sinpif.c            |  134 +
 sysdeps/ieee754/flt-32/s_tanpif.c            |   88 +
 sysdeps/loongarch/lp64/libm-test-ulps        |   28 -
 sysdeps/mips/mips64/libm-test-ulps           |   28 -
 sysdeps/or1k/fpu/libm-test-ulps              |   28 -
 sysdeps/or1k/nofpu/libm-test-ulps            |    7 -
 sysdeps/powerpc/fpu/libm-test-ulps           |   28 -
 sysdeps/powerpc/fpu/math_private.h           |    1 +
 sysdeps/riscv/nofpu/libm-test-ulps           |    7 -
 sysdeps/riscv/rvd/libm-test-ulps             |   28 -
 sysdeps/s390/fpu/libm-test-ulps              |   28 -
 sysdeps/sparc/fpu/libm-test-ulps             |   28 -
 sysdeps/x86_64/fpu/libm-test-ulps            |   28 -
 sysdeps/x86_64/fpu/multiarch/Makefile        |    2 +
 sysdeps/x86_64/fpu/multiarch/s_asinpif-fma.c |    4 +
 sysdeps/x86_64/fpu/multiarch/s_asinpif.c     |   33 +
 38 files changed, 17737 insertions(+), 413 deletions(-)
 create mode 100644 benchtests/acospif-inputs
 create mode 100644 benchtests/asinpif-inputs
 create mode 100644 benchtests/atan2pif-inputs
 create mode 100644 benchtests/atanpif-inputs
 create mode 100644 benchtests/cospif-inputs
 create mode 100644 benchtests/sinpif-inputs
 create mode 100644 benchtests/tanpif-inputs
 create mode 100644 sysdeps/ieee754/flt-32/s_acospif.c
 create mode 100644 sysdeps/ieee754/flt-32/s_asinpif.c
 create mode 100644 sysdeps/ieee754/flt-32/s_atan2pif.c
 create mode 100644 sysdeps/ieee754/flt-32/s_atanpif.c
 create mode 100644 sysdeps/ieee754/flt-32/s_cospif.c
 create mode 100644 sysdeps/ieee754/flt-32/s_sinpif.c
 create mode 100644 sysdeps/ieee754/flt-32/s_tanpif.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/s_asinpif-fma.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/s_asinpif.c