From patchwork Fri Dec 11 12:11:48 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Siddhesh Poyarekar X-Patchwork-Id: 58276 Delivered-To: patch@linaro.org Received: by 10.112.157.166 with SMTP id wn6csp90522lbb; Fri, 11 Dec 2015 04:13:12 -0800 (PST) X-Received: by 10.66.255.10 with SMTP id am10mr24518232pad.79.1449835992002; Fri, 11 Dec 2015 04:13:12 -0800 (PST) Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id f18si959070pfj.115.2015.12.11.04.13.11 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 11 Dec 2015 04:13:11 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-return-65498-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; spf=pass (google.com: domain of libc-alpha-return-65498-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom=libc-alpha-return-65498-patch=linaro.org@sourceware.org; dkim=pass header.i=@sourceware.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; q=dns; s=default; b=lbrC7mBqw4pEftckr8KbBic58BVssbx +9T2+14+WEJpeeXMu1KmVwslvE6Ht/bKQLxiicozO5ZMov3mLK+hzTj3F9l61Bs2 Z4HnVaHXvw0Zky/OKkwGRjbKWL+LqwQwYP5qPCtWoL/Sk5FU2hztYr7Fpg6bkdd4 BUdJp8epbs2I= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to :references; s=default; bh=QaqTAKdwgrGbhJryvp/E2/ziQko=; b=G6lK+ uCHXiTGgYrM9ryNFFoglCa7JU9X39DHv+LKlsMFndu1hFZ/wrGCjZzDVWObF9tMS UQ4o4Ue62jul82sVBArF1ayAEuhy6yxSOYVccqOrU6fFdVNMjwkM8oaM58Dg5x2P y7HrtyN/odyNtLgIya6Fh5qpbzW2F2Z5gMf224= Received: (qmail 57538 invoked by alias); 11 Dec 2015 12:12:43 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 57115 invoked by uid 89); 11 Dec 2015 12:12:42 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=AWL, BAYES_00, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=no version=3.3.2 X-HELO: mail-pa0-f53.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ntBGoJ7AP0C0csMNnPvAwkT/gbTRrPIyHBGJvtb6Kzo=; b=inrKdqmGqDULfs4RmTRvVMw9/yu0ttLFGpheaNcfNJTw/pDCAcpBWKWw3DZ9uudC6b HpBN2FWAMqYSmAGc1G1GfapapGayV87lrmKVwKlwGDWifTwgiDgh8WBDjvk+nyaneJcw W07fyV25m5p2Z4+nr91F6BOzzQ9lG3uBZANnD1Ejsbpi4FBvWq8VEmUb6lI9DJTP2ghG 7xfctL6r8w0PXMvEOLSevD9U2FTY3cgT3Iyh6XxvX9EmFrEqLrsKkB6Xx8AOZM9+KZer ACBYdyI9OirYaKeki2O1n7i3Whp810K52tbBM2ej5KA4CbHZELLQ8CiAtxCWn8INb2HX 1euQ== X-Gm-Message-State: ALoCoQlEg/dvgMPUAxv8fqrEFj3YEF1IvPZHukeu2IsC96YC0igwC8aHJkKttxkXXkhY+txmBTLmyGJptM6pYE7p43uOAGG23w== X-Received: by 10.66.119.136 with SMTP id ku8mr24752745pab.128.1449835946312; Fri, 11 Dec 2015 04:12:26 -0800 (PST) From: Siddhesh Poyarekar To: libc-alpha@sourceware.org Cc: Siddhesh Poyarekar Subject: [PATCH 1/3] Consolidate range reduction in sincos for x > 281474976710656 Date: Fri, 11 Dec 2015 17:41:48 +0530 Message-Id: <1449835910-4651-2-git-send-email-siddhesh.poyarekar@linaro.org> In-Reply-To: <1449835910-4651-1-git-send-email-siddhesh.poyarekar@linaro.org> References: <1449835910-4651-1-git-send-email-siddhesh.poyarekar@linaro.org> Range reduction needs to be done only once for sin and cos, so copy over all of the relevant functions (__sin, __cos, reduce_and_compute) and consolidate common code. Siddhesh * sysdeps/ieee754/dbl-64/s_sincos.c (__sin_local, __cos_local): Copy from s_sin.c. (reduce_and_compute): Consolidate for __sin_local as well as __cos_local. * sysdeps/ieee754/dbl-64/s_sin.c: Avoid including __sin and __cos when included in s_sincos.c --- sysdeps/ieee754/dbl-64/s_sin.c | 26 +- sysdeps/ieee754/dbl-64/s_sincos.c | 492 +++++++++++++++++++++++++++++++++++++- 2 files changed, 497 insertions(+), 21 deletions(-) -- 2.5.0 diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c index a635a86..a8c5d4a 100644 --- a/sysdeps/ieee754/dbl-64/s_sin.csysdeps/ieee754/dbl-64/s_sin.c +++ b/sysdeps/ieee754/dbl-64/s_sin.c @@ -280,12 +280,9 @@ reduce_and_compute (double x, unsigned int k) /* An ultimate sin routine. Given an IEEE double machine number x */ /* it computes the correctly rounded (to nearest) value of sin(x) */ /*******************************************************************/ -#ifdef IN_SINCOS -static double -#else +#ifndef IN_SINCOS double SECTION -#endif __sin (double x) { double xx, res, t, cor, y, s, c, sn, ssn, cs, ccs, xn, a, da, db, eps, xn1, @@ -294,9 +291,7 @@ __sin (double x) int4 k, m, n; double retval = 0; -#ifndef IN_SINCOS SET_RESTORE_ROUND_53BIT (FE_TONEAREST); -#endif u.x = x; m = u.i[HIGH_HALF]; @@ -518,12 +513,8 @@ __sin (double x) /* it computes the correctly rounded (to nearest) value of cos(x) */ /*******************************************************************/ -#ifdef IN_SINCOS -static double -#else double SECTION -#endif __cos (double x) { double y, xx, res, t, cor, xn, a, da, db, eps, xn1, @@ -533,9 +524,7 @@ __cos (double x) double retval = 0; -#ifndef IN_SINCOS SET_RESTORE_ROUND_53BIT (FE_TONEAREST); -#endif u.x = x; m = u.i[HIGH_HALF]; @@ -742,6 +731,7 @@ __cos (double x) return retval; } +#endif /************************************************************************/ /* Routine compute sin(x) for 2^-26 < |x|< 0.25 by Taylor with more */ @@ -1217,17 +1207,19 @@ csloww2 (double x, double dx, double orig, int n) return __mpcos (orig, 0, true); } -#ifndef __cos +#ifndef IN_SINCOS +# ifndef __cos weak_alias (__cos, cos) -# ifdef NO_LONG_DOUBLE +# ifdef NO_LONG_DOUBLE strong_alias (__cos, __cosl) weak_alias (__cos, cosl) +# endif # endif -#endif -#ifndef __sin +# ifndef __sin weak_alias (__sin, sin) -# ifdef NO_LONG_DOUBLE +# ifdef NO_LONG_DOUBLE strong_alias (__sin, __sinl) weak_alias (__sin, sinl) +# endif # endif #endif diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c index 2a3fc06..078d65d 100644 --- a/sysdeps/ieee754/dbl-64/s_sincos.c +++ b/sysdeps/ieee754/dbl-64/s_sincos.c @@ -22,18 +22,502 @@ #include -#define __sin __sin_local -#define __cos __cos_local #define IN_SINCOS 1 #include "s_sin.c" + +/* Compute sin (X). K is passed to avoid computing it again. Make sure that + this is in sync with __sin in s_sin.c for the range of X it computes values + for. */ +static double +__always_inline +__sin_local (double x, int4 k) +{ + double xx, res, t, cor, y, s, c, sn, ssn, cs, ccs, xn, a, da, db, eps, xn1, + xn2; + mynumber u, v; + int4 m, n; + double retval = 0; + + if (k < 0x3e500000) /* if x->0 =>sin(x)=x */ + { + math_check_force_underflow (x); + retval = x; + } + /*---------------------------- 2^-26 < |x|< 0.25 ----------------------*/ + else if (k < 0x3fd00000) + { + xx = x * x; + /* Taylor series. */ + t = POLYNOMIAL (xx) * (xx * x); + res = x + t; + cor = (x - res) + t; + retval = (res == res + 1.07 * cor) ? res : slow (x); + } /* else if (k < 0x3fd00000) */ +/*---------------------------- 0.25<|x|< 0.855469---------------------- */ + else if (k < 0x3feb6000) + { + u.x = (x > 0) ? big + x : big - x; + y = (x > 0) ? x - (u.x - big) : x + (u.x - big); + xx = y * y; + s = y + y * xx * (sn3 + xx * sn5); + c = xx * (cs2 + xx * (cs4 + xx * cs6)); + SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); + if (x <= 0) + { + sn = -sn; + ssn = -ssn; + } + cor = (ssn + s * ccs - sn * c) + cs * s; + res = sn + cor; + cor = (sn - res) + cor; + retval = (res == res + 1.096 * cor) ? res : slow1 (x); + } /* else if (k < 0x3feb6000) */ + +/*----------------------- 0.855469 <|x|<2.426265 ----------------------*/ + else if (k < 0x400368fd) + { + + y = (x > 0) ? hp0 - x : hp0 + x; + if (y >= 0) + { + u.x = big + y; + y = (y - (u.x - big)) + hp1; + } + else + { + u.x = big - y; + y = (-hp1) - (y + (u.x - big)); + } + res = do_cos (u, y, &cor); + retval = (res == res + 1.020 * cor) ? ((x > 0) ? res : -res) : slow2 (x); + } /* else if (k < 0x400368fd) */ + +/*-------------------------- 2.426265<|x|< 105414350 ----------------------*/ + else if (k < 0x419921FB) + { + t = (x * hpinv + toint); + xn = t - toint; + v.x = t; + y = (x - xn * mp1) - xn * mp2; + n = v.i[LOW_HALF] & 3; + da = xn * mp3; + a = y - da; + da = (y - a) - da; + eps = fabs (x) * 1.2e-30; + + switch (n) + { /* quarter of unit circle */ + case 0: + case 2: + xx = a * a; + if (n) + { + a = -a; + da = -da; + } + if (xx < 0.01588) + { + /* Taylor series. */ + res = TAYLOR_SIN (xx, a, da, cor); + cor = (cor > 0) ? 1.02 * cor + eps : 1.02 * cor - eps; + retval = (res == res + cor) ? res : sloww (a, da, x); + } + else + { + if (a > 0) + m = 1; + else + { + m = 0; + a = -a; + da = -da; + } + u.x = big + a; + y = a - (u.x - big); + res = do_sin (u, y, da, &cor); + cor = (cor > 0) ? 1.035 * cor + eps : 1.035 * cor - eps; + retval = ((res == res + cor) ? ((m) ? res : -res) + : sloww1 (a, da, x, m)); + } + break; + + case 1: + case 3: + if (a < 0) + { + a = -a; + da = -da; + } + u.x = big + a; + y = a - (u.x - big) + da; + res = do_cos (u, y, &cor); + cor = (cor > 0) ? 1.025 * cor + eps : 1.025 * cor - eps; + retval = ((res == res + cor) ? ((n & 2) ? -res : res) + : sloww2 (a, da, x, n)); + break; + } + } /* else if (k < 0x419921FB ) */ + +/*---------------------105414350 <|x|< 281474976710656 --------------------*/ + else + { + t = (x * hpinv + toint); + xn = t - toint; + v.x = t; + xn1 = (xn + 8.0e22) - 8.0e22; + xn2 = xn - xn1; + y = ((((x - xn1 * mp1) - xn1 * mp2) - xn2 * mp1) - xn2 * mp2); + n = v.i[LOW_HALF] & 3; + da = xn1 * pp3; + t = y - da; + da = (y - t) - da; + da = (da - xn2 * pp3) - xn * pp4; + a = t + da; + da = (t - a) + da; + eps = 1.0e-24; + + switch (n) + { + case 0: + case 2: + xx = a * a; + if (n) + { + a = -a; + da = -da; + } + if (xx < 0.01588) + { + /* Taylor series. */ + res = TAYLOR_SIN (xx, a, da, cor); + cor = (cor > 0) ? 1.02 * cor + eps : 1.02 * cor - eps; + retval = (res == res + cor) ? res : bsloww (a, da, x, n); + } + else + { + double t; + if (a > 0) + { + m = 1; + t = a; + db = da; + } + else + { + m = 0; + t = -a; + db = -da; + } + u.x = big + t; + y = t - (u.x - big); + res = do_sin (u, y, db, &cor); + cor = (cor > 0) ? 1.035 * cor + eps : 1.035 * cor - eps; + retval = ((res == res + cor) ? ((m) ? res : -res) + : bsloww1 (a, da, x, n)); + } + break; + + case 1: + case 3: + if (a < 0) + { + a = -a; + da = -da; + } + u.x = big + a; + y = a - (u.x - big) + da; + res = do_cos (u, y, &cor); + cor = (cor > 0) ? 1.025 * cor + eps : 1.025 * cor - eps; + retval = ((res == res + cor) ? ((n & 2) ? -res : res) + : bsloww2 (a, da, x, n)); + break; + } + } /* else if (k < 0x42F00000 ) */ + + return retval; +} + +/* Compute cos (X). K is passed to avoid computing it again. Make sure that + this is in sync with __cos in s_sin.c for the range of X it computes values + for. */ +static double +__always_inline +__cos_local (double x, int4 k) +{ + double y, xx, res, t, cor, xn, a, da, db, eps, xn1, + xn2; + mynumber u, v; + int4 m, n; + + double retval = 0; + + /* |x|<2^-27 => cos(x)=1 */ + if (k < 0x3e400000) + retval = 1.0; + + else if (k < 0x3feb6000) + { /* 2^-27 < |x| < 0.855469 */ + y = fabs (x); + u.x = big + y; + y = y - (u.x - big); + res = do_cos (u, y, &cor); + retval = (res == res + 1.020 * cor) ? res : cslow2 (x); + } /* else if (k < 0x3feb6000) */ + + else if (k < 0x400368fd) + { /* 0.855469 <|x|<2.426265 */ ; + y = hp0 - fabs (x); + a = y + hp1; + da = (y - a) + hp1; + xx = a * a; + if (xx < 0.01588) + { + res = TAYLOR_SIN (xx, a, da, cor); + cor = (cor > 0) ? 1.02 * cor + 1.0e-31 : 1.02 * cor - 1.0e-31; + retval = (res == res + cor) ? res : csloww (a, da, x); + } + else + { + if (a > 0) + { + m = 1; + } + else + { + m = 0; + a = -a; + da = -da; + } + u.x = big + a; + y = a - (u.x - big); + res = do_sin (u, y, da, &cor); + cor = (cor > 0) ? 1.035 * cor + 1.0e-31 : 1.035 * cor - 1.0e-31; + retval = ((res == res + cor) ? ((m) ? res : -res) + : csloww1 (a, da, x, m)); + } + + } /* else if (k < 0x400368fd) */ + + + else if (k < 0x419921FB) + { /* 2.426265<|x|< 105414350 */ + t = (x * hpinv + toint); + xn = t - toint; + v.x = t; + y = (x - xn * mp1) - xn * mp2; + n = v.i[LOW_HALF] & 3; + da = xn * mp3; + a = y - da; + da = (y - a) - da; + eps = fabs (x) * 1.2e-30; + + switch (n) + { + case 1: + case 3: + xx = a * a; + if (n == 1) + { + a = -a; + da = -da; + } + if (xx < 0.01588) + { + res = TAYLOR_SIN (xx, a, da, cor); + cor = (cor > 0) ? 1.02 * cor + eps : 1.02 * cor - eps; + retval = (res == res + cor) ? res : csloww (a, da, x); + } + else + { + if (a > 0) + { + m = 1; + } + else + { + m = 0; + a = -a; + da = -da; + } + u.x = big + a; + y = a - (u.x - big); + res = do_sin (u, y, da, &cor); + cor = (cor > 0) ? 1.035 * cor + eps : 1.035 * cor - eps; + retval = ((res == res + cor) ? ((m) ? res : -res) + : csloww1 (a, da, x, m)); + } + break; + + case 0: + case 2: + if (a < 0) + { + a = -a; + da = -da; + } + u.x = big + a; + y = a - (u.x - big) + da; + res = do_cos (u, y, &cor); + cor = (cor > 0) ? 1.025 * cor + eps : 1.025 * cor - eps; + retval = ((res == res + cor) ? ((n) ? -res : res) + : csloww2 (a, da, x, n)); + break; + } + } /* else if (k < 0x419921FB ) */ + + else + { + t = (x * hpinv + toint); + xn = t - toint; + v.x = t; + xn1 = (xn + 8.0e22) - 8.0e22; + xn2 = xn - xn1; + y = ((((x - xn1 * mp1) - xn1 * mp2) - xn2 * mp1) - xn2 * mp2); + n = v.i[LOW_HALF] & 3; + da = xn1 * pp3; + t = y - da; + da = (y - t) - da; + da = (da - xn2 * pp3) - xn * pp4; + a = t + da; + da = (t - a) + da; + eps = 1.0e-24; + + switch (n) + { + case 1: + case 3: + xx = a * a; + if (n == 1) + { + a = -a; + da = -da; + } + if (xx < 0.01588) + { + res = TAYLOR_SIN (xx, a, da, cor); + cor = (cor > 0) ? 1.02 * cor + eps : 1.02 * cor - eps; + retval = (res == res + cor) ? res : bsloww (a, da, x, n); + } + else + { + double t; + if (a > 0) + { + m = 1; + t = a; + db = da; + } + else + { + m = 0; + t = -a; + db = -da; + } + u.x = big + t; + y = t - (u.x - big); + res = do_sin (u, y, db, &cor); + cor = (cor > 0) ? 1.035 * cor + eps : 1.035 * cor - eps; + retval = ((res == res + cor) ? ((m) ? res : -res) + : bsloww1 (a, da, x, n)); + } + break; + + case 0: + case 2: + if (a < 0) + { + a = -a; + da = -da; + } + u.x = big + a; + y = a - (u.x - big) + da; + res = do_cos (u, y, &cor); + cor = (cor > 0) ? 1.025 * cor + eps : 1.025 * cor - eps; + retval = ((res == res + cor) ? ((n) ? -res : res) + : bsloww2 (a, da, x, n)); + break; + } + } /* else if (k < 0x42F00000 ) */ + + return retval; +} + +/* Consolidated version of reduce_and_compute in s_sin.c that does range + reduction only once and computes sin and cos together. */ +static inline void +__always_inline +reduce_and_compute_sincos (double x, double *sinxp, double *cosxp) +{ + double a, da, sinx, cosx; + unsigned int n = __branred (x, &a, &da); + n = n % 4; + switch (n) + { + case 0: + if (a * a < 0.01588) + sinx = bsloww (a, da, x, n); + else + sinx = bsloww1 (a, da, x, n); + cosx = bsloww2 (a, da, x, n); + break; + + case 1: + if (a * a < 0.01588) + cosx = bsloww (-a, -da, x, n); + else + cosx = bsloww1 (-a, -da, x, n); + sinx = bsloww2 (a, da, x, n); + break; + + case 2: + if (a * a < 0.01588) + sinx = bsloww (-a, -da, x, n); + else + sinx = bsloww1 (-a, -da, x, n); + cosx = bsloww2 (a, da, x, n); + break; + + case 3: + if (a * a < 0.01588) + cosx = bsloww (a, da, x, n); + else + cosx = bsloww1 (a, da, x, n); + sinx = bsloww2 (a, da, x, n); + break; + } + + *sinxp = sinx; + *cosxp = cosx; +} + void __sincos (double x, double *sinx, double *cosx) { + mynumber u; + int k; + SET_RESTORE_ROUND_53BIT (FE_TONEAREST); - *sinx = __sin (x); - *cosx = __cos (x); + u.x = x; + k = 0x7fffffff & u.i[HIGH_HALF]; + + if (k < 0x42F00000) + { + *sinx = __sin_local (x, k); + *cosx = __cos_local (x, k); + return; + } + if (k < 0x7ff00000) + { + reduce_and_compute_sincos (x, sinx, cosx); + return; + } + + if (isinf (x)) + __set_errno (EDOM); + + *sinx = *cosx = x / x; } weak_alias (__sincos, sincos) #ifdef NO_LONG_DOUBLE