From patchwork Fri Nov 11 17:26:31 2016
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 81878
From: Tamar Christina
To: GCC Patches <gcc-patches@gcc.gnu.org>, Wilco Dijkstra, rguenther@suse.de, law@redhat.com, Michael Meissner
CC: nd
Subject: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
Date: Fri, 11 Nov 2016 17:26:31 +0000

Hi All,

This is v3 of the patch, which adds an optimized route to the fpclassify builtin
for floating-point numbers that are similar to IEEE-754 in format. The patch has
been rewritten to do the classification in GIMPLE instead of as a fold. As part
of the implementation, optimized versions of is_normal, is_subnormal, is_nan,
is_infinite and is_zero have been created. This patch also introduces two new
intrinsics: __builtin_iszero and __builtin_issubnormal.

NOTE: the old code for ISNORMAL, ISSUBNORMAL, ISNAN and ISINFINITE had a special
case for ibm_extended_format which dropped the second part of the number (the
value is represented internally as a pair of doubles); fpclassify had no such
case. I have dropped that special case, as I am under the impression that the
format is deprecated, which would make the optimization less important. If this
is wrong, it would be easy to add it back in.

Some open questions: should ISFINITE be changed as well? Should it be SUBNORMAL
or DENORMAL? And what should I do about documentation? I'm not sure how to
document a new builtin.

The goal is to make fpclassify faster by:

1. Trying to determine the most common case first (e.g. that the float is a
   normal number) and only then the rest. The amount of code generated at -O2
   is about the same, +/- 1 instruction, but the code is much better.

2. Using integer operations in the optimized path.

At a high level, the optimized path uses integer operations to perform the
following checks in the given order:

- normal
- zero
- nan
- infinite
- subnormal

The checks are ordered by how frequently each class of value occurs in
practice. When the optimization cannot be applied, a fallback method is used
which is similar to the existing implementation using FP instructions. The
fallback now also performs its checks in the same order as above, so there
should be some slight benefit there as well.

A limitation of this new approach is that the exponent of the floating-point
type has to fit in 32 bits and the type has to have an IEEE-like format and
IEEE-like values for NaN and INF (e.g. for NaN and INF all bits of the exponent
must be set). To record this IEEE likeness, a new boolean field was added to
real_format.
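To make the integer path concrete, here is a minimal hand-written C sketch of
the checks described above, hard-coded for IEEE binary64. The patch itself
emits the equivalent GIMPLE, parameterized by the target's real_format rather
than by these constants; fpclassify_int is an illustrative name only:

static int
fpclassify_int (double d)
{
  /* Needs <math.h>, <stdint.h> and <string.h>.  */
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);   /* Reinterpret, like VIEW_CONVERT_EXPR.  */

  const uint64_t exp_mask = 0x7ff;   /* binary64: 11 exponent bits.  */
  uint64_t exp = (bits >> 52) & exp_mask;

  /* Most common case first: biased exponent neither all-0 nor all-1.  */
  if (((exp + 1) & (exp_mask & ~(uint64_t) 1)) != 0)
    return FP_NORMAL;

  uint64_t shifted = bits << 1;      /* Drop the sign bit.  */
  if (shifted == 0)
    return FP_ZERO;
  if (shifted > (exp_mask << 53))    /* Exponent all 1s, mantissa != 0.  */
    return FP_NAN;
  if (shifted == (exp_mask << 53))   /* Exponent all 1s, mantissa == 0.  */
    return FP_INFINITE;
  return FP_SUBNORMAL;               /* Exponent 0, mantissa != 0.  */
}

This mirrors the AArch64 sequence below: tst w2, 2046 is the
(exp + 1) & (exp_mask & ~1) normality test, and lsl x1, x1, 1 drops the sign
bit for the remaining checks.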
As an example, AArch64 now generates this for classification of doubles:

f:
	fmov	x1, d0
	mov	w0, 7
	ubfx	x2, x1, 52, 11
	add	w2, w2, 1
	tst	w2, 2046
	bne	.L1
	lsl	x1, x1, 1
	mov	w0, 13
	cbz	x1, .L1
	mov	x2, -9007199254740992
	cmp	x1, x2
	mov	w0, 5
	mov	w3, 11
	csel	w0, w0, w3, eq
	mov	w1, 3
	csel	w0, w0, w1, ls
.L1:
	ret

and this for the floating-point fallback version:

f:
	adrp	x2, .LC0
	fabs	d1, d0
	adrp	x1, .LC1
	mov	w0, 7
	ldr	d3, [x2, #:lo12:.LC0]
	ldr	d2, [x1, #:lo12:.LC1]
	fcmpe	d1, d3
	fccmpe	d1, d2, 2, ge
	bls	.L1
	fcmp	d0, #0.0
	mov	w0, 13
	beq	.L1
	fcmp	d1, d1
	bvs	.L5
	fcmpe	d1, d2
	mov	w0, 5
	mov	w1, 11
	csel	w0, w0, w1, gt
.L1:
	ret
.L5:
	mov	w0, 3
	ret

One new test checks that the integer version does not generate FP code;
correctness is tested using the existing test code for fpclassify.

Glibc benchmarks were run against the builtin. The integer code shows the
following performance gains on AArch64:

* zero: 0%
* inf/nan: 29%
* normal: 69.1%

and on x86_64:

* zero: 0%
* inf/nan: 89.9%
* normal: 4.7%

Regression tested on aarch64-none-linux and arm-none-linux-gnueabi with no
regressions. x86_64 bootstrapped successfully as well.

Ok for trunk?

Thanks,
Tamar

gcc/
2016-11-11  Tamar Christina

	* gcc/builtins.c (fold_builtin_fpclassify): Removed.
	(expand_builtin): Added builtins to lowering list.
	(fold_builtin_n): Removed fold_builtin_varargs.
	(fold_builtin_varargs): Removed.
	* gcc/builtins.def (BUILT_IN_ISZERO, BUILT_IN_ISSUBNORMAL): Added.
	(fold_builtin_interclass_mathfn): Use get_min_float instead.
	* gcc/real.h (get_min_float): Added.
	* gcc/real.c (get_min_float): Added.
	* gcc/gimple-low.c (lower_stmt): Define BUILT_IN_FPCLASSIFY,
	CASE_FLT_FN (BUILT_IN_ISINF), BUILT_IN_ISINFD32, BUILT_IN_ISINFD64,
	BUILT_IN_ISINFD128, BUILT_IN_ISNAND32, BUILT_IN_ISNAND64,
	BUILT_IN_ISNAND128, BUILT_IN_ISNAN, BUILT_IN_ISNORMAL,
	BUILT_IN_ISZERO, BUILT_IN_ISSUBNORMAL.
	(lower_builtin_fpclassify, is_nan, is_normal, is_infinity): Added.
	(is_zero, is_subnormal, use_ieee_int_mode): Likewise.
	(lower_builtin_isnan, lower_builtin_isinfinite): Likewise.
	(lower_builtin_isnormal, lower_builtin_iszero): Likewise.
	(lower_builtin_issubnormal): Likewise.
	(emit_tree_cond, get_num_as_int, emit_tree_and_return_var): Added.
	* gcc/real.h (real_format): Added is_binary_ieee_compatible field.
	* gcc/real.c (ieee_single_format): Set is_binary_ieee_compatible flag.
	(mips_single_format): Likewise.
	(motorola_single_format): Likewise.
	(spu_single_format): Likewise.
	(ieee_double_format): Likewise.
	(mips_double_format): Likewise.
	(motorola_double_format): Likewise.
	(ieee_extended_motorola_format): Likewise.
(ieee_extended_intel_128_format): Likewise. (ieee_extended_intel_96_round_53_format): Likewise. (ibm_extended_format): Likewise. (mips_extended_format): Likewise. (ieee_quad_format): Likewise. (mips_quad_format): Likewise. (vax_f_format): Likewise. (vax_d_format): Likewise. (vax_g_format): Likewise. (decimal_single_format): Likewise. (decimal_quad_format): Likewise. (iee_half_format): Likewise. (mips_single_format): Likewise. (arm_half_format): Likewise. (real_internal_format): Likewise. gcc/testsuite/ 2016-11-11 Tamar Christina * gcc.target/aarch64/builtin-fpclassify.c: New codegen test. * gcc.dg/fold-notunord.c: Removed. diff --git a/gcc/builtins.c b/gcc/builtins.c index 3ac2d44148440b124559ba7cd3de483b7a74b72d..fb09d342c836d68ef40a90fca803dbd496407ecb 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -160,7 +160,6 @@ static tree fold_builtin_0 (location_t, tree); static tree fold_builtin_1 (location_t, tree, tree); static tree fold_builtin_2 (location_t, tree, tree, tree); static tree fold_builtin_3 (location_t, tree, tree, tree, tree); -static tree fold_builtin_varargs (location_t, tree, tree*, int); static tree fold_builtin_strpbrk (location_t, tree, tree, tree); static tree fold_builtin_strstr (location_t, tree, tree, tree); @@ -5998,10 +5997,8 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, if (! flag_unsafe_math_optimizations) break; gcc_fallthrough (); - CASE_FLT_FN (BUILT_IN_ISINF): CASE_FLT_FN (BUILT_IN_FINITE): case BUILT_IN_ISFINITE: - case BUILT_IN_ISNORMAL: target = expand_builtin_interclass_mathfn (exp, target); if (target) return target; @@ -6281,8 +6278,20 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, } break; + CASE_FLT_FN (BUILT_IN_ISINF): + case BUILT_IN_ISNAND32: + case BUILT_IN_ISNAND64: + case BUILT_IN_ISNAND128: + case BUILT_IN_ISNAN: + case BUILT_IN_ISINFD32: + case BUILT_IN_ISINFD64: + case BUILT_IN_ISINFD128: + case BUILT_IN_ISNORMAL: + case BUILT_IN_ISZERO: + case BUILT_IN_ISSUBNORMAL: + case BUILT_IN_FPCLASSIFY: case BUILT_IN_SETJMP: - /* This should have been lowered to the builtins below. */ + /* These should have been lowered to the builtins below. */ gcc_unreachable (); case BUILT_IN_SETJMP_SETUP: @@ -7646,30 +7655,6 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg) switch (DECL_FUNCTION_CODE (fndecl)) { tree result; - - CASE_FLT_FN (BUILT_IN_ISINF): - { - /* isinf(x) -> isgreater(fabs(x),DBL_MAX). */ - tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER); - tree type = TREE_TYPE (arg); - REAL_VALUE_TYPE r; - char buf[128]; - - if (is_ibm_extended) - { - /* NaN and Inf are encoded in the high-order double value - only. The low-order value is not significant. */ - type = double_type_node; - mode = DFmode; - arg = fold_build1_loc (loc, NOP_EXPR, type, arg); - } - get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf)); - real_from_string (&r, buf); - result = build_call_expr (isgr_fn, 2, - fold_build1_loc (loc, ABS_EXPR, type, arg), - build_real (type, r)); - return result; - } CASE_FLT_FN (BUILT_IN_FINITE): case BUILT_IN_ISFINITE: { @@ -7701,79 +7686,6 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg) result);*/ return result; } - case BUILT_IN_ISNORMAL: - { - /* isnormal(x) -> isgreaterequal(fabs(x),DBL_MIN) & - islessequal(fabs(x),DBL_MAX). 
*/ - tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL); - tree type = TREE_TYPE (arg); - tree orig_arg, max_exp, min_exp; - machine_mode orig_mode = mode; - REAL_VALUE_TYPE rmax, rmin; - char buf[128]; - - orig_arg = arg = builtin_save_expr (arg); - if (is_ibm_extended) - { - /* Use double to test the normal range of IBM extended - precision. Emin for IBM extended precision is - different to emin for IEEE double, being 53 higher - since the low double exponent is at least 53 lower - than the high double exponent. */ - type = double_type_node; - mode = DFmode; - arg = fold_build1_loc (loc, NOP_EXPR, type, arg); - } - arg = fold_build1_loc (loc, ABS_EXPR, type, arg); - - get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf)); - real_from_string (&rmax, buf); - sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (orig_mode)->emin - 1); - real_from_string (&rmin, buf); - max_exp = build_real (type, rmax); - min_exp = build_real (type, rmin); - - max_exp = build_call_expr (isle_fn, 2, arg, max_exp); - if (is_ibm_extended) - { - /* Testing the high end of the range is done just using - the high double, using the same test as isfinite(). - For the subnormal end of the range we first test the - high double, then if its magnitude is equal to the - limit of 0x1p-969, we test whether the low double is - non-zero and opposite sign to the high double. */ - tree const islt_fn = builtin_decl_explicit (BUILT_IN_ISLESS); - tree const isgt_fn = builtin_decl_explicit (BUILT_IN_ISGREATER); - tree gt_min = build_call_expr (isgt_fn, 2, arg, min_exp); - tree eq_min = fold_build2 (EQ_EXPR, integer_type_node, - arg, min_exp); - tree as_complex = build1 (VIEW_CONVERT_EXPR, - complex_double_type_node, orig_arg); - tree hi_dbl = build1 (REALPART_EXPR, type, as_complex); - tree lo_dbl = build1 (IMAGPART_EXPR, type, as_complex); - tree zero = build_real (type, dconst0); - tree hilt = build_call_expr (islt_fn, 2, hi_dbl, zero); - tree lolt = build_call_expr (islt_fn, 2, lo_dbl, zero); - tree logt = build_call_expr (isgt_fn, 2, lo_dbl, zero); - tree ok_lo = fold_build1 (TRUTH_NOT_EXPR, integer_type_node, - fold_build3 (COND_EXPR, - integer_type_node, - hilt, logt, lolt)); - eq_min = fold_build2 (TRUTH_ANDIF_EXPR, integer_type_node, - eq_min, ok_lo); - min_exp = fold_build2 (TRUTH_ORIF_EXPR, integer_type_node, - gt_min, eq_min); - } - else - { - tree const isge_fn - = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL); - min_exp = build_call_expr (isge_fn, 2, arg, min_exp); - } - result = fold_build2 (BIT_AND_EXPR, integer_type_node, - max_exp, min_exp); - return result; - } default: break; } @@ -7794,12 +7706,6 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index) switch (builtin_index) { - case BUILT_IN_ISINF: - if (!HONOR_INFINITIES (arg)) - return omit_one_operand_loc (loc, type, integer_zero_node, arg); - - return NULL_TREE; - case BUILT_IN_ISINF_SIGN: { /* isinf_sign(x) -> isinf(x) ? (signbit(x) ? -1 : 1) : 0 */ @@ -7838,100 +7744,11 @@ fold_builtin_classify (location_t loc, tree fndecl, tree arg, int builtin_index) return omit_one_operand_loc (loc, type, integer_one_node, arg); return NULL_TREE; - - case BUILT_IN_ISNAN: - if (!HONOR_NANS (arg)) - return omit_one_operand_loc (loc, type, integer_zero_node, arg); - - { - bool is_ibm_extended = MODE_COMPOSITE_P (TYPE_MODE (TREE_TYPE (arg))); - if (is_ibm_extended) - { - /* NaN and Inf are encoded in the high-order double value - only. The low-order value is not significant. 
*/ - arg = fold_build1_loc (loc, NOP_EXPR, double_type_node, arg); - } - } - arg = builtin_save_expr (arg); - return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg); - default: gcc_unreachable (); } } -/* Fold a call to __builtin_fpclassify(int, int, int, int, int, ...). - This builtin will generate code to return the appropriate floating - point classification depending on the value of the floating point - number passed in. The possible return values must be supplied as - int arguments to the call in the following order: FP_NAN, FP_INFINITE, - FP_NORMAL, FP_SUBNORMAL and FP_ZERO. The ellipses is for exactly - one floating point argument which is "type generic". */ - -static tree -fold_builtin_fpclassify (location_t loc, tree *args, int nargs) -{ - tree fp_nan, fp_infinite, fp_normal, fp_subnormal, fp_zero, - arg, type, res, tmp; - machine_mode mode; - REAL_VALUE_TYPE r; - char buf[128]; - - /* Verify the required arguments in the original call. */ - if (nargs != 6 - || !validate_arg (args[0], INTEGER_TYPE) - || !validate_arg (args[1], INTEGER_TYPE) - || !validate_arg (args[2], INTEGER_TYPE) - || !validate_arg (args[3], INTEGER_TYPE) - || !validate_arg (args[4], INTEGER_TYPE) - || !validate_arg (args[5], REAL_TYPE)) - return NULL_TREE; - - fp_nan = args[0]; - fp_infinite = args[1]; - fp_normal = args[2]; - fp_subnormal = args[3]; - fp_zero = args[4]; - arg = args[5]; - type = TREE_TYPE (arg); - mode = TYPE_MODE (type); - arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg)); - - /* fpclassify(x) -> - isnan(x) ? FP_NAN : - (fabs(x) == Inf ? FP_INFINITE : - (fabs(x) >= DBL_MIN ? FP_NORMAL : - (x == 0 ? FP_ZERO : FP_SUBNORMAL))). */ - - tmp = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, - build_real (type, dconst0)); - res = fold_build3_loc (loc, COND_EXPR, integer_type_node, - tmp, fp_zero, fp_subnormal); - - sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1); - real_from_string (&r, buf); - tmp = fold_build2_loc (loc, GE_EXPR, integer_type_node, - arg, build_real (type, r)); - res = fold_build3_loc (loc, COND_EXPR, integer_type_node, tmp, fp_normal, res); - - if (HONOR_INFINITIES (mode)) - { - real_inf (&r); - tmp = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, - build_real (type, r)); - res = fold_build3_loc (loc, COND_EXPR, integer_type_node, tmp, - fp_infinite, res); - } - - if (HONOR_NANS (mode)) - { - tmp = fold_build2_loc (loc, ORDERED_EXPR, integer_type_node, arg, arg); - res = fold_build3_loc (loc, COND_EXPR, integer_type_node, tmp, res, fp_nan); - } - - return res; -} - /* Fold a call to an unordered comparison function such as __builtin_isgreater(). FNDECL is the FUNCTION_DECL for the function being called and ARG0 and ARG1 are the arguments for the call. 
@@ -8243,30 +8060,9 @@ fold_builtin_1 (location_t loc, tree fndecl, tree arg0) return ret; return fold_builtin_interclass_mathfn (loc, fndecl, arg0); } - - CASE_FLT_FN (BUILT_IN_ISINF): - case BUILT_IN_ISINFD32: - case BUILT_IN_ISINFD64: - case BUILT_IN_ISINFD128: - { - tree ret = fold_builtin_classify (loc, fndecl, arg0, BUILT_IN_ISINF); - if (ret) - return ret; - return fold_builtin_interclass_mathfn (loc, fndecl, arg0); - } - - case BUILT_IN_ISNORMAL: - return fold_builtin_interclass_mathfn (loc, fndecl, arg0); - case BUILT_IN_ISINF_SIGN: return fold_builtin_classify (loc, fndecl, arg0, BUILT_IN_ISINF_SIGN); - CASE_FLT_FN (BUILT_IN_ISNAN): - case BUILT_IN_ISNAND32: - case BUILT_IN_ISNAND64: - case BUILT_IN_ISNAND128: - return fold_builtin_classify (loc, fndecl, arg0, BUILT_IN_ISNAN); - case BUILT_IN_FREE: if (integer_zerop (arg0)) return build_empty_stmt (loc); @@ -8465,7 +8261,11 @@ fold_builtin_n (location_t loc, tree fndecl, tree *args, int nargs, bool) ret = fold_builtin_3 (loc, fndecl, args[0], args[1], args[2]); break; default: - ret = fold_builtin_varargs (loc, fndecl, args, nargs); + /* There used to be a call to fold_builtin_varargs here, but with the + lowering of fpclassify which was it's only member the function became + redundant. As such it has been removed. The function's default case + was the same as what is below the switch here, so the function can + safely be removed. */ break; } if (ret) @@ -9422,37 +9222,6 @@ fold_builtin_object_size (tree ptr, tree ost) return NULL_TREE; } -/* Builtins with folding operations that operate on "..." arguments - need special handling; we need to store the arguments in a convenient - data structure before attempting any folding. Fortunately there are - only a few builtins that fall into this category. FNDECL is the - function, EXP is the CALL_EXPR for the call. */ - -static tree -fold_builtin_varargs (location_t loc, tree fndecl, tree *args, int nargs) -{ - enum built_in_function fcode = DECL_FUNCTION_CODE (fndecl); - tree ret = NULL_TREE; - - switch (fcode) - { - case BUILT_IN_FPCLASSIFY: - ret = fold_builtin_fpclassify (loc, args, nargs); - break; - - default: - break; - } - if (ret) - { - ret = build1 (NOP_EXPR, TREE_TYPE (ret), ret); - SET_EXPR_LOCATION (ret, loc); - TREE_NO_WARNING (ret) = 1; - return ret; - } - return NULL_TREE; -} - /* Initialize format string characters in the target charset. 
*/ bool diff --git a/gcc/builtins.def b/gcc/builtins.def index 219feebd3aebefbd079bf37cc801453cd1965e00..e3d12eccfed528fd6df0570b65f8aef42494d675 100644 --- a/gcc/builtins.def +++ b/gcc/builtins.def @@ -831,6 +831,8 @@ DEF_EXT_LIB_BUILTIN (BUILT_IN_ISINFL, "isinfl", BT_FN_INT_LONGDOUBLE, ATTR_CO DEF_EXT_LIB_BUILTIN (BUILT_IN_ISINFD32, "isinfd32", BT_FN_INT_DFLOAT32, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_EXT_LIB_BUILTIN (BUILT_IN_ISINFD64, "isinfd64", BT_FN_INT_DFLOAT64, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_EXT_LIB_BUILTIN (BUILT_IN_ISINFD128, "isinfd128", BT_FN_INT_DFLOAT128, ATTR_CONST_NOTHROW_LEAF_LIST) +DEF_C99_C90RES_BUILTIN (BUILT_IN_ISZERO, "iszero", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF) +DEF_C99_C90RES_BUILTIN (BUILT_IN_ISSUBNORMAL, "issubnormal", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF) DEF_C99_C90RES_BUILTIN (BUILT_IN_ISNAN, "isnan", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC_LEAF) DEF_EXT_LIB_BUILTIN (BUILT_IN_ISNANF, "isnanf", BT_FN_INT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST) DEF_EXT_LIB_BUILTIN (BUILT_IN_ISNANL, "isnanl", BT_FN_INT_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST) diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c index 64752b67b86b3d01df5f5661e4666df98b7b91d1..ac9dd6cb319eba25402a68fd4ffecfdc8f0d2118 100644 --- a/gcc/gimple-low.c +++ b/gcc/gimple-low.c @@ -30,6 +30,8 @@ along with GCC; see the file COPYING3. If not see #include "calls.h" #include "gimple-iterator.h" #include "gimple-low.h" +#include "stor-layout.h" +#include "target.h" /* The differences between High GIMPLE and Low GIMPLE are the following: @@ -72,6 +74,12 @@ static void lower_gimple_bind (gimple_stmt_iterator *, struct lower_data *); static void lower_try_catch (gimple_stmt_iterator *, struct lower_data *); static void lower_gimple_return (gimple_stmt_iterator *, struct lower_data *); static void lower_builtin_setjmp (gimple_stmt_iterator *); +static void lower_builtin_fpclassify (gimple_stmt_iterator *); +static void lower_builtin_isnan (gimple_stmt_iterator *); +static void lower_builtin_isinfinite (gimple_stmt_iterator *); +static void lower_builtin_isnormal (gimple_stmt_iterator *); +static void lower_builtin_iszero (gimple_stmt_iterator *); +static void lower_builtin_issubnormal (gimple_stmt_iterator *); static void lower_builtin_posix_memalign (gimple_stmt_iterator *); @@ -330,19 +338,61 @@ lower_stmt (gimple_stmt_iterator *gsi, struct lower_data *data) if (decl && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL) { - if (DECL_FUNCTION_CODE (decl) == BUILT_IN_SETJMP) - { - lower_builtin_setjmp (gsi); - data->cannot_fallthru = false; - return; - } - else if (DECL_FUNCTION_CODE (decl) == BUILT_IN_POSIX_MEMALIGN - && flag_tree_bit_ccp - && gimple_builtin_call_types_compatible_p (stmt, decl)) - { - lower_builtin_posix_memalign (gsi); - return; - } + switch (DECL_FUNCTION_CODE (decl)) + { + case BUILT_IN_SETJMP: + lower_builtin_setjmp (gsi); + data->cannot_fallthru = false; + return; + + case BUILT_IN_POSIX_MEMALIGN: + if (flag_tree_bit_ccp + && gimple_builtin_call_types_compatible_p (stmt, decl)) + { + lower_builtin_posix_memalign (gsi); + return; + } + break; + + case BUILT_IN_FPCLASSIFY: + lower_builtin_fpclassify (gsi); + data->cannot_fallthru = false; + return; + + CASE_FLT_FN (BUILT_IN_ISINF): + case BUILT_IN_ISINFD32: + case BUILT_IN_ISINFD64: + case BUILT_IN_ISINFD128: + lower_builtin_isinfinite (gsi); + data->cannot_fallthru = false; + return; + + case BUILT_IN_ISNAND32: + case BUILT_IN_ISNAND64: + case BUILT_IN_ISNAND128: + CASE_FLT_FN (BUILT_IN_ISNAN): + lower_builtin_isnan 
(gsi); + data->cannot_fallthru = false; + return; + + case BUILT_IN_ISNORMAL: + lower_builtin_isnormal (gsi); + data->cannot_fallthru = false; + return; + + case BUILT_IN_ISZERO: + lower_builtin_iszero (gsi); + data->cannot_fallthru = false; + return; + + case BUILT_IN_ISSUBNORMAL: + lower_builtin_issubnormal (gsi); + data->cannot_fallthru = false; + return; + + default: + break; + } } if (decl && (flags_from_decl_or_type (decl) & ECF_NORETURN)) @@ -822,6 +872,580 @@ lower_builtin_setjmp (gimple_stmt_iterator *gsi) gsi_remove (gsi, false); } +static tree +emit_tree_and_return_var (gimple_seq *seq, tree arg) +{ + tree tmp = create_tmp_reg (TREE_TYPE(arg)); + gassign *stm = gimple_build_assign(tmp, arg); + gimple_seq_add_stmt (seq, stm); + return tmp; +} + +/* This function builds an if statement that ends up using explicit branches + instead of becoming a csel. This function assumes you will fall through to + the next statements after this condition for the false branch. */ +static void +emit_tree_cond (gimple_seq *seq, tree result_variable, tree exit_label, + tree cond, tree true_branch) +{ + /* Create labels for fall through */ + tree true_label = create_artificial_label (UNKNOWN_LOCATION); + tree false_label = create_artificial_label (UNKNOWN_LOCATION); + gcond *stmt = gimple_build_cond_from_tree (cond, true_label, false_label); + gimple_seq_add_stmt (seq, stmt); + + /* Build the true case. */ + gimple_seq_add_stmt (seq, gimple_build_label (true_label)); + tree value = TREE_CONSTANT (true_branch) + ? true_branch + : emit_tree_and_return_var (seq, true_branch); + gimple_seq_add_stmt (seq, gimple_build_assign (result_variable, value)); + gimple_seq_add_stmt (seq, gimple_build_goto (exit_label)); + + /* Build the false case. */ + gimple_seq_add_stmt (seq, gimple_build_label (false_label)); +} + +static tree +get_num_as_int (gimple_seq *seq, tree arg, location_t loc) +{ + tree type = TREE_TYPE (arg); + + machine_mode mode = TYPE_MODE (type); + const real_format *format = REAL_MODE_FORMAT (mode); + const HOST_WIDE_INT type_width = TYPE_PRECISION (type); + + gcc_assert (format->b == 2); + + /* Re-interpret the float as an unsigned integer type + with equal precision. */ + tree int_arg_type = build_nonstandard_integer_type (type_width, true); + tree conv_arg = fold_build1_loc (loc, VIEW_CONVERT_EXPR, int_arg_type, arg); + return emit_tree_and_return_var(seq, conv_arg); +} + + /* Check if the number that is being classified is close enough to IEEE 754 + format to be able to go in the early exit code. */ +static bool +use_ieee_int_mode (tree arg) +{ + tree type = TREE_TYPE (arg); + + machine_mode mode = TYPE_MODE (type); + + const real_format *format = REAL_MODE_FORMAT (mode); + const HOST_WIDE_INT type_width = TYPE_PRECISION (type); + return (format->is_binary_ieee_compatible + && FLOAT_WORDS_BIG_ENDIAN == WORDS_BIG_ENDIAN + /* We explicitly disable quad float support on 32 bit systems. */ + && !(UNITS_PER_WORD == 4 && type_width == 128) + && targetm.scalar_mode_supported_p (mode)); +} + +static tree +is_normal (gimple_seq *seq, tree arg, location_t loc) +{ + tree type = TREE_TYPE (arg); + + machine_mode mode = TYPE_MODE (type); + const real_format *format = REAL_MODE_FORMAT (mode); + const tree bool_type = boolean_type_node; + + /* If not using optimized route then exit early. 
*/ + if (!use_ieee_int_mode (arg)) + { + REAL_VALUE_TYPE rinf, rmin; + tree arg_p + = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type, + arg)); + char buf[128]; + real_inf(&rinf); + get_min_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf)); + real_from_string (&rmin, buf); + + tree inf_exp = fold_build2_loc (loc, LT_EXPR, bool_type, arg_p, + build_real (type, rinf)); + + tree min_exp = fold_build2_loc (loc, GE_EXPR, bool_type, arg_p, + build_real (type, rmin)); + + tree res + = fold_build2_loc (loc, BIT_AND_EXPR, bool_type, + emit_tree_and_return_var (seq, min_exp), + emit_tree_and_return_var (seq, inf_exp)); + + return emit_tree_and_return_var (seq, res); + } + + gcc_assert (format->b == 2); + + const tree int_type = unsigned_type_node; + const int exp_bits = (GET_MODE_SIZE (mode) * BITS_PER_UNIT) - format->p; + const int exp_mask = (1 << exp_bits) - 1; + + /* Get the number reinterpreted as an integer. */ + tree int_arg = get_num_as_int (seq, arg, loc); + + /* Extract exp bits from the float, where we expect the exponent to be. + We create a new type because BIT_FIELD_REF does not allow you to + extract less bits than the precision of the storage variable. */ + tree exp_tmp + = fold_build3_loc (loc, BIT_FIELD_REF, + build_nonstandard_integer_type (exp_bits, true), + int_arg, + build_int_cstu (int_type, exp_bits), + build_int_cstu (int_type, format->p - 1)); + tree exp_bitfield = emit_tree_and_return_var (seq, exp_tmp); + + /* Re-interpret the extracted exponent bits as a 32 bit int. + This allows us to continue doing operations as int_type. */ + tree exp + = emit_tree_and_return_var(seq,fold_build1_loc (loc, NOP_EXPR, int_type, + exp_bitfield)); + + /* exp_mask & ~1. */ + tree mask_check + = fold_build2_loc (loc, BIT_AND_EXPR, int_type, + build_int_cstu (int_type, exp_mask), + fold_build1_loc (loc, BIT_NOT_EXPR, int_type, + build_int_cstu (int_type, 1))); + + /* (exp + 1) & mask_check. + Check to see if exp is not all 0 or all 1. */ + tree exp_check + = fold_build2_loc (loc, BIT_AND_EXPR, int_type, + emit_tree_and_return_var (seq, + fold_build2_loc (loc, PLUS_EXPR, int_type, exp, + build_int_cstu (int_type, 1))), + mask_check); + + tree res = fold_build2_loc (loc, NE_EXPR, boolean_type_node, + build_int_cstu (int_type, 0), + emit_tree_and_return_var(seq, exp_check)); + + return emit_tree_and_return_var (seq, res); +} + +static tree +is_zero (gimple_seq *seq, tree arg, location_t loc) +{ + tree type = TREE_TYPE (arg); + + /* If not using optimized route then exit early. */ + if (!use_ieee_int_mode (arg)) + { + tree arg_p + = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type, + arg)); + tree res = fold_build2_loc (loc, EQ_EXPR, boolean_type_node, arg_p, + build_real (type, dconst0)); + return emit_tree_and_return_var (seq, res); + } + + machine_mode mode = TYPE_MODE (type); + const real_format *format = REAL_MODE_FORMAT (mode); + const HOST_WIDE_INT type_width = TYPE_PRECISION (type); + + gcc_assert (format->b == 2); + + tree int_arg_type = build_nonstandard_integer_type (type_width, true); + + /* Get the number reinterpreted as an integer. + Shift left to remove the sign. */ + tree int_arg + = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, + get_num_as_int (seq, arg, loc), + build_int_cstu (int_arg_type, 1)); + + /* num << 1 == 0. + This checks to see if the number is zero. 
*/ + tree zero_check + = fold_build2_loc (loc, EQ_EXPR, boolean_type_node, + build_int_cstu (int_arg_type, 0), + emit_tree_and_return_var (seq, int_arg)); + + return emit_tree_and_return_var (seq, zero_check); +} + +static tree +is_subnormal (gimple_seq *seq, tree arg, location_t loc) +{ + const tree bool_type = boolean_type_node; + + tree type = TREE_TYPE (arg); + + machine_mode mode = TYPE_MODE (type); + const real_format *format = REAL_MODE_FORMAT (mode); + const HOST_WIDE_INT type_width = TYPE_PRECISION (type); + + tree int_arg_type = build_nonstandard_integer_type (type_width, true); + + gcc_assert (format->b == 2); + + /* If not using optimized route then exit early. */ + if (!use_ieee_int_mode (arg)) + { + tree arg_p + = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type, + arg)); + REAL_VALUE_TYPE r; + char buf[128]; + sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1); + real_from_string (&r, buf); + tree subnorm = fold_build2_loc (loc, LT_EXPR, bool_type, + arg_p, build_real (type, r)); + + tree zero = fold_build2_loc (loc, GT_EXPR, bool_type, arg_p, + build_real (type, dconst0)); + + tree res + = fold_build2_loc (loc, BIT_AND_EXPR, bool_type, + emit_tree_and_return_var (seq, subnorm), + emit_tree_and_return_var (seq, zero)); + + return emit_tree_and_return_var (seq, res); + } + + /* Get the number reinterpreted as an integer. + Shift left to remove the sign. */ + tree int_arg + = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, + get_num_as_int (seq, arg, loc), + build_int_cstu (int_arg_type, 1)); + + /* Check for a zero exponent and non-zero mantissa. + This can be done with two comparisons by first apply a + removing the sign bit and checking if the value is larger + than the mantissa mask. */ + + /* This creates a mask to be used to check the mantissa value in the shifted + integer representation of the fpnum. */ + tree significant_bit = build_int_cstu (int_arg_type, format->p - 1); + tree mantissa_mask + = fold_build2_loc (loc, MINUS_EXPR, int_arg_type, + fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, + build_int_cstu (int_arg_type, 2), + significant_bit), + build_int_cstu (int_arg_type, 1)); + + /* Check if exponent is zero and mantissa is not. */ + tree subnorm_check + = emit_tree_and_return_var(seq, + fold_build2_loc (loc, LE_EXPR, bool_type, + emit_tree_and_return_var(seq, int_arg), + mantissa_mask)); + + return emit_tree_and_return_var (seq, subnorm_check); +} + +static tree +is_infinity (gimple_seq *seq, tree arg, location_t loc) +{ + tree type = TREE_TYPE (arg); + + machine_mode mode = TYPE_MODE (type); + const tree bool_type = boolean_type_node; + + if (!HONOR_INFINITIES (mode)) + { + return build_int_cst (bool_type, 0); + } + + /* If not using optimized route then exit early. */ + if (!use_ieee_int_mode (arg)) + { + tree arg_p + = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type, + arg)); + REAL_VALUE_TYPE r; + real_inf (&r); + tree res = fold_build2_loc (loc, EQ_EXPR, bool_type, arg_p, + build_real (type, r)); + + return emit_tree_and_return_var (seq, res); + } + + const real_format *format = REAL_MODE_FORMAT (mode); + const HOST_WIDE_INT type_width = TYPE_PRECISION (type); + + gcc_assert (format->b == 2); + + tree int_arg_type = build_nonstandard_integer_type (type_width, true); + + /* This creates a mask to be used to check the exp value in the shifted + integer representation of the fpnum. 
*/ + const int exp_bits = (GET_MODE_SIZE (mode) * BITS_PER_UNIT) - format->p; + gcc_assert (format->p > 0); + + tree significant_bit = build_int_cstu (int_arg_type, format->p); + tree exp_mask + = fold_build2_loc (loc, MINUS_EXPR, int_arg_type, + fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, + build_int_cstu (int_arg_type, 2), + build_int_cstu (int_arg_type, exp_bits - 1)), + build_int_cstu (int_arg_type, 1)); + + /* Get the number reinterpreted as an integer. + Shift left to remove the sign. */ + tree int_arg + = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, + get_num_as_int (seq, arg, loc), + build_int_cstu (int_arg_type, 1)); + + /* This mask checks to see if the exp has all bits set and mantissa no + bits set. */ + tree inf_mask + = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, exp_mask, significant_bit); + + /* Check if exponent has all bits set and mantissa is 0. */ + tree inf_check + = emit_tree_and_return_var(seq, + fold_build2_loc (loc, EQ_EXPR, bool_type, + emit_tree_and_return_var(seq, int_arg), + inf_mask)); + + return emit_tree_and_return_var (seq, inf_check); +} + +/* Determines if the given number is a NaN value. + This function is the last in the chain and only has to + check if it's preconditions are true. */ +static tree +is_nan (gimple_seq *seq, tree arg, location_t loc) +{ + tree type = TREE_TYPE (arg); + + machine_mode mode = TYPE_MODE (type); + const real_format *format = REAL_MODE_FORMAT (mode); + const tree bool_type = boolean_type_node; + + if (!HONOR_NANS (mode)) + { + return build_int_cst (bool_type, 0); + } + + /* If not using optimized route then exit early. */ + if (!use_ieee_int_mode (arg)) + { + tree arg_p + = emit_tree_and_return_var (seq, fold_build1_loc (loc, ABS_EXPR, type, + arg)); + tree eq_check + = fold_build2_loc (loc, ORDERED_EXPR, bool_type,arg_p, arg_p); + + tree res + = fold_build1_loc (loc, BIT_NOT_EXPR, bool_type, + emit_tree_and_return_var (seq, eq_check)); + + return emit_tree_and_return_var (seq, res); + } + + const HOST_WIDE_INT type_width = TYPE_PRECISION (type); + tree int_arg_type = build_nonstandard_integer_type (type_width, true); + + /* This creates a mask to be used to check the exp value in the shifted + integer representation of the fpnum. */ + const int exp_bits = (GET_MODE_SIZE (mode) * BITS_PER_UNIT) - format->p; + tree significant_bit = build_int_cstu (int_arg_type, format->p); + tree exp_mask + = fold_build2_loc (loc, MINUS_EXPR, int_arg_type, + fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, + build_int_cstu (int_arg_type, 2), + build_int_cstu (int_arg_type, exp_bits - 1)), + build_int_cstu (int_arg_type, 1)); + + /* Get the number reinterpreted as an integer. + Shift left to remove the sign. */ + tree int_arg + = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, + get_num_as_int (seq, arg, loc), + build_int_cstu (int_arg_type, 1)); + + /* This mask checks to see if the exp has all bits set and mantissa no + bits set. */ + tree inf_mask + = fold_build2_loc (loc, LSHIFT_EXPR, int_arg_type, exp_mask, significant_bit); + + /* Check if exponent has all bits set and mantissa is not 0. */ + tree nan_check + = emit_tree_and_return_var(seq, + fold_build2_loc (loc, GT_EXPR, bool_type, + emit_tree_and_return_var(seq, int_arg), + inf_mask)); + + return emit_tree_and_return_var (seq, nan_check); +} + +/* Validate a single argument ARG against a tree code CODE representing + a type. 
*/ +static bool +gimple_validate_arg (gimple* call, int index, enum tree_code code) +{ + const tree arg = gimple_call_arg(call, index); + if (!arg) + return false; + else if (code == POINTER_TYPE) + return POINTER_TYPE_P (TREE_TYPE (arg)); + else if (code == INTEGER_TYPE) + return INTEGRAL_TYPE_P (TREE_TYPE (arg)); + return code == TREE_CODE (TREE_TYPE (arg)); +} + +/* Lowers calls to __builtin_fpclassify to + fpclassify (x) -> + isnormal(x) ? FP_NORMAL : + iszero (x) ? FP_ZERO : + isnan (x) ? FP_NAN : + isinfinite (x) ? FP_INFINITE : + FP_SUBNORMAL. + + The code may use integer arithmentic if it decides + that the produced assembly would be faster. This can only be done + for numbers that are similar to IEEE-754 in format. + + This builtin will generate code to return the appropriate floating + point classification depending on the value of the floating point + number passed in. The possible return values must be supplied as + int arguments to the call in the following order: FP_NAN, FP_INFINITE, + FP_NORMAL, FP_SUBNORMAL and FP_ZERO. The ellipses is for exactly + one floating point argument which is "type generic". +*/ +static void +lower_builtin_fpclassify (gimple_stmt_iterator *gsi) +{ + gimple *call = gsi_stmt (*gsi); + location_t loc = gimple_location (call); + + /* Verify the required arguments in the original call. */ + if (gimple_call_num_args (call) != 6 + || !gimple_validate_arg (call, 0, INTEGER_TYPE) + || !gimple_validate_arg (call, 1, INTEGER_TYPE) + || !gimple_validate_arg (call, 2, INTEGER_TYPE) + || !gimple_validate_arg (call, 3, INTEGER_TYPE) + || !gimple_validate_arg (call, 4, INTEGER_TYPE) + || !gimple_validate_arg (call, 5, REAL_TYPE)) + return; + + /* Collect the arguments from the call. */ + tree fp_nan = gimple_call_arg (call, 0); + tree fp_infinite = gimple_call_arg (call, 1); + tree fp_normal = gimple_call_arg (call, 2); + tree fp_subnormal = gimple_call_arg (call, 3); + tree fp_zero = gimple_call_arg (call, 4); + tree arg = gimple_call_arg (call, 5); + + gimple_seq body = NULL; + + /* Create label to jump to to exit. */ + tree done_label = create_artificial_label (UNKNOWN_LOCATION); + tree dest; + tree orig_dest = dest = gimple_call_lhs (call); + if (orig_dest && TREE_CODE (orig_dest) == SSA_NAME) + dest = create_tmp_reg (TREE_TYPE (orig_dest)); + + emit_tree_cond (&body, dest, done_label, + is_normal(&body, arg, loc), fp_normal); + emit_tree_cond (&body, dest, done_label, + is_zero(&body, arg, loc), fp_zero); + emit_tree_cond (&body, dest, done_label, + is_nan(&body, arg, loc), fp_nan); + emit_tree_cond (&body, dest, done_label, + is_infinity(&body, arg, loc), fp_infinite); + + /* And finally, emit the default case if nothing else matches. + This replaces the call to is_subnormal. */ + gimple_seq_add_stmt (&body, gimple_build_assign (dest, fp_subnormal)); + gimple_seq_add_stmt (&body, gimple_build_label (done_label)); + + /* Build orig_dest = dest if necessary. */ + if (dest != orig_dest) + { + gimple_seq_add_stmt (&body, gimple_build_assign (orig_dest, dest)); + } + + gsi_insert_seq_before (gsi, body, GSI_SAME_STMT); + + + /* Remove the call to __builtin_fpclassify. */ + gsi_remove (gsi, false); +} + +static void +gen_call_fp_builtin (gimple_stmt_iterator *gsi, + tree (*fndecl)(gimple_seq *, tree, location_t)) +{ + gimple *call = gsi_stmt (*gsi); + location_t loc = gimple_location (call); + + /* Verify the required arguments in the original call. 
*/ + if (gimple_call_num_args (call) != 1 + || !gimple_validate_arg (call, 0, REAL_TYPE)) + return; + + tree arg = gimple_call_arg (call, 0); + gimple_seq body = NULL; + + /* Create label to jump to to exit. */ + tree done_label = create_artificial_label (UNKNOWN_LOCATION); + tree dest; + tree orig_dest = dest = gimple_call_lhs (call); + tree type = TREE_TYPE (orig_dest); + if (orig_dest && TREE_CODE (orig_dest) == SSA_NAME) + dest = create_tmp_reg (type); + + tree t_true = build_int_cst (type, true); + tree t_false = build_int_cst (type, false); + + emit_tree_cond (&body, dest, done_label, + fndecl(&body, arg, loc), t_true); + + /* And finally, emit the default case if nothing else matches. + This replaces the call to false. */ + gimple_seq_add_stmt (&body, gimple_build_assign (dest, t_false)); + gimple_seq_add_stmt (&body, gimple_build_label (done_label)); + + /* Build orig_dest = dest if necessary. */ + if (dest != orig_dest) + { + gimple_seq_add_stmt (&body, gimple_build_assign (orig_dest, dest)); + } + + gsi_insert_seq_before (gsi, body, GSI_SAME_STMT); + + /* Remove the call to the builtin. */ + gsi_remove (gsi, false); +} + +static void +lower_builtin_isnan (gimple_stmt_iterator *gsi) +{ + gen_call_fp_builtin (gsi, &is_nan); +} + +static void +lower_builtin_isinfinite (gimple_stmt_iterator *gsi) +{ + gen_call_fp_builtin (gsi, &is_infinity); +} + +static void +lower_builtin_isnormal (gimple_stmt_iterator *gsi) +{ + gen_call_fp_builtin (gsi, &is_normal); +} + +static void +lower_builtin_iszero (gimple_stmt_iterator *gsi) +{ + gen_call_fp_builtin (gsi, &is_zero); +} + +static void +lower_builtin_issubnormal (gimple_stmt_iterator *gsi) +{ + gen_call_fp_builtin (gsi, &is_subnormal); +} + /* Lower calls to posix_memalign to res = posix_memalign (ptr, align, size); if (res == 0) diff --git a/gcc/real.h b/gcc/real.h index 59af580e78f2637be84f71b98b45ec6611053222..30604adf0f7d4ca4257ed92f6d019b52a52db6c5 100644 --- a/gcc/real.h +++ b/gcc/real.h @@ -161,6 +161,19 @@ struct real_format bool has_signed_zero; bool qnan_msb_set; bool canonical_nan_lsbs_set; + + /* This flag indicates whether the format is suitable for the optimized + code paths for the __builtin_fpclassify function and friends. For + this, the format must be a base 2 representation with the sign bit as + the most-significant bit followed by (exp <= 32) exponent bits + followed by the mantissa bits. It must be possible to interpret the + bits of the floating-point representation as an integer. NaNs and + INFs (if available) must be represented by the same schema used by + IEEE 754. (NaNs must be represented by an exponent with all bits 1, + any mantissa except all bits 0 and any sign bit. +INF and -INF must be + represented by an exponent with all bits 1, a mantissa with all bits 0 and + a sign bit of 0 and 1 respectively.) */ + bool is_binary_ieee_compatible; const char *name; }; @@ -511,6 +524,11 @@ extern bool real_isinteger (const REAL_VALUE_TYPE *, HOST_WIDE_INT *); float string. BUF must be large enough to contain the result. */ extern void get_max_float (const struct real_format *, char *, size_t); +/* Write into BUF the minimum negative representable finite floating-point + number, x, such that b**(x-1) is normalized. + BUF must be large enough to contain the result. */ +extern void get_min_float (const struct real_format *, char *, size_t); + #ifndef GENERATOR_FILE /* real related routines. 
*/ extern wide_int real_to_integer (const REAL_VALUE_TYPE *, bool *, int); diff --git a/gcc/real.c b/gcc/real.c index 66e88e2ad366f7848609d157074c80420d778bcf..20c907a6d543c73ba62aa9a8ddf6973d82de7832 100644 --- a/gcc/real.c +++ b/gcc/real.c @@ -3052,6 +3052,7 @@ const struct real_format ieee_single_format = true, true, false, + true, "ieee_single" }; @@ -3075,6 +3076,7 @@ const struct real_format mips_single_format = true, false, true, + true, "mips_single" }; @@ -3098,6 +3100,7 @@ const struct real_format motorola_single_format = true, true, true, + true, "motorola_single" }; @@ -3132,6 +3135,7 @@ const struct real_format spu_single_format = true, false, false, + false, "spu_single" }; @@ -3343,6 +3347,7 @@ const struct real_format ieee_double_format = true, true, false, + true, "ieee_double" }; @@ -3366,6 +3371,7 @@ const struct real_format mips_double_format = true, false, true, + true, "mips_double" }; @@ -3389,6 +3395,7 @@ const struct real_format motorola_double_format = true, true, true, + true, "motorola_double" }; @@ -3735,6 +3742,7 @@ const struct real_format ieee_extended_motorola_format = true, true, true, + false, "ieee_extended_motorola" }; @@ -3758,6 +3766,7 @@ const struct real_format ieee_extended_intel_96_format = true, true, false, + false, "ieee_extended_intel_96" }; @@ -3781,6 +3790,7 @@ const struct real_format ieee_extended_intel_128_format = true, true, false, + false, "ieee_extended_intel_128" }; @@ -3806,6 +3816,7 @@ const struct real_format ieee_extended_intel_96_round_53_format = true, true, false, + false, "ieee_extended_intel_96_round_53" }; @@ -3896,6 +3907,7 @@ const struct real_format ibm_extended_format = true, true, false, + false, "ibm_extended" }; @@ -3919,6 +3931,7 @@ const struct real_format mips_extended_format = true, false, true, + false, "mips_extended" }; @@ -4184,6 +4197,7 @@ const struct real_format ieee_quad_format = true, true, false, + true, "ieee_quad" }; @@ -4207,6 +4221,7 @@ const struct real_format mips_quad_format = true, false, true, + true, "mips_quad" }; @@ -4509,6 +4524,7 @@ const struct real_format vax_f_format = false, false, false, + false, "vax_f" }; @@ -4532,6 +4548,7 @@ const struct real_format vax_d_format = false, false, false, + false, "vax_d" }; @@ -4555,6 +4572,7 @@ const struct real_format vax_g_format = false, false, false, + false, "vax_g" }; @@ -4633,6 +4651,7 @@ const struct real_format decimal_single_format = true, true, false, + false, "decimal_single" }; @@ -4657,6 +4676,7 @@ const struct real_format decimal_double_format = true, true, false, + false, "decimal_double" }; @@ -4681,6 +4701,7 @@ const struct real_format decimal_quad_format = true, true, false, + false, "decimal_quad" }; @@ -4820,6 +4841,7 @@ const struct real_format ieee_half_format = true, true, false, + true, "ieee_half" }; @@ -4846,6 +4868,7 @@ const struct real_format arm_half_format = true, false, false, + false, "arm_half" }; @@ -4893,6 +4916,7 @@ const struct real_format real_internal_format = true, true, false, + false, "real_internal" }; @@ -5080,6 +5104,16 @@ get_max_float (const struct real_format *fmt, char *buf, size_t len) gcc_assert (strlen (buf) < len); } +/* Write into BUF the minimum negative representable finite floating-point + number, x, such that b**(x-1) is normalized. + BUF must be large enough to contain the result. 
*/ +void +get_min_float (const struct real_format *fmt, char *buf, size_t len) +{ + sprintf (buf, "0x1p%d", fmt->emin - 1); + gcc_assert (strlen (buf) < len); +} + /* True if mode M has a NaN representation and the treatment of NaN operands is important. */ diff --git a/gcc/testsuite/gcc.dg/c99-builtins.c b/gcc/testsuite/gcc.dg/c99-builtins.c new file mode 100644 index 0000000000000000000000000000000000000000..3ca3ed43e7a69a266467ae2a9aa738ce2d15afb9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/c99-builtins.c @@ -0,0 +1,131 @@ +/* { dg-options "-O2" } */ +/* { dg-do run } */ + +#include +#include +#include +#include + +int +main(void) +{ + + /* Test FP Classify as a whole. */ + + assert(fpclassify((float)0) == FP_ZERO); + assert(fpclassify((float)-0.0) == FP_ZERO); + printf("PASS fpclassify(float)\n"); + + assert(fpclassify((double)0) == FP_ZERO); + assert(fpclassify((double)-0) == FP_ZERO); + printf("PASS fpclassify(double)\n"); + + assert(fpclassify((long double)0) == FP_ZERO); + assert(fpclassify((long double)-0.0) == FP_ZERO); + + printf("PASS fpclassify(long double)\n"); + + assert(fpclassify((float)1) == FP_NORMAL); + assert(fpclassify((float)1000) == FP_NORMAL); + printf("PASS fpclassify(float)\n"); + + assert(fpclassify((double)1) == FP_NORMAL); + assert(fpclassify((double)1000) == FP_NORMAL); + printf("PASS fpclassify(double)\n"); + + assert(fpclassify((long double)1) == FP_NORMAL); + assert(fpclassify((long double)1000) == FP_NORMAL); + printf("PASS fpclassify(long double)\n"); + + assert(fpclassify(0x1.2p-150f) == FP_SUBNORMAL); + printf("PASS fpclassify(float)\n"); + + assert(fpclassify(0x1.2p-1075) == FP_SUBNORMAL); + printf("PASS fpclassify(double)\n"); + + assert(fpclassify(0x1.2p-16383L) == FP_SUBNORMAL); + printf("PASS fpclassify(long double)\n"); + + assert(fpclassify(HUGE_VALF) == FP_INFINITE); + assert(fpclassify((float)HUGE_VAL) == FP_INFINITE); + assert(fpclassify((float)HUGE_VALL) == FP_INFINITE); + printf("PASS fpclassify(float)\n"); + + assert(fpclassify(HUGE_VAL) == FP_INFINITE); + assert(fpclassify((double)HUGE_VALF) == FP_INFINITE); + assert(fpclassify((double)HUGE_VALL) == FP_INFINITE); + printf("PASS fpclassify(double)\n"); + + assert(fpclassify(HUGE_VALL) == FP_INFINITE); + assert(fpclassify((long double)HUGE_VALF) == FP_INFINITE); + assert(fpclassify((long double)HUGE_VAL) == FP_INFINITE); + printf("PASS fpclassify(long double)\n"); + + assert(fpclassify(NAN) == FP_NAN); + printf("PASS fpclassify(float)\n"); + + assert(fpclassify((double)NAN) == FP_NAN); + printf("PASS fpclassify(double)\n"); + + assert(fpclassify((long double)NAN) == FP_NAN); + printf("PASS fpclassify(long double)\n"); + + /* Test if individual builtins work. 
*/ + + assert(__builtin_iszero((float)0)); + assert(__builtin_iszero((float)-0.0)); + printf("PASS __builtin_iszero(float)\n"); + + assert(__builtin_iszero((double)0)); + assert(__builtin_iszero((double)-0)); + printf("PASS __builtin_iszero(double)\n"); + + assert(__builtin_iszero((long double)0)); + assert(__builtin_iszero((long double)-0.0)); + printf("PASS __builtin_iszero(long double)\n"); + + assert(__builtin_isnormal((float)1)); + assert(__builtin_isnormal((float)1000)); + printf("PASS __builtin_isnormal(float)\n"); + + assert(__builtin_isnormal((double)1)); + assert(__builtin_isnormal((double)1000)); + printf("PASS __builtin_isnormal(double)\n"); + + assert(__builtin_isnormal((long double)1)); + assert(__builtin_isnormal((long double)1000)); + printf("PASS __builtin_isnormal(long double)\n"); + + assert(__builtin_issubnormal(0x1.2p-150f)); + printf("PASS __builtin_issubnormal(float)\n"); + + assert(__builtin_issubnormal(0x1.2p-1075)); + printf("PASS __builtin_issubnormal(double)\n"); + + assert(__builtin_issubnormal(0x1.2p-16383L)); + printf("PASS __builtin_issubnormal(long double)\n"); + + assert(__builtin_isinf(HUGE_VALF)); + assert(__builtin_isinf((float)HUGE_VAL)); + assert(__builtin_isinf((float)HUGE_VALL)); + printf("PASS __builtin_isinf(float)\n"); + + assert(__builtin_isinf(HUGE_VAL)); + assert(__builtin_isinf((double)HUGE_VALF)); + assert(__builtin_isinf((double)HUGE_VALL)); + printf("PASS __builtin_isinf(double)\n"); + + assert(__builtin_isinf(HUGE_VALL)); + assert(__builtin_isinf((long double)HUGE_VALF)); + assert(__builtin_isinf((long double)HUGE_VAL)); + printf("PASS __builtin_isinf(double)\n"); + + assert(__builtin_isnan(NAN)); + printf("PASS __builtin_isnan(float)\n"); + assert(__builtin_isnan((double)NAN)); + printf("PASS __builtin_isnan(double)\n"); + assert(__builtin_isnan((long double)NAN)); + printf("PASS __builtin_isnan(double)\n"); + + exit(0); +} diff --git a/gcc/testsuite/gcc.dg/fold-notunord.c b/gcc/testsuite/gcc.dg/fold-notunord.c deleted file mode 100644 index ca345154ac204cb5f380855828421b7f88d49052..0000000000000000000000000000000000000000 --- a/gcc/testsuite/gcc.dg/fold-notunord.c +++ /dev/null @@ -1,9 +0,0 @@ -/* { dg-do compile } */ -/* { dg-options "-O -ftrapping-math -fdump-tree-optimized" } */ - -int f (double d) -{ - return !__builtin_isnan (d); -} - -/* { dg-final { scan-tree-dump " ord " "optimized" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/builtin-fpclassify.c b/gcc/testsuite/gcc.target/aarch64/builtin-fpclassify.c new file mode 100644 index 0000000000000000000000000000000000000000..84a73a6483780dac2347e72fa7d139545d2087eb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/builtin-fpclassify.c @@ -0,0 +1,22 @@ +/* This file checks the code generation for the new __builtin_fpclassify. + because checking the exact assembly isn't very useful, we'll just be checking + for the presence of certain instructions and the omition of others. */ +/* { dg-options "-O2" } */ +/* { dg-do compile } */ +/* { dg-final { scan-assembler-not "\[ \t\]?fabs\[ \t\]?" } } */ +/* { dg-final { scan-assembler-not "\[ \t\]?fcmp\[ \t\]?" } } */ +/* { dg-final { scan-assembler-not "\[ \t\]?fcmpe\[ \t\]?" } } */ +/* { dg-final { scan-assembler "\[ \t\]?sbfx\[ \t\]?" } } */ + +#include +#include + +/* + fp_nan = args[0]; + fp_infinite = args[1]; + fp_normal = args[2]; + fp_subnormal = args[3]; + fp_zero = args[4]; +*/ + +int f(double x) { return __builtin_fpclassify(0, 1, 4, 3, 2, x); }
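
For reference, the argument order that __builtin_fpclassify expects matches the
comment in lower_builtin_fpclassify (FP_NAN, FP_INFINITE, FP_NORMAL,
FP_SUBNORMAL, FP_ZERO), which is why the codegen test above passes the literal
values 0, 1, 4, 3, 2. A minimal standalone usage sketch, where classify is just
an illustrative wrapper name:

#include <math.h>
#include <stdio.h>

static int
classify (double x)
{
  /* Same argument order as documented in lower_builtin_fpclassify:
     FP_NAN, FP_INFINITE, FP_NORMAL, FP_SUBNORMAL, FP_ZERO.  */
  return __builtin_fpclassify (FP_NAN, FP_INFINITE, FP_NORMAL,
			       FP_SUBNORMAL, FP_ZERO, x);
}

int
main (void)
{
  printf ("%d %d %d\n", classify (1.0) == FP_NORMAL,
	  classify (0.0) == FP_ZERO,
	  classify (1.0 / 0.0) == FP_INFINITE);
  /* Prints "1 1 1".  */
  return 0;
}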