From patchwork Fri Oct 21 23:03:33 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Botcazou
X-Patchwork-Id: 78755
From: Eric Botcazou
To: gcc-patches@gcc.gnu.org
Subject: [rs6000] Add support for signed overflow arithmetic
Date: Sat, 22 Oct 2016 01:03:33 +0200
Message-ID: <9218493.4xTCRBTuI6@polaris>

Hi,

this implements support for signed overflow arithmetic on PowerPC.  It's an
implementation for Power ISA v2.0x, i.e. it doesn't take into account the new
OV32 flag introduced in v3.0.  It doesn't implement unsigned overflow
arithmetic because my understanding is that the generic support already
generates optimal code in most cases on PowerPC for unsigned.

It introduces a new MODE_CC mode (CCVmode) which represents the OV flag of
the XER, and the overflow arithmetic instructions are paired with a mcrxr.
The comparisons are written in terms of UNSPECs because I used that for
Visium and SPARC, but I can rewrite them a la x86/ARM if requested.

There is also a tweak to expand_arith_overflow, because it would otherwise
"promote" signed multiplication to unsigned multiplication in some cases and
this badly pessimizes the generated code on PowerPC.

Tested on PowerPC/Linux and PowerPC64/Linux, OK for the mainline?


2016-10-21  Eric Botcazou

        * internal-fn.c (expand_arith_overflow): Do not promote a signed
        multiplication done in hardware to an unsigned open-coded one.
        * config/rs6000/rs6000-modes.def (CCV): New.
        * config/rs6000/rs6000-protos.h (rs6000_select_cc_mode): Declare.
        * config/rs6000/rs6000.h (SELECT_CC_MODE): Call it.
        * config/rs6000/rs6000.c (rs6000_debug_reg_global): Handle CCVmode.
        (validate_condition_mode): Likewise.
        (print_operand): Handle %C modifier.
        (rs6000_select_cc_mode): New function.
        (output_cbranch): Handle CCVmode.  Tidy up.
        * config/rs6000/rs6000.md (UNSPEC_{ADD,SUB,NEG,MUL}V): New constants.
        (addv<mode>4): New expander.
        (add<mode>3_overflow): New instruction.
        (add<mode>3_overflow_carry_in): New expander.
        (add<mode>3_overflow_carry_in_internal): New instruction.
        (add<mode>3_overflow_carry_in_0): Likewise.
        (add<mode>3_overflow_carry_in_m1): Likewise.
        (subv<mode>4): New expander.
        (subf<mode>3_overflow): New instruction.
        (subf<mode>3_overflow_carry_in): New expander.
        (sub<mode>3_overflow_carry_in_internal): New instruction.
        (subf<mode>3_overflow_carry_in_0): Likewise.
        (subf<mode>3_overflow_carry_in_m1): Likewise.
        (negv<mode>3): New expander.
        (neg<mode>2_overflow): New instruction.
        (mulv<mode>4): New expander.
        (mul<mode>3_overflow): New instruction.
testsuite/
        * gcc.target/powerpc/overflow-1.c: New test.
        * gcc.target/powerpc/overflow-2.c: Likewise.
        * gcc.target/powerpc/overflow-3.c: Likewise.
        * gcc.target/powerpc/overflow-4.c: Likewise.

-- 
Eric Botcazou
Index: internal-fn.c
===================================================================
--- internal-fn.c	(revision 241379)
+++ internal-fn.c	(working copy)
@@ -1772,10 +1772,23 @@ expand_arith_overflow (enum tree_code co
   int prec1 = TYPE_PRECISION (TREE_TYPE (arg1));
   int precres = TYPE_PRECISION (type);
   location_t loc = gimple_location (stmt);
-  if (!uns0_p && get_range_pos_neg (arg0) == 1)
-    uns0_p = true;
-  if (!uns1_p && get_range_pos_neg (arg1) == 1)
-    uns1_p = true;
+  /* Try to promote to unsigned since unsigned overflow is easier to open
+     code than signed overflow, but not for multiplication if that would
+     mean not using the hardware because this would very likely result in
+     doing 2 multiplications instead of only 1, e.g. on PowerPC.  */
+  if (code == MULT_EXPR
+      && !unsr_p
+      && precres <= BITS_PER_WORD
+      && optab_handler (mulv4_optab, TYPE_MODE (type)) != CODE_FOR_nothing
+      && optab_handler (umulv4_optab, TYPE_MODE (type)) == CODE_FOR_nothing)
+    ;
+  else
+    {
+      if (!uns0_p && get_range_pos_neg (arg0) == 1)
+        uns0_p = true;
+      if (!uns1_p && get_range_pos_neg (arg1) == 1)
+        uns1_p = true;
+    }
   int pr = get_min_precision (arg0, uns0_p ? UNSIGNED : SIGNED);
   prec0 = MIN (prec0, pr);
   pr = get_min_precision (arg1, uns1_p ? UNSIGNED : SIGNED);
Index: config/rs6000/rs6000-modes.def
===================================================================
--- config/rs6000/rs6000-modes.def	(revision 241313)
+++ config/rs6000/rs6000-modes.def	(working copy)
@@ -32,13 +32,15 @@ FLOAT_MODE (TF, 16, ieee_quad_format);
 /* Add any extra modes needed to represent the condition code.
 
    For the RS/6000, we need separate modes when unsigned (logical) comparisons
-   are being done and we need a separate mode for floating-point.  We also
-   use a mode for the case when we are comparing the results of two
-   comparisons, as then only the EQ bit is valid in the register.  */
+   are being done and we need a separate mode for floating-point.  We also use
+   a mode for the case when we are comparing the results of two comparisons,
+   as then only the EQ bit is valid in the register.  We also use a mode for
+   detecting signed overflow, as only the GT bit is valid in the register.  */
 
 CC_MODE (CCUNS);
 CC_MODE (CCFP);
 CC_MODE (CCEQ);
+CC_MODE (CCV);
 
 /* Vector modes.  */
 VECTOR_MODES (INT, 8);        /* V8QI V4HI V2SI */
Index: config/rs6000/rs6000-protos.h
===================================================================
--- config/rs6000/rs6000-protos.h	(revision 241313)
+++ config/rs6000/rs6000-protos.h	(working copy)
@@ -127,8 +127,8 @@ extern int ccr_bit (rtx, int);
 extern void rs6000_output_function_entry (FILE *, const char *);
 extern void print_operand (FILE *, rtx, int);
 extern void print_operand_address (FILE *, rtx);
-extern enum rtx_code rs6000_reverse_condition (machine_mode,
-                                               enum rtx_code);
+extern machine_mode rs6000_select_cc_mode (enum rtx_code, rtx, rtx);
+extern enum rtx_code rs6000_reverse_condition (machine_mode, enum rtx_code);
 extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx);
 extern void rs6000_emit_sISEL (machine_mode, rtx[]);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c	(revision 241313)
+++ config/rs6000/rs6000.c	(working copy)
@@ -2374,6 +2374,7 @@ rs6000_debug_reg_global (void)
     CCmode,
     CCUNSmode,
     CCEQmode,
+    CCVmode
   };
 
   /* Virtual regs we are interested in.  */
@@ -19334,6 +19335,7 @@ validate_condition_mode (enum rtx_code c
 
   /* These are invalid; the information is not there.  */
   gcc_assert (mode != CCEQmode || code == EQ || code == NE);
+  gcc_assert (mode != CCVmode || code == EQ || code == NE);
 }
 
 
@@ -21863,6 +21865,14 @@ print_operand (FILE *file, rtx x, int co
       /* %c is output_addr_const if a CONSTANT_ADDRESS_P, otherwise
          output_operand.  */
 
+    case 'C':
+      /* X is a CR register.  Print the index number of the CR.  */
+      if (GET_CODE (x) != REG || ! CR_REGNO_P (REGNO (x)))
+        output_operand_lossage ("invalid %%C value");
+      else
+        fputs (reg_names[REGNO (x)], file);
+      return;
+
     case 'D':
       /* Like 'J' but get to the GT bit only.  */
       gcc_assert (REG_P (x));
@@ -22638,6 +22648,34 @@ rs6000_assemble_visibility (tree decl, i
 }
 #endif
 
+machine_mode
+rs6000_select_cc_mode (enum rtx_code op, rtx x, rtx y)
+{
+  /* For floating-point, CCFPmode should be used.  CCUNSmode should be used
+     for unsigned comparisons.  CCEQmode should be used when we are doing an
+     inequality comparison on the result of a comparison.  CCVmode should be
+     used for the special overflow comparisons.  CCmode should be used in all
+     the other cases.  */
+
+  if (SCALAR_FLOAT_MODE_P (GET_MODE (x)))
+    return CCFPmode;
+
+  if ((op == GTU || op == LTU || op == GEU || op == LEU))
+    return CCUNSmode;
+
+  if ((op == EQ || op == NE) && COMPARISON_P (x))
+    return CCEQmode;
+
+  if (GET_CODE (y) == UNSPEC
+      && (XINT (y, 1) == UNSPEC_ADDV
+          || XINT (y, 1) == UNSPEC_SUBV
+          || XINT (y, 1) == UNSPEC_NEGV
+          || XINT (y, 1) == UNSPEC_MULV))
+    return CCVmode;
+
+  return CCmode;
+}
+
 enum rtx_code
 rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
 {
@@ -23558,7 +23596,6 @@ output_cbranch (rtx op, const char *labe
   enum rtx_code code = GET_CODE (op);
   rtx cc_reg = XEXP (op, 0);
   machine_mode mode = GET_MODE (cc_reg);
-  int cc_regno = REGNO (cc_reg) - CR0_REGNO;
   int need_longbranch = label != NULL && get_attr_length (insn) == 8;
   int really_reversed = reversed ^ need_longbranch;
   char *s = string;
@@ -23601,6 +23638,24 @@ output_cbranch (rtx op, const char *labe
         }
     }
 
+  if (mode == CCVmode)
+    {
+      switch (code)
+        {
+        case EQ:
+          /* Opposite of GT.  */
+          code = LE;
+          break;
+
+        case NE:
+          code = GT;
+          break;
+
+        default:
+          gcc_unreachable ();
+        }
+    }
+
   switch (code)
     {
       /* Not all of these are actually distinct opcodes, but
@@ -23659,9 +23714,9 @@ output_cbranch (rtx op, const char *labe
 
   /* We need to escape any '%' characters in the reg_names string.
      Assume they'd only be the first character....  */
-  if (reg_names[cc_regno + CR0_REGNO][0] == '%')
+  if (reg_names[REGNO (cc_reg)][0] == '%')
     *s++ = '%';
-  s += sprintf (s, "%s", reg_names[cc_regno + CR0_REGNO]);
+  s += sprintf (s, "%s", reg_names[REGNO (cc_reg)]);
 
   if (label != NULL)
     {
Index: config/rs6000/rs6000.h
===================================================================
--- config/rs6000/rs6000.h	(revision 241313)
+++ config/rs6000/rs6000.h	(working copy)
@@ -2225,17 +2225,8 @@ extern unsigned rs6000_pmode;
 /* #define ADJUST_INSN_LENGTH(X,LENGTH) */
 
 /* Given a comparison code (EQ, NE, etc.) and the first operand of a
-   COMPARE, return the mode to be used for the comparison.  For
-   floating-point, CCFPmode should be used.  CCUNSmode should be used
-   for unsigned comparisons.  CCEQmode should be used when we are
-   doing an inequality comparison on the result of a
-   comparison.  CCmode should be used in all other cases.  */
-
-#define SELECT_CC_MODE(OP,X,Y) \
-  (SCALAR_FLOAT_MODE_P (GET_MODE (X)) ? CCFPmode \
-   : (OP) == GTU || (OP) == LTU || (OP) == GEU || (OP) == LEU ? CCUNSmode \
-   : (((OP) == EQ || (OP) == NE) && COMPARISON_P (X) \
-      ? CCEQmode : CCmode))
+   COMPARE, return the mode to be used for the comparison.  */
+#define SELECT_CC_MODE(OP,X,Y) rs6000_select_cc_mode (OP, X, Y)
 
 /* Can the condition code MODE be safely reversed?  This is safe in
    all cases on this port, because at present it doesn't use the
Index: config/rs6000/rs6000.md
===================================================================
--- config/rs6000/rs6000.md	(revision 241313)
+++ config/rs6000/rs6000.md	(working copy)
@@ -149,6 +149,10 @@ (define_c_enum "unspec"
    UNSPEC_IEEE128_CONVERT
    UNSPEC_SIGNBIT
    UNSPEC_DOLOOP
+   UNSPEC_ADDV
+   UNSPEC_SUBV
+   UNSPEC_NEGV
+   UNSPEC_MULV
   ])
 
 ;;
@@ -1636,6 +1640,39 @@ (define_expand "add<mode>3"
     }
 })
 
+(define_expand "addv<mode>4"
+  [(match_operand:SDI 0 "register_operand")
+   (match_operand:SDI 1 "register_operand")
+   (match_operand:SDI 2 "register_operand")
+   (match_operand 3 "")]
+  "!(<MODE>mode == SImode && TARGET_POWERPC64)"
+{
+  rtx cc_reg = gen_reg_rtx (CCVmode);
+
+  if (<MODE>mode == DImode && !TARGET_POWERPC64)
+    {
+      rtx lo0 = gen_lowpart (SImode, operands[0]);
+      rtx lo1 = gen_lowpart (SImode, operands[1]);
+      rtx lo2 = gen_lowpart (SImode, operands[2]);
+      rtx hi0 = gen_highpart (SImode, operands[0]);
+      rtx hi1 = gen_highpart (SImode, operands[1]);
+      rtx hi2 = gen_highpart (SImode, operands[2]);
+
+      emit_insn (gen_addsi3_carry (lo0, lo1, lo2));
+      emit_insn (gen_addsi3_overflow_carry_in (hi0, hi1, hi2, cc_reg));
+    }
+  else
+    emit_insn (gen_add<mode>3_overflow (operands[0], operands[1], operands[2],
+                                        cc_reg));
+
+  rtx cond = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+                               gen_rtx_IF_THEN_ELSE (VOIDmode, cond,
+                                                     loc_ref, pc_rtx)));
+  DONE;
+})
+
 (define_insn "*add<mode>3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
         (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
@@ -1837,6 +1874,21 @@ (define_insn "*add<mode>3_imm_carry_neg"
   [(set_attr "type" "add")])
 
 
+;; The OV flag is set on overflow in Pmode only.
+(define_insn "add<mode>3_overflow"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+        (compare:CCV (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+                             (match_operand:P 2 "gpc_reg_operand" "r"))
+                     (unspec:P [(match_dup 1) (match_dup 2)]
+                               UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (plus:P (match_dup 1) (match_dup 2)))]
+  ""
+  "addo %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_expand "add<mode>3_carry_in"
   [(parallel [
      (set (match_operand:GPR 0 "gpc_reg_operand")
@@ -1888,6 +1940,83 @@ (define_insn "add<mode>3_carry_in_m1"
   [(set_attr "type" "add")])
 
 
+(define_expand "add<mode>3_overflow_carry_in"
+  [(parallel [
+     (set (match_operand:CCV 3 "cc_reg_operand")
+          (compare:CCV (plus:P (plus:P (match_operand:P 1 "gpc_reg_operand")
+                                       (match_operand:P 2 "adde_operand"))
+                               (reg:P CA_REGNO))
+                       (unspec:P [(plus:P (match_dup 1) (match_dup 2))
+                                  (reg:P CA_REGNO)] UNSPEC_ADDV)))
+     (set (match_operand:P 0 "gpc_reg_operand")
+          (plus:P (plus:P (match_dup 1) (match_dup 2))
+                  (reg:P CA_REGNO)))
+     (clobber (reg:P CA_REGNO))])]
+  ""
+{
+  if (operands[2] == const0_rtx)
+    {
+      emit_insn (gen_add<mode>3_overflow_carry_in_0 (operands[0],
+                                                     operands[1],
+                                                     operands[3]));
+      DONE;
+    }
+  if (operands[2] == constm1_rtx)
+    {
+      emit_insn (gen_add<mode>3_overflow_carry_in_m1 (operands[0],
+                                                      operands[1],
+                                                      operands[3]));
+      DONE;
+    }
+})
+
+(define_insn "*add<mode>3_overflow_carry_in_internal"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+        (compare:CCV (plus:P (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+                                     (match_operand:P 2 "gpc_reg_operand" "r"))
+                             (reg:P CA_REGNO))
+                     (unspec:P [(plus:P (match_dup 1) (match_dup 2))
+                                (reg:P CA_REGNO)] UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (plus:P (plus:P (match_dup 1) (match_dup 2))
+                (reg:P CA_REGNO)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "addeo %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+(define_insn "add<mode>3_overflow_carry_in_0"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+        (compare:CCV (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+                             (reg:P CA_REGNO))
+                     (unspec:P [(plus:P (match_dup 1) (reg:P CA_REGNO))]
+                               UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (plus:P (match_dup 1) (reg:P CA_REGNO)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "addzeo %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+(define_insn "add<mode>3_overflow_carry_in_m1"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+        (compare:CCV (plus:P (plus:P (match_operand:P 1 "gpc_reg_operand" "r")
+                                     (reg:P CA_REGNO))
+                             (const_int -1))
+                     (unspec:P [(plus:P (match_dup 1) (reg:P CA_REGNO))
+                                (const_int -1)] UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (plus:P (plus:P (match_dup 1) (reg:P CA_REGNO))
+                (const_int -1)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "addmeo %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_expand "one_cmpl<mode>2"
   [(set (match_operand:SDI 0 "gpc_reg_operand" "")
         (not:SDI (match_operand:SDI 1 "gpc_reg_operand" "")))]
@@ -1980,6 +2109,39 @@ (define_expand "sub<mode>3"
     }
 })
 
+(define_expand "subv<mode>4"
+  [(match_operand:SDI 0 "register_operand")
+   (match_operand:SDI 1 "register_operand")
+   (match_operand:SDI 2 "register_operand")
+   (match_operand 3 "")]
+  "!(<MODE>mode == SImode && TARGET_POWERPC64)"
+{
+  rtx cc_reg = gen_reg_rtx (CCVmode);
+
+  if (<MODE>mode == DImode && !TARGET_POWERPC64)
+    {
+      rtx lo0 = gen_lowpart (SImode, operands[0]);
+      rtx lo1 = gen_lowpart (SImode, operands[1]);
+      rtx lo2 = gen_lowpart (SImode, operands[2]);
+      rtx hi0 = gen_highpart (SImode, operands[0]);
+      rtx hi1 = gen_highpart (SImode, operands[1]);
+      rtx hi2 = gen_highpart (SImode, operands[2]);
+
+      emit_insn (gen_subfsi3_carry (lo0, lo2, lo1));
+      emit_insn (gen_subfsi3_overflow_carry_in (hi0, hi2, hi1, cc_reg));
+    }
+  else
+    emit_insn (gen_subf<mode>3_overflow (operands[0], operands[2], operands[1],
+                                         cc_reg));
+
+  rtx cond = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+                               gen_rtx_IF_THEN_ELSE (VOIDmode, cond,
+                                                     loc_ref, pc_rtx)));
+  DONE;
+})
+
 (define_insn "*subf<mode>3"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (minus:GPR (match_operand:GPR 2 "gpc_reg_operand" "r")
@@ -2075,6 +2237,21 @@ (define_insn "*subf<mode>3_imm_carry_m1"
   [(set_attr "type" "add")])
 
 
+;; The OV flag is set on overflow in Pmode only.
+(define_insn "subf<mode>3_overflow"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+        (compare:CCV (minus:P (match_operand:P 2 "gpc_reg_operand" "r")
+                              (match_operand:P 1 "gpc_reg_operand" "r"))
+                     (unspec:P [(match_dup 2) (match_dup 1)]
+                               UNSPEC_SUBV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (minus:P (match_dup 2) (match_dup 1)))]
+  ""
+  "subfo %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_expand "subf<mode>3_carry_in"
   [(parallel [
      (set (match_operand:GPR 0 "gpc_reg_operand")
@@ -2135,6 +2312,87 @@ (define_insn "subf<mode>3_carry_in_xx"
   [(set_attr "type" "add")])
 
 
+(define_expand "subf<mode>3_overflow_carry_in"
+  [(parallel [
+     (set (match_operand:CCV 3 "cc_reg_operand")
+          (compare:CCV
+            (plus:P (plus:P (not:P (match_operand:P 1 "gpc_reg_operand"))
+                            (reg:P CA_REGNO))
+                    (match_operand:P 2 "adde_operand"))
+            (unspec:P [(plus:P (not:P (match_dup 1)) (reg:P CA_REGNO))
+                       (match_dup 2)] UNSPEC_ADDV)))
+     (set (match_operand:P 0 "gpc_reg_operand")
+          (plus:P (plus:P (not:P (match_dup 1)) (reg:P CA_REGNO))
+                  (match_dup 2)))
+     (clobber (reg:P CA_REGNO))])]
+  ""
+{
+  if (operands[2] == const0_rtx)
+    {
+      emit_insn (gen_subf<mode>3_overflow_carry_in_0 (operands[0],
+                                                      operands[1],
+                                                      operands[3]));
+      DONE;
+    }
+  if (operands[2] == constm1_rtx)
+    {
+      emit_insn (gen_subf<mode>3_overflow_carry_in_m1 (operands[0],
+                                                       operands[1],
+                                                       operands[3]));
+      DONE;
+    }
+})
+
+(define_insn "*sub<mode>3_overflow_carry_in_internal"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+        (compare:CCV
+          (plus:P (plus:P (not:P (match_operand:P 1 "gpc_reg_operand" "r"))
+                          (reg:P CA_REGNO))
+                  (match_operand:P 2 "gpc_reg_operand" "r"))
+          (unspec:P [(plus:P (not:P (match_dup 1)) (reg:P CA_REGNO))
+                     (match_dup 2)] UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (plus:P (plus:P (not:P (match_dup 1)) (reg:P CA_REGNO))
+                (match_dup 2)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "subfeo %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+(define_insn "subf<mode>3_overflow_carry_in_0"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+        (compare:CCV
+          (plus:P (not:P (match_operand:P 1 "gpc_reg_operand" "r"))
+                  (reg:P CA_REGNO))
+          (unspec:P [(not:P (match_dup 1)) (reg:P CA_REGNO)]
+                    UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (plus:P (not:P (match_dup 1)) (reg:P CA_REGNO)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "subfzeo %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+(define_insn "subf<mode>3_overflow_carry_in_m1"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+        (compare:CCV
+          (plus:P (minus:P (reg:P CA_REGNO)
+                           (match_operand:P 1 "gpc_reg_operand" "r"))
+                  (const_int -2))
+          (unspec:P [(minus:P (reg:P CA_REGNO) (match_dup 1))
+                     (const_int -2)] UNSPEC_ADDV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (plus:P (minus:P (reg:P CA_REGNO) (match_dup 1))
+                (const_int -2)))
+   (clobber (reg:P CA_REGNO))]
+  ""
+  "subfmeo %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_insn "neg<mode>2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (neg:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")))]
@@ -2142,6 +2400,35 @@ (define_insn "neg<mode>2"
   "neg %0,%1"
   [(set_attr "type" "add")])
 
+(define_expand "negv<mode>3"
+  [(match_operand:SDI 0 "register_operand")
+   (match_operand:SDI 1 "register_operand")
+   (match_operand 2 "")]
+  "!(<MODE>mode == SImode && TARGET_POWERPC64)"
+{
+  rtx cc_reg = gen_reg_rtx (CCVmode);
+
+  if (<MODE>mode == DImode && !TARGET_POWERPC64)
+    {
+      rtx lo0 = gen_lowpart (SImode, operands[0]);
+      rtx lo1 = gen_lowpart (SImode, operands[1]);
+      rtx hi0 = gen_highpart (SImode, operands[0]);
+      rtx hi1 = gen_highpart (SImode, operands[1]);
+
+      emit_insn (gen_subfsi3_carry (lo0, lo1, const0_rtx));
+      emit_insn (gen_subfsi3_overflow_carry_in_0 (hi0, hi1, cc_reg));
+    }
+  else
+    emit_insn (gen_neg<mode>2_overflow (operands[0], operands[1], cc_reg));
+
+  rtx cond = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[2]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+                               gen_rtx_IF_THEN_ELSE (VOIDmode, cond,
+                                                     loc_ref, pc_rtx)));
+  DONE;
+})
+
 (define_insn_and_split "*neg<mode>2_dot"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
         (compare:CC (neg:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r"))
@@ -2184,6 +2471,18 @@ (define_insn_and_split "*neg<mode>2_dot2
    (set_attr "length" "4,8")])
 
 
+(define_insn "neg<mode>2_overflow"
+  [(set (match_operand:CCV 2 "cc_reg_operand" "=y")
+        (compare:CCV (neg:P (match_operand:P 1 "gpc_reg_operand" "r"))
+                     (unspec:P [(match_dup 1)] UNSPEC_NEGV)))
+   (set (match_operand:P 0 "gpc_reg_operand" "=r")
+        (neg:P (match_dup 1)))]
+  ""
+  "nego %0,%1\;mcrxr %C2"
+  [(set_attr "type" "add")
+   (set_attr "length" "8")])
+
+
 (define_insn "clz<mode>2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
         (clz:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")))]
@@ -2759,6 +3058,24 @@ (define_insn "mul<mode>3"
                 (const_string "16")]
         (const_string "")))])
 
+(define_expand "mulv<mode>4"
+  [(match_operand:SDI 0 "register_operand")
+   (match_operand:SDI 1 "register_operand")
+   (match_operand:SDI 2 "register_operand")
+   (match_operand 3 "")]
+  "<MODE>mode == SImode || TARGET_POWERPC64"
+{
+  rtx cc_reg = gen_reg_rtx (CCVmode);
+  emit_insn (gen_mul<mode>3_overflow (operands[0], operands[1], operands[2],
+                                      cc_reg));
+  rtx cond = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
+  rtx loc_ref = gen_rtx_LABEL_REF (VOIDmode, operands[3]);
+  emit_jump_insn (gen_rtx_SET (pc_rtx,
+                               gen_rtx_IF_THEN_ELSE (VOIDmode, cond,
+                                                     loc_ref, pc_rtx)));
+  DONE;
+})
+
 (define_insn_and_split "*mul<mode>3_dot"
   [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
        (compare:CC (mult:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
@@ -2807,6 +3124,20 @@ (define_insn_and_split "*mul<mode>3_dot2
    (set_attr "dot" "yes")
    (set_attr "length" "4,8")])
 
+;; The OV flag is set on overflow in SImode for mullw and DImode for mulld.
+(define_insn "mul<mode>3_overflow"
+  [(set (match_operand:CCV 3 "cc_reg_operand" "=y")
+        (compare:CCV (mult:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
+                               (match_operand:GPR 2 "gpc_reg_operand" "r"))
+                     (unspec:GPR [(match_dup 1) (match_dup 2)]
+                                 UNSPEC_MULV)))
+   (set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+        (mult:GPR (match_dup 1) (match_dup 2)))]
+  ""
+  "mull<wd>o %0,%1,%2\;mcrxr %C3"
+  [(set_attr "type" "mul")
+   (set_attr "length" "8")])
+
 
 (define_expand "mul<mode>3_highpart"
   [(set (match_operand:GPR 0 "gpc_reg_operand")
Index: testsuite/gcc.target/powerpc/overflow-1.c
===================================================================
--- testsuite/gcc.target/powerpc/overflow-1.c	(revision 0)
+++ testsuite/gcc.target/powerpc/overflow-1.c	(working copy)
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-require-effective-target ilp32 } */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+bool my_add_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_add_overflow (a, b, res);
+}
+
+bool my_sub_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_sub_overflow (a, b, res);
+}
+
+bool my_neg_overflow (int32_t a, int32_t *res)
+{
+  return __builtin_sub_overflow (0, a, res);
+}
+
+bool my_mul_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_mul_overflow (a, b, res);
+}
+
+/* { dg-final { scan-assembler-times "addo" 1 } } */
+/* { dg-final { scan-assembler-times "subfo" 1 } } */
+/* { dg-final { scan-assembler-times "nego" 1 } } */
+/* { dg-final { scan-assembler-times "mullwo" 1 } } */
+/* { dg-final { scan-assembler-times "mcrxr" 4 } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
Index: testsuite/gcc.target/powerpc/overflow-2.c
===================================================================
--- testsuite/gcc.target/powerpc/overflow-2.c	(revision 0)
+++ testsuite/gcc.target/powerpc/overflow-2.c	(working copy)
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-require-effective-target lp64 } */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+bool my_add_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_add_overflow (a, b, res);
+}
+
+bool my_sub_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_sub_overflow (a, b, res);
+}
+
+bool my_neg_overflow (int32_t a, int32_t *res)
+{
+  return __builtin_sub_overflow (0, a, res);
+}
+
+bool my_mul_overflow (int32_t a, int32_t b, int32_t *res)
+{
+  return __builtin_mul_overflow (a, b, res);
+}
+
+/* { dg-final { scan-assembler-not "addo" } } */
+/* { dg-final { scan-assembler-not "subfo" } } */
+/* { dg-final { scan-assembler-not "nego" } } */
+/* { dg-final { scan-assembler-times "mullwo" 1 } } */
+/* { dg-final { scan-assembler-times "mcrxr" 1 } } */
Index: testsuite/gcc.target/powerpc/overflow-3.c
===================================================================
--- testsuite/gcc.target/powerpc/overflow-3.c	(revision 0)
+++ testsuite/gcc.target/powerpc/overflow-3.c	(working copy)
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-require-effective-target ilp32 } */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+bool my_add_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_add_overflow (a, b, res);
+}
+
+bool my_sub_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_sub_overflow (a, b, res);
+}
+
+bool my_neg_overflow (int64_t a, int64_t *res)
+{
+  return __builtin_sub_overflow (0, a, res);
+}
+
+/* { dg-final { scan-assembler-times "addeo" 1 } } */
+/* { dg-final { scan-assembler-times "subfeo" 1 } } */
+/* { dg-final { scan-assembler-times "subfzeo" 1 } } */
+/* { dg-final { scan-assembler-times "mcrxr" 3 } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
Index: testsuite/gcc.target/powerpc/overflow-4.c
===================================================================
--- testsuite/gcc.target/powerpc/overflow-4.c	(revision 0)
+++ testsuite/gcc.target/powerpc/overflow-4.c	(working copy)
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-require-effective-target lp64 } */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+bool my_add_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_add_overflow (a, b, res);
+}
+
+bool my_sub_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_sub_overflow (a, b, res);
+}
+
+bool my_neg_overflow (int64_t a, int64_t *res)
+{
+  return __builtin_sub_overflow (0, a, res);
+}
+
+bool my_mul_overflow (int64_t a, int64_t b, int64_t *res)
+{
+  return __builtin_mul_overflow (a, b, res);
+}
+
+/* { dg-final { scan-assembler-times "addo" 1 } } */
+/* { dg-final { scan-assembler-times "subfo" 1 } } */
+/* { dg-final { scan-assembler-times "nego" 1 } } */
+/* { dg-final { scan-assembler-times "mulldo" 1 } } */
+/* { dg-final { scan-assembler-times "mcrxr" 4 } } */
+/* { dg-final { scan-assembler-not "cmp" } } */
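
As a side note for reviewers, and not part of the patch: the new tests only
scan the assembly, so they do not check the computed overflow bit itself.
Below is a minimal sketch of a portable reference for the 32-bit cases,
assuming a compiler that provides the __builtin_*_overflow builtins (GCC 5+
or a recent Clang).  The ref_* helper names are hypothetical and exist only
for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reference helpers, not part of the patch.  */

/* a + b overflows iff both operands have the same sign and the wrapped
   sum has the opposite sign.  */
bool ref_add_overflow (int32_t a, int32_t b)
{
  uint32_t ua = (uint32_t) a, ub = (uint32_t) b;
  uint32_t sum = ua + ub;          /* unsigned arithmetic wraps mod 2^32 */
  return (((ua ^ sum) & (ub ^ sum)) >> 31) != 0;
}

/* a - b overflows iff the operands have different signs and the wrapped
   difference has a sign different from that of a.  */
bool ref_sub_overflow (int32_t a, int32_t b)
{
  uint32_t ua = (uint32_t) a, ub = (uint32_t) b;
  uint32_t diff = ua - ub;         /* unsigned arithmetic wraps mod 2^32 */
  return (((ua ^ ub) & (ua ^ diff)) >> 31) != 0;
}

/* a * b overflows iff the exact 64-bit product does not fit in 32 bits.  */
bool ref_mul_overflow (int32_t a, int32_t b)
{
  int64_t wide = (int64_t) a * (int64_t) b;
  return wide < INT32_MIN || wide > INT32_MAX;
}

/* Return true when all three references agree with the builtins for the
   given operands.  */
bool ref_matches_builtins (int32_t a, int32_t b)
{
  int32_t r;
  return ref_add_overflow (a, b) == __builtin_add_overflow (a, b, &r)
         && ref_sub_overflow (a, b) == __builtin_sub_overflow (a, b, &r)
         && ref_mul_overflow (a, b) == __builtin_mul_overflow (a, b, &r);
}
```

Calling ref_matches_builtins over a handful of boundary values (INT32_MIN,
INT32_MAX, 0, -1) in a dg-do run test would be one way to complement the
scan-assembler checks; negation is covered by the subtraction helper with a
first operand of 0, as in my_neg_overflow.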