From patchwork Wed Feb 29 12:50:23 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ulrich Weigand
X-Patchwork-Id: 7002
Message-Id: <201202291250.q1TCoNTX016768@d06av02.portsmouth.uk.ibm.com>
Subject: [PATCH, ARM] Generate usat/ssat instructions
To: gcc-patches@gcc.gnu.org
Date: Wed, 29 Feb 2012 13:50:23 +0100 (CET)
From: "Ulrich Weigand"
Cc: patches@linaro.org, ramana.radhakrishnan@linaro.org

Hello,

this patch adds support for generating usat/ssat instructions to match
code along the lines of:

  if (a < amin) return amin;
  else if (a > amax) return amax;
  else return a;

for appropriate values of amin/amax (amax must be a power of two minus
one, and amin either zero or -amax - 1).  This type of code actually
occurs in real-life code (e.g. codecs).

The above code is already translated into a sequence of SMIN/SMAX RTX
operations by expand.  The combine pass is able to fold those into a
single RTX pattern, so we only need to make such patterns available to
match the instruction.  Note that usat/ssat may in addition shift their
input operand; this is also supported by the patch.

There are already pre-existing patterns that use usat/ssat to implement
us_truncate/ss_truncate.  Those represent special cases of the general
instructions, and are left in place by this patch.  (However, some minor
fixes, e.g. to the sat_shift_operator predicate and insn attributes,
apply to those patterns too.)

Tested on arm-linux-gnueabi with no regressions.

OK for 4.8?

Bye,
Ulrich


ChangeLog:

gcc/
	* config/arm/arm.c (arm_sat_operator_match): New function.
	* config/arm/arm-protos.h (arm_sat_operator_match): Add prototype.
	* config/arm/arm.md ("insn" attribute): Add "sat" value.
	("SAT", "SATrev"): New code iterators.
	("SATlo", "SAThi"): New code iterator attributes.
	("*satsi_<SAT:code>"): New pattern.
	("*satsi_<SAT:code>_shift"): Likewise.
	* config/arm/arm-fixed.md ("arm_ssatsihi_shift"): Add "insn"
	and "shift" attributes.
	("arm_usatsihi"): Add "insn" attribute.
	* config/arm/predicates.md (sat_shift_operator): Allow multiplication
	by powers of two.  Do not allow shift by 32.

gcc/testsuite/
	* gcc.target/arm/sat-1.c: New test.

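For readers who want to try the bound check outside the compiler, the
logic of arm_sat_operator_match can be sketched in standalone C.  This is
only an illustration, not part of the patch: the names exact_log2_model
and sat_bounds_match are hypothetical, and plain long long values stand
in for the rtx constants that the real function reads via INTVAL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's exact_log2: return n if x == 2^n,
   otherwise -1.  */
static int
exact_log2_model (long long x)
{
  int n = 0;

  if (x <= 0 || (x & (x - 1)) != 0)
    return -1;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Model of the bound check: decide whether clamping to [lo, hi] can be
   done by usat or ssat, and if so report the saturation width that
   would be used as the immediate.  Either output pointer may be NULL,
   as in the patch.  */
bool
sat_bounds_match (long long lo, long long hi, int *width, bool *signed_sat)
{
  /* The high bound must be a power of two minus one.  */
  int log = exact_log2_model (hi + 1);
  if (log == -1)
    return false;

  /* Low bound zero: unsigned saturation to [0, 2^log - 1].  */
  if (lo == 0)
    {
      if (width != NULL)
        *width = log;
      if (signed_sat != NULL)
        *signed_sat = false;
      return true;
    }

  /* Low bound -hi - 1: signed saturation, one extra bit for the sign.  */
  if (lo == -hi - 1)
    {
      if (width != NULL)
        *width = log + 1;
      if (signed_sat != NULL)
        *signed_sat = true;
      return true;
    }

  return false;
}
```

With the bounds used in the new test, sat_bounds_match (0, 63, ...)
reports an unsigned saturation of width 6, and sat_bounds_match (-64, 63,
...) a signed saturation of width 7; a pair such as (0, 100) is rejected
because 101 is not a power of two.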
Index: gcc/testsuite/gcc.target/arm/sat-1.c
===================================================================
--- gcc/testsuite/gcc.target/arm/sat-1.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/sat-1.c	(revision 0)
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_arm_ok } */
+/* { dg-require-effective-target arm_arch_v6_ok } */
+/* { dg-options "-O2 -marm" } */
+/* { dg-add-options arm_arch_v6 } */
+
+
+static inline int sat1 (int a, int amin, int amax)
+{
+  if (a < amin) return amin;
+  else if (a > amax) return amax;
+  else return a;
+}
+
+static inline int sat2 (int a, int amin, int amax)
+{
+  if (a > amax) return amax;
+  else if (a < amin) return amin;
+  else return a;
+}
+
+int u1 (int x)
+{
+  return sat1 (x, 0, 63);
+}
+
+int us1 (int x)
+{
+  return sat1 (x >> 5, 0, 63);
+}
+
+int s1 (int x)
+{
+  return sat1 (x, -64, 63);
+}
+
+int ss1 (int x)
+{
+  return sat1 (x >> 5, -64, 63);
+}
+
+int u2 (int x)
+{
+  return sat2 (x, 0, 63);
+}
+
+int us2 (int x)
+{
+  return sat2 (x >> 5, 0, 63);
+}
+
+int s2 (int x)
+{
+  return sat2 (x, -64, 63);
+}
+
+int ss2 (int x)
+{
+  return sat2 (x >> 5, -64, 63);
+}
+
+/* { dg-final { scan-assembler-times "usat" 4 } } */
+/* { dg-final { scan-assembler-times "ssat" 4 } } */
+
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 184553)
+++ gcc/config/arm/arm.c	(working copy)
@@ -10041,6 +10041,42 @@
     }
 }
 
+/* Match pair of min/max operators that can be implemented via usat/ssat.  */
+
+bool
+arm_sat_operator_match (rtx lo_bound, rtx hi_bound,
+                        int *mask, bool *signed_sat)
+{
+  /* The high bound must be a power of two minus one.  */
+  int log = exact_log2 (INTVAL (hi_bound) + 1);
+  if (log == -1)
+    return false;
+
+  /* The low bound is either zero (for usat) or one less than the
+     negation of the high bound (for ssat).  */
+  if (INTVAL (lo_bound) == 0)
+    {
+      if (mask)
+        *mask = log;
+      if (signed_sat)
+        *signed_sat = false;
+
+      return true;
+    }
+
+  if (INTVAL (lo_bound) == -INTVAL (hi_bound) - 1)
+    {
+      if (mask)
+        *mask = log + 1;
+      if (signed_sat)
+        *signed_sat = true;
+
+      return true;
+    }
+
+  return false;
+}
+
 /* Return 1 if memory locations are adjacent.  */
 int
 adjacent_mem_locations (rtx a, rtx b)
Index: gcc/config/arm/arm-fixed.md
===================================================================
--- gcc/config/arm/arm-fixed.md	(revision 184553)
+++ gcc/config/arm/arm-fixed.md	(working copy)
@@ -374,6 +374,8 @@
   "TARGET_32BIT && arm_arch6"
   "ssat%?\\t%0, #16, %2%S1"
   [(set_attr "predicable" "yes")
+   (set_attr "insn" "sat")
+   (set_attr "shift" "1")
    (set_attr "type" "alu_shift")])
 
 (define_insn "arm_usatsihi"
@@ -381,4 +383,5 @@
 	(us_truncate:HI (match_operand:SI 1 "s_register_operand")))]
   "TARGET_INT_SIMD"
   "usat%?\\t%0, #16, %1"
-  [(set_attr "predicable" "yes")])
+  [(set_attr "predicable" "yes")
+   (set_attr "insn" "sat")])
Index: gcc/config/arm/arm-protos.h
===================================================================
--- gcc/config/arm/arm-protos.h	(revision 184553)
+++ gcc/config/arm/arm-protos.h	(working copy)
@@ -107,6 +107,7 @@
 extern int symbol_mentioned_p (rtx);
 extern int label_mentioned_p (rtx);
 extern RTX_CODE minmax_code (rtx);
+extern bool arm_sat_operator_match (rtx, rtx, int *, bool *);
 extern int adjacent_mem_locations (rtx, rtx);
 extern bool gen_ldm_seq (rtx *, int, bool);
 extern bool gen_stm_seq (rtx *, int);
Index: gcc/config/arm/predicates.md
===================================================================
--- gcc/config/arm/predicates.md	(revision 184553)
+++ gcc/config/arm/predicates.md	(working copy)
@@ -243,9 +243,11 @@
 
 ;; True for shift operators which can be used with saturation instructions.
 (define_special_predicate "sat_shift_operator"
-  (and (match_code "ashift,ashiftrt")
-       (match_test "GET_CODE (XEXP (op, 1)) == CONST_INT
-                    && ((unsigned HOST_WIDE_INT) INTVAL (XEXP (op, 1)) <= 32)")
+  (and (ior (and (match_code "mult")
+                 (match_test "power_of_two_operand (XEXP (op, 1), mode)"))
+            (and (match_code "ashift,ashiftrt")
+                 (match_test "GET_CODE (XEXP (op, 1)) == CONST_INT
+                              && ((unsigned HOST_WIDE_INT) INTVAL (XEXP (op, 1)) < 32)")))
        (match_test "mode == GET_MODE (op)")))
 
 ;; True for MULT, to identify which variant of shift_operator is in use.
Index: gcc/config/arm/arm.md
===================================================================
--- gcc/config/arm/arm.md	(revision 184553)
+++ gcc/config/arm/arm.md	(working copy)
@@ -283,7 +283,7 @@
 ;; scheduling information.
 
 (define_attr "insn"
-        "mov,mvn,smulxy,smlaxy,smlalxy,smulwy,smlawx,mul,muls,mla,mlas,umull,umulls,umlal,umlals,smull,smulls,smlal,smlals,smlawy,smuad,smuadx,smlad,smladx,smusd,smusdx,smlsd,smlsdx,smmul,smmulr,smmla,umaal,smlald,smlsld,clz,mrs,msr,xtab,sdiv,udiv,other"
+        "mov,mvn,smulxy,smlaxy,smlalxy,smulwy,smlawx,mul,muls,mla,mlas,umull,umulls,umlal,umlals,smull,smulls,smlal,smlals,smlawy,smuad,smuadx,smlad,smladx,smusd,smusdx,smlsd,smlsdx,smmul,smmulr,smmla,umaal,smlald,smlsld,clz,mrs,msr,xtab,sdiv,udiv,sat,other"
         (const_string "other"))
 
 ; TYPE attribute is used to detect floating point instructions which, if
@@ -3446,6 +3446,60 @@
                          (const_int 12)))]
 )
 
+(define_code_iterator SAT [smin smax])
+(define_code_iterator SATrev [smin smax])
+(define_code_attr SATlo [(smin "1") (smax "2")])
+(define_code_attr SAThi [(smin "2") (smax "1")])
+
+(define_insn "*satsi_<SAT:code>"
+  [(set (match_operand:SI 0 "s_register_operand" "=r")
+        (SAT:SI (SATrev:SI (match_operand:SI 3 "s_register_operand" "r")
+                           (match_operand:SI 1 "const_int_operand" "i"))
+                (match_operand:SI 2 "const_int_operand" "i")))]
+  "TARGET_32BIT && arm_arch6 && <SAT:CODE> != <SATrev:CODE>
+   && arm_sat_operator_match (operands[<SAT:SATlo>], operands[<SAT:SAThi>], NULL, NULL)"
+{
+  int mask;
+  bool signed_sat;
+  if (!arm_sat_operator_match (operands[<SAT:SATlo>], operands[<SAT:SAThi>],
+                               &mask, &signed_sat))
+    gcc_unreachable ();
+
+  operands[1] = GEN_INT (mask);
+  if (signed_sat)
+    return "ssat%?\t%0, %1, %3";
+  else
+    return "usat%?\t%0, %1, %3";
+}
+  [(set_attr "predicable" "yes")
+   (set_attr "insn" "sat")])
+
+(define_insn "*satsi_<SAT:code>_shift"
+  [(set (match_operand:SI 0 "s_register_operand" "=r")
+        (SAT:SI (SATrev:SI (match_operator:SI 3 "sat_shift_operator"
+                             [(match_operand:SI 4 "s_register_operand" "r")
+                              (match_operand:SI 5 "const_int_operand" "i")])
+                           (match_operand:SI 1 "const_int_operand" "i"))
+                (match_operand:SI 2 "const_int_operand" "i")))]
+  "TARGET_32BIT && arm_arch6 && <SAT:CODE> != <SATrev:CODE>
+   && arm_sat_operator_match (operands[<SAT:SATlo>], operands[<SAT:SAThi>], NULL, NULL)"
+{
+  int mask;
+  bool signed_sat;
+  if (!arm_sat_operator_match (operands[<SAT:SATlo>], operands[<SAT:SAThi>],
+                               &mask, &signed_sat))
+    gcc_unreachable ();
+
+  operands[1] = GEN_INT (mask);
+  if (signed_sat)
+    return "ssat%?\t%0, %1, %4%S3";
+  else
+    return "usat%?\t%0, %1, %4%S3";
+}
+  [(set_attr "predicable" "yes")
+   (set_attr "insn" "sat")
+   (set_attr "shift" "3")
+   (set_attr "type" "alu_shift")])
 
 ;; Shift and rotation insns
 
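
As a cross-check on what the new patterns are expected to compute, the
saturation arithmetic can be modelled in plain C and compared against the
clamp idiom from sat-1.c.  This is a sketch under stated assumptions, not
part of the patch: model_usat, model_ssat and clamp_model are hypothetical
names, the width ranges (0..31 for usat, 1..32 for ssat) follow the
architectural encoding of the instructions, and the optional input shift
is left out (it would be applied to x before the call).

```c
#include <stdint.h>

/* Hypothetical model of "usat Rd, #n, Rm": clamp x to [0, 2^n - 1].
   Assumes 0 <= n <= 31.  */
int32_t
model_usat (int32_t x, unsigned n)
{
  int32_t hi = (int32_t) ((UINT32_C (1) << n) - 1);

  if (x < 0)
    return 0;
  return x > hi ? hi : x;
}

/* Hypothetical model of "ssat Rd, #n, Rm": clamp x to
   [-2^(n-1), 2^(n-1) - 1].  Assumes 1 <= n <= 32.  */
int32_t
model_ssat (int32_t x, unsigned n)
{
  int32_t hi = (int32_t) ((UINT32_C (1) << (n - 1)) - 1);
  int32_t lo = -hi - 1;

  if (x < lo)
    return lo;
  return x > hi ? hi : x;
}

/* The source-level idiom the patterns are meant to match (same shape
   as sat1 in the new test).  */
int32_t
clamp_model (int32_t a, int32_t amin, int32_t amax)
{
  if (a < amin)
    return amin;
  else if (a > amax)
    return amax;
  else
    return a;
}
```

Under this model, clamp_model (x, 0, 63) agrees with model_usat (x, 6)
and clamp_model (x, -64, 63) agrees with model_ssat (x, 7), which are the
widths arm_sat_operator_match derives for the bounds used in the test.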