diff mbox

[ARM,5/7,ping3] Adapt other atomic operations to ARMv8-M Baseline

Message ID e2650ed7-32e8-4cdb-40ff-4b5d39896761@foss.arm.com
State New

Commit Message

Thomas Preudhomme Oct. 27, 2016, 10:21 a.m. UTC
On 27/10/16 09:50, Kyrill Tkachov wrote:
> Hi Thomas,
>
> On 24/10/16 09:05, Thomas Preudhomme wrote:
>> Ping?
>>
>> Best regards,
>>
>> Thomas
>>
>> On 14/10/16 14:51, Thomas Preudhomme wrote:
>>> Ping?
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> On 03/10/16 17:45, Thomas Preudhomme wrote:
>>>> Ping?
>>>>
>>>> Best regards,
>>>>
>>>> Thomas
>>>>
>>>> On 22/09/16 14:47, Thomas Preudhomme wrote:
>>>>> Hi,
>>>>>

>>>>> This patch is part of a patch series to add support for atomic operations
>>>>> on ARMv8-M Baseline targets in GCC. This specific patch adds support for
>>>>> the remaining atomic operations (exchange, addition, subtraction, bitwise
>>>>> AND, OR, XOR and NAND) on ARMv8-M Baseline, doubleword integers excepted.
>>>>> As with the previous patch in the series, this mostly consists of adding
>>>>> Thumb-1 specific constraints to the atomic_* patterns to match those in
>>>>> thumb1.md for the non-atomic operations.

>>>>>

>>>>> ChangeLog entry is as follows:
>>>>>
>>>>> *** gcc/ChangeLog ***
>>>>>
>>>>> 2016-09-02  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>>>
>>>>>         * config/arm/arm.c (arm_split_atomic_op): Add function comment.  Add
>>>>>         logic to decide whether to copy over old value to register for new
>>>>>         value.
>>>>>         * config/arm/sync.md: Add comments explaining why mode and code
>>>>>         attributes are not defined in iterators.md.
>>>>>         (thumb1_atomic_op_str): New code attribute.
>>>>>         (thumb1_atomic_newop_str): Likewise.
>>>>>         (thumb1_atomic_fetch_op_str): Likewise.
>>>>>         (thumb1_atomic_fetch_newop_str): Likewise.
>>>>>         (thumb1_atomic_fetch_oldop_str): Likewise.
>>>>>         (atomic_exchange<mode>): Add new ARMv8-M Baseline only alternatives
>>>>>         to mirror the more restrictive constraints of the Thumb-1 insns
>>>>>         after split compared to Thumb-2 counterpart insns.
>>>>>         (atomic_<sync_optab><mode>): Likewise.  Add comment to keep
>>>>>         constraints in sync with non-atomic version.
>>>>>         (atomic_nand<mode>): Likewise.
>>>>>         (atomic_fetch_<sync_optab><mode>): Likewise.
>>>>>         (atomic_fetch_nand<mode>): Likewise.
>>>>>         (atomic_<sync_optab>_fetch<mode>): Likewise.
>>>>>         (atomic_nand_fetch<mode>): Likewise.
>>>>>         * config/arm/thumb1.md (thumb1_addsi3): Add comment to keep
>>>>>         constraints in sync with atomic version.
>>>>>         (thumb1_subsi3_insn): Likewise.
>>>>>         (thumb1_andsi3_insn): Likewise.
>>>>>         (thumb1_iorsi3_insn): Likewise.
>>>>>         (thumb1_xorsi3_insn): Likewise.
>>>>>

>>>>> Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on
>>>>> all atomic and synchronization testcases in the testsuite [2]. The patch
>>>>> series was also bootstrapped with --enable-itm --enable-gomp on ARMv8-A in
>>>>> ARM and Thumb mode at optimization level -O1 and above [1] without any
>>>>> regression in the testsuite and no code generation difference in libitm
>>>>> and libgomp.
>>>>>
>>>>> Code generation for ARMv8-M Baseline has been manually examined and
>>>>> compared against ARMv8-A Thumb-2 for the following configurations without
>>>>> finding any issue:
>>>>>
>>>>> gcc.dg/atomic-op-2.c at -Os
>>>>> gcc.dg/atomic-compare-exchange-2.c at -Os
>>>>> gcc.dg/atomic-compare-exchange-3.c at -O3
>>>>>
>>>>> Is this ok for trunk?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Thomas
>>>>>
>>>>> [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g",
>>>>> "-O3 -g" and left undefined ("-O2 -g")
>>>>> [2] The exact list is:
>>>>>

>>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
>>>>> gcc/testsuite/gcc.dg/atomic-exchange-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-exchange-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-exchange-3.c
>>>>> gcc/testsuite/gcc.dg/atomic-fence.c
>>>>> gcc/testsuite/gcc.dg/atomic-flag.c
>>>>> gcc/testsuite/gcc.dg/atomic-generic.c
>>>>> gcc/testsuite/gcc.dg/atomic-generic-aux.c
>>>>> gcc/testsuite/gcc.dg/atomic-invalid-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-load-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-load-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-load-3.c
>>>>> gcc/testsuite/gcc.dg/atomic-lockfree.c
>>>>> gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
>>>>> gcc/testsuite/gcc.dg/atomic-noinline.c
>>>>> gcc/testsuite/gcc.dg/atomic-noinline-aux.c
>>>>> gcc/testsuite/gcc.dg/atomic-op-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-op-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-op-3.c
>>>>> gcc/testsuite/gcc.dg/atomic-op-6.c
>>>>> gcc/testsuite/gcc.dg/atomic-store-1.c
>>>>> gcc/testsuite/gcc.dg/atomic-store-2.c
>>>>> gcc/testsuite/gcc.dg/atomic-store-3.c
>>>>> gcc/testsuite/g++.dg/ext/atomic-1.C
>>>>> gcc/testsuite/g++.dg/ext/atomic-2.C
>>>>> gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-char.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-consume.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-int.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-release.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
>>>>> gcc/testsuite/gcc.target/arm/atomic-op-short.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
>>>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
>>>>> gcc/testsuite/gcc.target/arm/sync-1.c
>>>>> gcc/testsuite/gcc.target/arm/synchronize.c
>>>>> gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
>>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
>>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
>>>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/default.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/direct_list.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/single_value.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/user_pod.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/51811.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_assignment.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_conversion.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/base_classes.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/compare_exchange_lowering.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/explicit_instantiation/1.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/1.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/56012.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/aggregate.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/default.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/standard_layout.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/trivial.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/60940.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/65147.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/constexpr.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/copy_list.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/default.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/direct_list.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/single_value.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/bitwise.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/decrement.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/increment.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_assignment.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_conversion.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/standard_layout.cc
>>>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/trivial.cc
>>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/functions_std_c++0x.cc
>>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/macros.cc
>>>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc

>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 9e4ff0191358f9143ee487ecc0cd60eeb91950c8..fb09dcaf5b8bf322afa9c12446983e833e9d7898 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -28307,6 +28307,15 @@ arm_split_compare_and_swap (rtx operands[])
>      emit_label (label2);
>  }
>
> +/* Split an atomic operation pattern.  Operation is given by MODE and is one
> +   of PLUS, MINUS, IOR, XOR, SET (for an exchange operation) or NOT (for a nand

s/MODE/CODE/.

> +   operation).  Operation is performed on the content at MEM and on VALUE
> +   following the memory model MODEL_RTX.  The content at MEM before and after
> +   the operation is returned in OLD_OUT and NEW_OUT respectively while the
> +   success of the operation is returned in COND.  Using a scratch register or
> +   an operand register for these determines what result is returned for that
> +   pattern.  */
> +
>  void
>  arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,

<snip>

> This is ok with that change.


Done. Attached is the patch that was committed.

Thanks.

Best regards,

Thomas

Patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 781353e6f64b3fc6622f699dd7bad447192888aa..3c4c7042d9c2101619722b5822b3d1ca37d637b9 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28342,6 +28342,15 @@  arm_split_compare_and_swap (rtx operands[])
     emit_label (label2);
 }
 
+/* Split an atomic operation pattern.  Operation is given by CODE and is one
+   of PLUS, MINUS, IOR, XOR, SET (for an exchange operation) or NOT (for a nand
+   operation).  Operation is performed on the content at MEM and on VALUE
+   following the memory model MODEL_RTX.  The content at MEM before and after
+   the operation is returned in OLD_OUT and NEW_OUT respectively while the
+   success of the operation is returned in COND.  Using a scratch register or
+   an operand register for these determines what result is returned for that
+   pattern.  */
+
 void
 arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 		     rtx value, rtx model_rtx, rtx cond)
@@ -28350,6 +28359,7 @@  arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
   rtx_code_label *label;
+  bool all_low_regs, bind_old_new;
   rtx x;
 
   bool is_armv8_sync = arm_arch8 && is_mm_sync (model);
@@ -28384,6 +28394,28 @@  arm_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 
   arm_emit_load_exclusive (mode, old_out, mem, use_acquire);
 
+  /* Does the operation require destination and first operand to use the same
+     register?  This is decided by register constraints of relevant insn
+     patterns in thumb1.md.  */
+  gcc_assert (!new_out || REG_P (new_out));
+  all_low_regs = REG_P (value) && REGNO_REG_CLASS (REGNO (value)) == LO_REGS
+		 && new_out && REGNO_REG_CLASS (REGNO (new_out)) == LO_REGS
+		 && REGNO_REG_CLASS (REGNO (old_out)) == LO_REGS;
+  bind_old_new =
+    (TARGET_THUMB1
+     && code != SET
+     && code != MINUS
+     && (code != PLUS || (!all_low_regs && !satisfies_constraint_L (value))));
+
+  /* We want to return the old value while putting the result of the operation
+     in the same register as the old value so copy the old value over to the
+     destination register and use that register for the operation.  */
+  if (old_out && bind_old_new)
+    {
+      emit_move_insn (new_out, old_out);
+      old_out = new_out;
+    }
+
   switch (code)
     {
     case SET:
diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
index 583b0af27d04ee8fba174e034a4c7b20760c38aa..e7be492ba60f9a4d1964922b53955866960c0450 100644
--- a/gcc/config/arm/sync.md
+++ b/gcc/config/arm/sync.md
@@ -248,15 +248,15 @@ 
   [(set_attr "arch" "32,v8mb,v8mb,v8mb")])
 
 (define_insn_and_split "atomic_exchange<mode>"
-  [(set (match_operand:QHSD 0 "s_register_operand" "=&r")	;; output
-	(match_operand:QHSD 1 "mem_noofs_operand" "+Ua"))	;; memory
+  [(set (match_operand:QHSD 0 "s_register_operand" "=&r,&r")	;; output
+	(match_operand:QHSD 1 "mem_noofs_operand" "+Ua,Ua"))	;; memory
    (set (match_dup 1)
 	(unspec_volatile:QHSD
-	  [(match_operand:QHSD 2 "s_register_operand" "r")	;; input
+	  [(match_operand:QHSD 2 "s_register_operand" "r,r")	;; input
 	   (match_operand:SI 3 "const_int_operand" "")]		;; model
 	  VUNSPEC_ATOMIC_XCHG))
    (clobber (reg:CC CC_REGNUM))
-   (clobber (match_scratch:SI 4 "=&r"))]
+   (clobber (match_scratch:SI 4 "=&r,&l"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"
@@ -265,7 +265,11 @@ 
     arm_split_atomic_op (SET, operands[0], NULL, operands[1],
 			 operands[2], operands[3], operands[4]);
     DONE;
-  })
+  }
+  [(set_attr "arch" "32,v8mb")])
+
+;; The following mode and code attribute are defined here because they are
+;; specific to atomics and are not needed anywhere else.
 
 (define_mode_attr atomic_op_operand
   [(QI "reg_or_int_operand")
@@ -276,16 +280,24 @@ 
 (define_mode_attr atomic_op_str
   [(QI "rn") (HI "rn") (SI "rn") (DI "r")])
 
+(define_code_attr thumb1_atomic_op_str
+  [(ior "l,l") (xor "l,l") (and "l,l") (plus "lIJL,r") (minus "lPd,lPd")])
+
+(define_code_attr thumb1_atomic_newop_str
+  [(ior "&l,&l") (xor "&l,&l") (and "&l,&l") (plus "&l,&r") (minus "&l,&l")])
+
+;; Constraints of this pattern must be at least as strict as those of the non
+;; atomic operations in thumb1.md and aim to be as permissive.
 (define_insn_and_split "atomic_<sync_optab><mode>"
-  [(set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua")
+  [(set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua,Ua,Ua")
 	(unspec_volatile:QHSD
 	  [(syncop:QHSD (match_dup 0)
-	     (match_operand:QHSD 1 "<atomic_op_operand>" "<atomic_op_str>"))
+	     (match_operand:QHSD 1 "<atomic_op_operand>" "<atomic_op_str>,<thumb1_atomic_op_str>"))
 	   (match_operand:SI 2 "const_int_operand")]		;; model
 	  VUNSPEC_ATOMIC_OP))
    (clobber (reg:CC CC_REGNUM))
-   (clobber (match_scratch:QHSD 3 "=&r"))
-   (clobber (match_scratch:SI 4 "=&r"))]
+   (clobber (match_scratch:QHSD 3 "=&r,<thumb1_atomic_newop_str>"))
+   (clobber (match_scratch:SI 4 "=&r,&l,&l"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"
@@ -294,19 +306,22 @@ 
     arm_split_atomic_op (<CODE>, NULL, operands[3], operands[0],
 			 operands[1], operands[2], operands[4]);
     DONE;
-  })
+  }
+  [(set_attr "arch" "32,v8mb,v8mb")])
 
+;; Constraints of this pattern must be at least as strict as those of the non
+;; atomic NANDs in thumb1.md and aim to be as permissive.
 (define_insn_and_split "atomic_nand<mode>"
-  [(set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua")
+  [(set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua,Ua")
 	(unspec_volatile:QHSD
 	  [(not:QHSD
 	     (and:QHSD (match_dup 0)
-	       (match_operand:QHSD 1 "<atomic_op_operand>" "<atomic_op_str>")))
+	       (match_operand:QHSD 1 "<atomic_op_operand>" "<atomic_op_str>,l")))
 	   (match_operand:SI 2 "const_int_operand")]		;; model
 	  VUNSPEC_ATOMIC_OP))
    (clobber (reg:CC CC_REGNUM))
-   (clobber (match_scratch:QHSD 3 "=&r"))
-   (clobber (match_scratch:SI 4 "=&r"))]
+   (clobber (match_scratch:QHSD 3 "=&r,&l"))
+   (clobber (match_scratch:SI 4 "=&r,&l"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"
@@ -315,20 +330,38 @@ 
     arm_split_atomic_op (NOT, NULL, operands[3], operands[0],
 			 operands[1], operands[2], operands[4]);
     DONE;
-  })
+  }
+  [(set_attr "arch" "32,v8mb")])
+
+;; 3 alternatives are needed to represent constraints after split from
+;; thumb1_addsi3: (i) case where operand1 and destination can be in different
+;; registers, (ii) case where they are in the same low register and (iii) case
+;; when they are in the same register without restriction on the register.  We
+;; disparage slightly alternatives that require copying the old value into the
+;; register for the new value (see bind_old_new in arm_split_atomic_op).
+(define_code_attr thumb1_atomic_fetch_op_str
+  [(ior "l,l,l") (xor "l,l,l") (and "l,l,l") (plus "lL,?IJ,?r") (minus "lPd,lPd,lPd")])
 
+(define_code_attr thumb1_atomic_fetch_newop_str
+  [(ior "&l,&l,&l") (xor "&l,&l,&l") (and "&l,&l,&l") (plus "&l,&l,&r") (minus "&l,&l,&l")])
+
+(define_code_attr thumb1_atomic_fetch_oldop_str
+  [(ior "&r,&r,&r") (xor "&r,&r,&r") (and "&r,&r,&r") (plus "&l,&r,&r") (minus "&l,&l,&l")])
+
+;; Constraints of this pattern must be at least as strict as those of the non
+;; atomic operations in thumb1.md and aim to be as permissive.
 (define_insn_and_split "atomic_fetch_<sync_optab><mode>"
-  [(set (match_operand:QHSD 0 "s_register_operand" "=&r")
-	(match_operand:QHSD 1 "mem_noofs_operand" "+Ua"))
+  [(set (match_operand:QHSD 0 "s_register_operand" "=&r,<thumb1_atomic_fetch_oldop_str>")
+	(match_operand:QHSD 1 "mem_noofs_operand" "+Ua,Ua,Ua,Ua"))
    (set (match_dup 1)
 	(unspec_volatile:QHSD
 	  [(syncop:QHSD (match_dup 1)
-	     (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>"))
+	     (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>,<thumb1_atomic_fetch_op_str>"))
 	   (match_operand:SI 3 "const_int_operand")]		;; model
 	  VUNSPEC_ATOMIC_OP))
    (clobber (reg:CC CC_REGNUM))
-   (clobber (match_scratch:QHSD 4 "=&r"))
-   (clobber (match_scratch:SI 5 "=&r"))]
+   (clobber (match_scratch:QHSD 4 "=&r,<thumb1_atomic_fetch_newop_str>"))
+   (clobber (match_scratch:SI 5 "=&r,&l,&l,&l"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"
@@ -337,21 +370,24 @@ 
     arm_split_atomic_op (<CODE>, operands[0], operands[4], operands[1],
 			 operands[2], operands[3], operands[5]);
     DONE;
-  })
+  }
+  [(set_attr "arch" "32,v8mb,v8mb,v8mb")])
 
+;; Constraints of this pattern must be at least as strict as those of the non
+;; atomic NANDs in thumb1.md and aim to be as permissive.
 (define_insn_and_split "atomic_fetch_nand<mode>"
-  [(set (match_operand:QHSD 0 "s_register_operand" "=&r")
-	(match_operand:QHSD 1 "mem_noofs_operand" "+Ua"))
+  [(set (match_operand:QHSD 0 "s_register_operand" "=&r,&r")
+	(match_operand:QHSD 1 "mem_noofs_operand" "+Ua,Ua"))
    (set (match_dup 1)
 	(unspec_volatile:QHSD
 	  [(not:QHSD
 	     (and:QHSD (match_dup 1)
-	       (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>")))
+	       (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>,l")))
 	   (match_operand:SI 3 "const_int_operand")]		;; model
 	  VUNSPEC_ATOMIC_OP))
    (clobber (reg:CC CC_REGNUM))
-   (clobber (match_scratch:QHSD 4 "=&r"))
-   (clobber (match_scratch:SI 5 "=&r"))]
+   (clobber (match_scratch:QHSD 4 "=&r,&l"))
+   (clobber (match_scratch:SI 5 "=&r,&l"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"
@@ -360,20 +396,23 @@ 
     arm_split_atomic_op (NOT, operands[0], operands[4], operands[1],
 			 operands[2], operands[3], operands[5]);
     DONE;
-  })
+  }
+  [(set_attr "arch" "32,v8mb")])
 
+;; Constraints of this pattern must be at least as strict as those of the non
+;; atomic operations in thumb1.md and aim to be as permissive.
 (define_insn_and_split "atomic_<sync_optab>_fetch<mode>"
-  [(set (match_operand:QHSD 0 "s_register_operand" "=&r")
+  [(set (match_operand:QHSD 0 "s_register_operand" "=&r,<thumb1_atomic_newop_str>")
 	(syncop:QHSD
-	  (match_operand:QHSD 1 "mem_noofs_operand" "+Ua")
-	  (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>")))
+	  (match_operand:QHSD 1 "mem_noofs_operand" "+Ua,Ua,Ua")
+	  (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>,<thumb1_atomic_op_str>")))
    (set (match_dup 1)
 	(unspec_volatile:QHSD
 	  [(match_dup 1) (match_dup 2)
 	   (match_operand:SI 3 "const_int_operand")]		;; model
 	  VUNSPEC_ATOMIC_OP))
    (clobber (reg:CC CC_REGNUM))
-   (clobber (match_scratch:SI 4 "=&r"))]
+   (clobber (match_scratch:SI 4 "=&r,&l,&l"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"
@@ -382,21 +421,24 @@ 
     arm_split_atomic_op (<CODE>, NULL, operands[0], operands[1],
 			 operands[2], operands[3], operands[4]);
     DONE;
-  })
+  }
+  [(set_attr "arch" "32,v8mb,v8mb")])
 
+;; Constraints of this pattern must be at least as strict as those of the non
+;; atomic NANDs in thumb1.md and aim to be as permissive.
 (define_insn_and_split "atomic_nand_fetch<mode>"
-  [(set (match_operand:QHSD 0 "s_register_operand" "=&r")
+  [(set (match_operand:QHSD 0 "s_register_operand" "=&r,&l")
 	(not:QHSD
 	  (and:QHSD
-	    (match_operand:QHSD 1 "mem_noofs_operand" "+Ua")
-	    (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>"))))
+	    (match_operand:QHSD 1 "mem_noofs_operand" "+Ua,Ua")
+	    (match_operand:QHSD 2 "<atomic_op_operand>" "<atomic_op_str>,l"))))
    (set (match_dup 1)
 	(unspec_volatile:QHSD
 	  [(match_dup 1) (match_dup 2)
 	   (match_operand:SI 3 "const_int_operand")]		;; model
 	  VUNSPEC_ATOMIC_OP))
    (clobber (reg:CC CC_REGNUM))
-   (clobber (match_scratch:SI 4 "=&r"))]
+   (clobber (match_scratch:SI 4 "=&r,&l"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"
@@ -405,7 +447,8 @@ 
     arm_split_atomic_op (NOT, NULL, operands[0], operands[1],
 			 operands[2], operands[3], operands[4]);
     DONE;
-  })
+  }
+  [(set_attr "arch" "32,v8mb")])
 
 (define_insn "arm_load_exclusive<mode>"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 67f2878b45fe47abaaf24d97213613d1572dcd91..5f0dffba89145321351331db821bdeaa0df54b10 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -55,6 +55,10 @@ 
    (set_attr "type" "multiple")]
 )
 
+;; Changes to the constraints of this pattern must be propagated to those of
+;; atomic additions in sync.md and to the logic for bind_old_new in
+;; arm_split_atomic_op in arm.c.  These must be at least as strict as the
+;; constraints here and aim to be as permissive.
 (define_insn_and_split "*thumb1_addsi3"
   [(set (match_operand:SI          0 "register_operand" "=l,l,l,*rk,*hk,l,k,l,l,l")
 	(plus:SI (match_operand:SI 1 "register_operand" "%0,0,l,*0,*0,k,k,0,l,k")
@@ -131,6 +135,10 @@ 
    (set_attr "type" "multiple")]
 )
 
+;; Changes to the constraints of this pattern must be propagated to those of
+;; atomic subtractions in sync.md and to the logic for bind_old_new in
+;; arm_split_atomic_op in arm.c.  These must be at least as strict as the
+;; constraints here and aim to be as permissive.
 (define_insn "thumb1_subsi3_insn"
   [(set (match_operand:SI           0 "register_operand" "=l")
 	(minus:SI (match_operand:SI 1 "register_operand" "l")
@@ -173,6 +181,10 @@ 
    (set_attr "type" "muls")]
 )
 
+;; Changes to the constraints of this pattern must be propagated to those of
+;; atomic bitwise ANDs and NANDs in sync.md and to the logic for bind_old_new
+;; in arm_split_atomic_op in arm.c.  These must be at least as strict as the
+;; constraints here and aim to be as permissive.
 (define_insn "*thumb1_andsi3_insn"
   [(set (match_operand:SI         0 "register_operand" "=l")
 	(and:SI (match_operand:SI 1 "register_operand" "%0")
@@ -227,6 +239,10 @@ 
    (set_attr "type" "logics_reg")]
 )
 
+;; Changes to the constraints of this pattern must be propagated to those of
+;; atomic inclusive ORs in sync.md and to the logic for bind_old_new in
+;; arm_split_atomic_op in arm.c.  These must be at least as strict as the
+;; constraints here and aim to be as permissive.
 (define_insn "*thumb1_iorsi3_insn"
   [(set (match_operand:SI         0 "register_operand" "=l")
 	(ior:SI (match_operand:SI 1 "register_operand" "%0")
@@ -237,6 +253,10 @@ 
    (set_attr "conds" "set")
    (set_attr "type" "logics_reg")])
 
+;; Changes to the constraints of this pattern must be propagated to those of
+;; atomic exclusive ORs in sync.md and to the logic for bind_old_new in
+;; arm_split_atomic_op in arm.c.  These must be at least as strict as the
+;; constraints here and aim to be as permissive.
 (define_insn "*thumb1_xorsi3_insn"
   [(set (match_operand:SI         0 "register_operand" "=l")
 	(xor:SI (match_operand:SI 1 "register_operand" "%0")