Message ID: 20180520001726.27808-1-msalter@redhat.com
State:      New
Series:     [RHEL-8] arm64: add missing early clobber in atomic64_dec_if_positive()
Hi Mark,

Thanks for reporting this.

On Sat, May 19, 2018 at 08:17:26PM -0400, Mark Salter wrote:
> When running a kernel compiled with gcc8 on a machine using LSE, I
> get:
>
> Unable to handle kernel paging request at virtual address 11111122222221

[...]

> The fault happens at the casal insn of inlined atomic64_dec_if_positive().
> The inline asm code in that function has:
>
> "1:	ldr	x30, %[v]\n"
> "	subs	%[ret], x30, #1\n"
> "	b.lt	2f\n"
> "	casal	x30, %[ret], %[v]\n"
> "	sub	x30, x30, #1\n"
> "	sub	x30, x30, %[ret]\n"
> "	cbnz	x30, 1b\n"
> "2:")
> : [ret] "+r" (x0), [v] "+Q" (v->counter)
>
> gcc8 used register x0 for both [ret] and [v] and the subs was
> clobbering [v] before it was used for casal. Gcc is free to do
> this because [ret] lacks an early clobber modifier. So add one
> to tell gcc a separate register is needed for [v].

Oh blimey, it looks like GCC is realising that counter is at offset 0
of atomic_t and therefore assigns the same register for [ret] and [v],
which is actually forced to be x0 by the 'register' local variable in
C code. The "+Q" constraint only says that the memory is read/write, so
the pointer is fair game.

I agree with your fix, but we also need to fix up the other places relying
on this. Patch below -- please yell if you think I missed any.

Cheers,

Will

--->8

From 3d9417b28ed2588c33b7e54e6681c88f0224201a Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Mon, 21 May 2018 17:44:57 +0100
Subject: [PATCH] arm64: lse: Add early clobbers to some input/output asm
 operands

For LSE atomics that read and write a register operand, we need to
ensure that these operands are annotated as "early clobber" if the
register is written before all of the input operands have been consumed.
Failure to do so can result in the compiler allocating the same register
to both operands, leading to splats such as:

  Unable to handle kernel paging request at virtual address 11111122222221
  [...]
  x1 : 1111111122222222 x0 : 1111111122222221
  Process swapper/0 (pid: 1, stack limit = 0x000000008209f908)
  Call trace:
   test_atomic64+0x1360/0x155c

where x0 has been allocated as both the value to be stored and also the
atomic_t pointer.

This patch adds the missing clobbers.

Cc: <stable@vger.kernel.org>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Reported-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/atomic_lse.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9ef0797380cb..f9b0b09153e0 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -117,7 +117,7 @@ static inline void atomic_and(int i, atomic_t *v)
 	/* LSE atomics */
 	"	mvn	%w[i], %w[i]\n"
 	"	stclr	%w[i], %[v]")
-	: [i] "+r" (w0), [v] "+Q" (v->counter)
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)
 	: "r" (x1)
 	: __LL_SC_CLOBBERS);
 }
@@ -135,7 +135,7 @@ static inline int atomic_fetch_and##name(int i, atomic_t *v)	\
 	/* LSE atomics */					\
 	"	mvn	%w[i], %w[i]\n"				\
 	"	ldclr" #mb "	%w[i], %w[i], %[v]")		\
-	: [i] "+r" (w0), [v] "+Q" (v->counter)			\
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)			\
 	: "r" (x1)						\
 	: __LL_SC_CLOBBERS, ##cl);				\
 								\
@@ -161,7 +161,7 @@ static inline void atomic_sub(int i, atomic_t *v)
 	/* LSE atomics */
 	"	neg	%w[i], %w[i]\n"
 	"	stadd	%w[i], %[v]")
-	: [i] "+r" (w0), [v] "+Q" (v->counter)
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)
 	: "r" (x1)
 	: __LL_SC_CLOBBERS);
 }
@@ -180,7 +180,7 @@ static inline int atomic_sub_return##name(int i, atomic_t *v)	\
 	"	neg	%w[i], %w[i]\n"				\
 	"	ldadd" #mb "	%w[i], w30, %[v]\n"		\
 	"	add	%w[i], %w[i], w30")			\
-	: [i] "+r" (w0), [v] "+Q" (v->counter)			\
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)			\
 	: "r" (x1)						\
 	: __LL_SC_CLOBBERS , ##cl);				\
 								\
@@ -207,7 +207,7 @@ static inline int atomic_fetch_sub##name(int i, atomic_t *v)	\
 	/* LSE atomics */					\
 	"	neg	%w[i], %w[i]\n"				\
 	"	ldadd" #mb "	%w[i], %w[i], %[v]")		\
-	: [i] "+r" (w0), [v] "+Q" (v->counter)			\
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)			\
 	: "r" (x1)						\
 	: __LL_SC_CLOBBERS, ##cl);				\
 								\
@@ -314,7 +314,7 @@ static inline void atomic64_and(long i, atomic64_t *v)
 	/* LSE atomics */
 	"	mvn	%[i], %[i]\n"
 	"	stclr	%[i], %[v]")
-	: [i] "+r" (x0), [v] "+Q" (v->counter)
+	: [i] "+&r" (x0), [v] "+Q" (v->counter)
 	: "r" (x1)
 	: __LL_SC_CLOBBERS);
 }
@@ -332,7 +332,7 @@ static inline long atomic64_fetch_and##name(long i, atomic64_t *v)	\
 	/* LSE atomics */					\
 	"	mvn	%[i], %[i]\n"				\
 	"	ldclr" #mb "	%[i], %[i], %[v]")		\
-	: [i] "+r" (x0), [v] "+Q" (v->counter)			\
+	: [i] "+&r" (x0), [v] "+Q" (v->counter)			\
 	: "r" (x1)						\
 	: __LL_SC_CLOBBERS, ##cl);				\
 								\
@@ -358,7 +358,7 @@ static inline void atomic64_sub(long i, atomic64_t *v)
 	/* LSE atomics */
 	"	neg	%[i], %[i]\n"
 	"	stadd	%[i], %[v]")
-	: [i] "+r" (x0), [v] "+Q" (v->counter)
+	: [i] "+&r" (x0), [v] "+Q" (v->counter)
 	: "r" (x1)
 	: __LL_SC_CLOBBERS);
 }
@@ -377,7 +377,7 @@ static inline long atomic64_sub_return##name(long i, atomic64_t *v)	\
 	"	neg	%[i], %[i]\n"				\
 	"	ldadd" #mb "	%[i], x30, %[v]\n"		\
 	"	add	%[i], %[i], x30")			\
-	: [i] "+r" (x0), [v] "+Q" (v->counter)			\
+	: [i] "+&r" (x0), [v] "+Q" (v->counter)			\
 	: "r" (x1)						\
 	: __LL_SC_CLOBBERS, ##cl);				\
 								\
@@ -404,7 +404,7 @@ static inline long atomic64_fetch_sub##name(long i, atomic64_t *v)	\
 	/* LSE atomics */					\
 	"	neg	%[i], %[i]\n"				\
 	"	ldadd" #mb "	%[i], %[i], %[v]")		\
-	: [i] "+r" (x0), [v] "+Q" (v->counter)			\
+	: [i] "+&r" (x0), [v] "+Q" (v->counter)			\
 	: "r" (x1)						\
 	: __LL_SC_CLOBBERS, ##cl);				\
 								\
@@ -435,7 +435,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
 	"	sub	x30, x30, %[ret]\n"
 	"	cbnz	x30, 1b\n"
 	"2:")
-	: [ret] "+r" (x0), [v] "+Q" (v->counter)
+	: [ret] "+&r" (x0), [v] "+Q" (v->counter)
 	:
 	: __LL_SC_CLOBBERS, "cc", "memory");
 
@@ -516,7 +516,7 @@ static inline long __cmpxchg_double##name(unsigned long old1,	\
 	"	eor	%[old1], %[old1], %[oldval1]\n"		\
 	"	eor	%[old2], %[old2], %[oldval2]\n"		\
 	"	orr	%[old1], %[old1], %[old2]")		\
-	: [old1] "+r" (x0), [old2] "+r" (x1),			\
+	: [old1] "+&r" (x0), [old2] "+&r" (x1),			\
 	  [v] "+Q" (*(unsigned long *)ptr)			\
 	: [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4),	\
 	  [oldval1] "r" (oldval1), [oldval2] "r" (oldval2)	\
-- 
2.1.4
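To make the constraint semantics above concrete, here is a minimal, hypothetical
user-space sketch (AArch64-only, not part of either patch): a "Q" memory operand
only promises that the memory location itself is read or written, so unless the
register operand carries the early clobber '&', GCC may materialise the memory
operand's address in the very register the asm template writes first.

	/* early_clobber_demo.c -- hypothetical illustration, not kernel code.
	 * Build on an AArch64 machine with: gcc -O2 -o demo early_clobber_demo.c */
	#include <stdio.h>

	static long load_after_scratch(long *p)
	{
		long ret;

		/* The asm writes [ret] before it reads [mem]. With "=&r" the
		 * compiler must keep [ret] away from the register addressing
		 * [mem]; with a plain "=r" it could legally pick the same
		 * register, and the mov would corrupt the address before the
		 * ldr dereferences it. */
		asm("mov	%[ret], #0\n\t"
		    "ldr	%[ret], %[mem]"
		    : [ret] "=&r" (ret)
		    : [mem] "Q" (*p));

		return ret;
	}

	int main(void)
	{
		long val = 42;

		printf("%ld\n", load_after_scratch(&val));	/* prints 42 */
		return 0;
	}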
On Mon, 2018-05-21 at 18:00 +0100, Will Deacon wrote:

[...]

> Oh blimey, it looks like GCC is realising that counter is at offset 0
> of atomic_t and therefore assigns the same register for [ret] and [v],
> which is actually forced to be x0 by the 'register' local variable in
> C code. The "+Q" constraint only says that the memory is read/write, so
> the pointer is fair game.
> 
> I agree with your fix, but we also need to fix up the other places relying
> on this. Patch below -- please yell if you think I missed any.

I looked at the other places but figured they were okay because we're
explicitly using separate registers. But I suppose the early clobber
is the right thing to do in any case.
On Mon, May 21, 2018 at 01:18:39PM -0400, Mark Salter wrote:

[...]

> I looked at the other places but figured they were okay because we're
> explicitly using separate registers. But I suppose the early clobber
> is the right thing to do in any case.

I was worried about silly things like a caller doing:

	atomic64_and((long)v, v);

and then GCC figuring out that the two values were equal and allocating
the same register.

Will
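The scenario Will sketches can be written out as a small, hypothetical
user-space stand-in for atomic64_and() (AArch64 with LSE, built with
-march=armv8.1-a; an illustration, not the kernel code): when the caller
passes the pointer value as the integer argument, the compiler knows x0
already holds the address of the counter, and without the early clobber
on [i] it could address the "Q" operand through x0, which the mvn then
corrupts before stclr stores through it.

	/* Hypothetical stand-in for atomic64_and(); the mvn + stclr pair
	 * mirrors the sequence in atomic_lse.h. */
	static inline void and64(long i, long *counter)
	{
		register long x0 asm ("x0") = i;
		register long *x1 asm ("x1") = counter;

		asm volatile(
		"	mvn	%[i], %[i]\n"		/* writes [i] first...                */
		"	stclr	%[i], %[v]"		/* ...then uses the address of [v]    */
		: [i] "+&r" (x0), [v] "+Q" (*counter)	/* '&' keeps the address register off x0 */
		: "r" (x1)
		: "memory");
	}

	int main(void)
	{
		static long flags = 0xffff;

		/* The worrying caller: both arguments carry the same value. */
		and64((long)&flags, &flags);
		return 0;
	}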
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9ef0797380cb..99fa69c9c3cf 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -435,7 +435,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
 	"	sub	x30, x30, %[ret]\n"
 	"	cbnz	x30, 1b\n"
 	"2:")
-	: [ret] "+r" (x0), [v] "+Q" (v->counter)
+	: [ret] "+&r" (x0), [v] "+Q" (v->counter)
 	:
 	: __LL_SC_CLOBBERS, "cc", "memory");
When running a kernel compiled with gcc8 on a machine using LSE, I get:

 Unable to handle kernel paging request at virtual address 11111122222221
 Mem abort info:
   ESR = 0x96000021
   Exception class = DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
 Data abort info:
   ISV = 0, ISS = 0x00000021
   CM = 0, WnR = 0
 [0011111122222221] address between user and kernel address ranges
 Internal error: Oops: 96000021 [#1] SMP
 ...
 pstate: 20400009 (nzCv daif +PAN -UAO)
 pc : test_atomic64+0x1360/0x155c
 lr : 0x1111111122222222
 sp : ffff00000bc6fd60
 x29: ffff00000bc6fd60 x28: 0000000000000000
 x27: 0000000000000000 x26: ffff000008f04460
 x25: ffff000008de0584 x24: ffff000008e91060
 x23: aaa31337c001d00e x22: 999202269ddfadeb
 x21: aaa31337c001d00c x20: bbb42448e223f22f
 x19: aaa31337c001d00d x18: 0000000000000010
 x17: 0000000000000222 x16: 00000000000010e0
 x15: ffffffffffffffff x14: ffff000009233c08
 x13: ffff000089925a8f x12: ffff000009925a97
 x11: ffff00000927f000 x10: ffff00000bc6fac0
 x9 : 00000000ffffffd0 x8 : ffff00000853fdf8
 x7 : 00000000deadbeef x6 : ffff00000bc6fda0
 x5 : aaa31337c001d00d x4 : deadbeefdeafcafe
 x3 : aaa31337c001d00d x2 : aaa31337c001d00e
 x1 : 1111111122222222 x0 : 1111111122222221
 Process swapper/0 (pid: 1, stack limit = 0x000000008209f908)
 Call trace:
  test_atomic64+0x1360/0x155c
  test_atomics_init+0x10/0x28
  do_one_initcall+0x134/0x160
  kernel_init_freeable+0x18c/0x21c
  kernel_init+0x18/0x108
  ret_from_fork+0x10/0x1c
 Code: f90023e1 f940001e f10007c0 540000ab (c8fefc00)
 ---[ end trace 29569e7320c6e926 ]---

The fault happens at the casal insn of inlined atomic64_dec_if_positive().
The inline asm code in that function has:

   "1:	ldr	x30, %[v]\n"
   "	subs	%[ret], x30, #1\n"
   "	b.lt	2f\n"
   "	casal	x30, %[ret], %[v]\n"
   "	sub	x30, x30, #1\n"
   "	sub	x30, x30, %[ret]\n"
   "	cbnz	x30, 1b\n"
   "2:")
   : [ret] "+r" (x0), [v] "+Q" (v->counter)

gcc8 used register x0 for both [ret] and [v] and the subs was
clobbering [v] before it was used for casal. Gcc is free to do
this because [ret] lacks an early clobber modifier. So add one
to tell gcc a separate register is needed for [v].

Signed-off-by: Mark Salter <msalter@redhat.com>
---
 arch/arm64/include/asm/atomic_lse.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.17.0