Message ID | 20151211120419.GD18828@arm.com |
---|---|
State | New |
Headers | show |
On Fri, Dec 11, 2015 at 01:13:19PM +0100, Peter Zijlstra wrote: > On Fri, Dec 11, 2015 at 12:04:19PM +0000, Will Deacon wrote: > > I think Andrew meant the atomic_xchg_acquire at the start of osq_lock, > > as opposed to "compare and swap". In which case, it does look like > > there's a bug here because there is nothing to order the initialisation > > of the node fields with publishing of the node, whether that's > > indirectly as a result of setting the tail to the current CPU or > > directly as a result of the WRITE_ONCE. > > Agreed, this does indeed look like a bug. If confirmed please write a > shiny changelog and I'll queue asap. Yup. I've failed to reproduce the issue locally, so we'll need to wait for Andrew and/or David to get back to us first. Will > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c > > index d092a0c9c2d4..05a37857ab55 100644 > > --- a/kernel/locking/osq_lock.c > > +++ b/kernel/locking/osq_lock.c > > @@ -93,10 +93,12 @@ bool osq_lock(struct optimistic_spin_queue *lock) > > node->cpu = curr; > > > > /* > > - * ACQUIRE semantics, pairs with corresponding RELEASE > > - * in unlock() uncontended, or fastpath. > > + * We need both ACQUIRE (pairs with corresponding RELEASE in > > + * unlock() uncontended, or fastpath) and RELEASE (to publish > > + * the node fields we just initialised) semantics when updating > > + * the lock tail. > > */ > > - old = atomic_xchg_acquire(&lock->tail, curr); > > + old = atomic_xchg(&lock->tail, curr); > > if (old == OSQ_UNLOCKED_VAL) > > return true; > > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Fri, Dec 11, 2015 at 01:26:47PM +0100, Peter Zijlstra wrote: > On Fri, Dec 11, 2015 at 12:18:00PM +0000, Will Deacon wrote: > > On Fri, Dec 11, 2015 at 01:13:19PM +0100, Peter Zijlstra wrote: > > > On Fri, Dec 11, 2015 at 12:04:19PM +0000, Will Deacon wrote: > > > > I think Andrew meant the atomic_xchg_acquire at the start of osq_lock, > > > > as opposed to "compare and swap". In which case, it does look like > > > > there's a bug here because there is nothing to order the initialisation > > > > of the node fields with publishing of the node, whether that's > > > > indirectly as a result of setting the tail to the current CPU or > > > > directly as a result of the WRITE_ONCE. > > > > > > Agreed, this does indeed look like a bug. If confirmed please write a > > > shiny changelog and I'll queue asap. > > > > Yup. I've failed to reproduce the issue locally, so we'll need to wait > > for Andrew and/or David to get back to us first. > > While we're there, the acquire in osq_wait_next() seems somewhat ill > documented too. > > I _think_ we need ACQUIRE semantics there because we want to strictly > order the lock-unqueue A,B,C steps and we get that with: > > A: SC > B: ACQ > C: Relaxed > > Similarly for unlock we want the WRITE_ONCE to happen after > osq_wait_next, but in that case we can even rely on the control > dependency there. Even for the lock-unqueue case, isn't B->C ordered by a control dependency because C consists only of stores? Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Fri, Dec 11, 2015 at 02:48:03PM +0100, Peter Zijlstra wrote: > On Fri, Dec 11, 2015 at 01:33:14PM +0000, Will Deacon wrote: > > On Fri, Dec 11, 2015 at 01:26:47PM +0100, Peter Zijlstra wrote: > > > > While we're there, the acquire in osq_wait_next() seems somewhat ill > > > documented too. > > > > > > I _think_ we need ACQUIRE semantics there because we want to strictly > > > order the lock-unqueue A,B,C steps and we get that with: > > > > > > A: SC > > > B: ACQ > > > C: Relaxed > > > > > > Similarly for unlock we want the WRITE_ONCE to happen after > > > osq_wait_next, but in that case we can even rely on the control > > > dependency there. > > > > Even for the lock-unqueue case, isn't B->C ordered by a control dependency > > because C consists only of stores? > > Hmm, indeed. So we could go fully relaxed on it I suppose, since the > same is true for the unlock site. In which case, we should be able to relax the xchg in there (osq_wait_next) too, right? Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Fri, Dec 11, 2015 at 06:11:28PM +0100, Peter Zijlstra wrote: > On Fri, Dec 11, 2015 at 02:06:49PM +0000, Will Deacon wrote: > > On Fri, Dec 11, 2015 at 02:48:03PM +0100, Peter Zijlstra wrote: > > > On Fri, Dec 11, 2015 at 01:33:14PM +0000, Will Deacon wrote: > > > > On Fri, Dec 11, 2015 at 01:26:47PM +0100, Peter Zijlstra wrote: > > > > > > > > While we're there, the acquire in osq_wait_next() seems somewhat ill > > > > > documented too. > > > > > > > > > > I _think_ we need ACQUIRE semantics there because we want to strictly > > > > > order the lock-unqueue A,B,C steps and we get that with: > > > > > > > > > > A: SC > > > > > B: ACQ > > > > > C: Relaxed > > > > > > > > > > Similarly for unlock we want the WRITE_ONCE to happen after > > > > > osq_wait_next, but in that case we can even rely on the control > > > > > dependency there. > > > > > > > > Even for the lock-unqueue case, isn't B->C ordered by a control dependency > > > > because C consists only of stores? > > > > > > Hmm, indeed. So we could go fully relaxed on it I suppose, since the > > > same is true for the unlock site. > > > > In which case, we should be able to relax the xchg in there (osq_wait_next) > > too, right? > > Can I have second thoughts an confuse matters again? ;-) > > A RmW-acq is a load-acquire+store. That means the store is _after_ the > load and thus not required for the completion of the control dependency. > > Therefore the store in question can reorder inside the conditional > control block's stores. > > Hmm? Ah yeah, it's the same thing we were discussing the other day! Whilst there is a form of control dependency from the SC part of the LL/SC sequence, it doesn't guarantee ordering in the same way that a load->store control dependency does. That is, it orders subsequent writes to be afterwards in the coherence order but it doesn't ensure multi-copy atomicity for readers. Now, in this case, &lock->tail is only ever accessed by other cmpxchg operations, so I think it does actually work using just the control dependency. Worst case, a concurrent osq_wait_next gets a stale value in the atomic_read, but that's not a correctness problem. Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c index d092a0c9c2d4..05a37857ab55 100644 --- a/kernel/locking/osq_lock.c +++ b/kernel/locking/osq_lock.c @@ -93,10 +93,12 @@ bool osq_lock(struct optimistic_spin_queue *lock) node->cpu = curr; /* - * ACQUIRE semantics, pairs with corresponding RELEASE - * in unlock() uncontended, or fastpath. + * We need both ACQUIRE (pairs with corresponding RELEASE in + * unlock() uncontended, or fastpath) and RELEASE (to publish + * the node fields we just initialised) semantics when updating + * the lock tail. */ - old = atomic_xchg_acquire(&lock->tail, curr); + old = atomic_xchg(&lock->tail, curr); if (old == OSQ_UNLOCKED_VAL) return true;