Message ID | 20210415142552.30916-1-alisaidi@amazon.com |
---|---|
State | Superseded |
Headers | show |
Series | locking/qrwlock: Fix ordering in queued_write_lock_slowpath | expand |
On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote: > While this code is executed with the wait_lock held, a reader can > acquire the lock without holding wait_lock. The writer side loops > checking the value with the atomic_cond_read_acquire(), but only truly > acquires the lock when the compare-and-exchange is completed > successfully which isn’t ordered. The other atomic operations from this > point are release-ordered and thus reads after the lock acquisition can > be completed before the lock is truly acquired which violates the > guarantees the lock should be making. I think it would be worth spelling this out with an example. The issue appears to be a concurrent reader in interrupt context taking and releasing the lock in the window where the writer has returned from the atomic_cond_read_acquire() but has not yet performed the cmpxchg(). Loads can be speculated during this time, but the A-B-A of the lock word from _QW_WAITING to (_QW_WAITING | _QR_BIAS) and back to _QW_WAITING allows the atomic_cmpxchg_relaxed() to succeed. Is that right? With that in mind, it would probably be a good idea to eyeball the qspinlock slowpath as well, as that uses both atomic_cond_read_acquire() and atomic_try_cmpxchg_relaxed(). > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc") Typo in the quoted subject ('qrwloc'). > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > Cc: stable@vger.kernel.org > --- > kernel/locking/qrwlock.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c > index 4786dd271b45..10770f6ac4d9 100644 > --- a/kernel/locking/qrwlock.c > +++ b/kernel/locking/qrwlock.c > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) > > /* When no more readers or writers, set the locked flag */ > do { > - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, > + atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING); > + } while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING, > _QW_LOCKED) != _QW_WAITING); Patch looks good, so with an updated message: Acked-by: Will Deacon <will@kernel.org> Will
On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote: > While this code is executed with the wait_lock held, a reader can > acquire the lock without holding wait_lock. The writer side loops > checking the value with the atomic_cond_read_acquire(), but only truly > acquires the lock when the compare-and-exchange is completed > successfully which isn’t ordered. The other atomic operations from this > point are release-ordered and thus reads after the lock acquisition can > be completed before the lock is truly acquired which violates the > guarantees the lock should be making. Should be per who? We explicitly do not order the lock acquire store. qspinlock has the exact same issue. If you look in the git history surrounding spin_is_locked(), you'll find 'lovely' things. Basically, if you're doing crazy things with spin_is_locked() you get to keep the pieces. > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc") > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > Cc: stable@vger.kernel.org > --- > kernel/locking/qrwlock.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c > index 4786dd271b45..10770f6ac4d9 100644 > --- a/kernel/locking/qrwlock.c > +++ b/kernel/locking/qrwlock.c > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) > > /* When no more readers or writers, set the locked flag */ > do { > - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, > + atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING); > + } while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING, > _QW_LOCKED) != _QW_WAITING); > unlock: > arch_spin_unlock(&lock->wait_lock); This doesn't make sense, there is no such thing as a store-acquire. What you're doing here is moving the acquire from one load to the next. A load we know will load the exact same value. Also see Documentation/atomic_t.txt: {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE If anything this code wants to be written like so. --- diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c index 4786dd271b45..22aeccc363ca 100644 --- a/kernel/locking/qrwlock.c +++ b/kernel/locking/qrwlock.c @@ -60,6 +60,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath); */ void queued_write_lock_slowpath(struct qrwlock *lock) { + u32 cnt; + /* Put the writer into the wait queue */ arch_spin_lock(&lock->wait_lock); @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) /* When no more readers or writers, set the locked flag */ do { - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, - _QW_LOCKED) != _QW_WAITING); + cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); + } while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED)); unlock: arch_spin_unlock(&lock->wait_lock); }
On 4/15/21 10:25 AM, Ali Saidi wrote: > While this code is executed with the wait_lock held, a reader can > acquire the lock without holding wait_lock. The writer side loops > checking the value with the atomic_cond_read_acquire(), but only truly > acquires the lock when the compare-and-exchange is completed > successfully which isn’t ordered. The other atomic operations from this > point are release-ordered and thus reads after the lock acquisition can > be completed before the lock is truly acquired which violates the > guarantees the lock should be making. > > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc") > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > Cc: stable@vger.kernel.org > --- > kernel/locking/qrwlock.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c > index 4786dd271b45..10770f6ac4d9 100644 > --- a/kernel/locking/qrwlock.c > +++ b/kernel/locking/qrwlock.c > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) > > /* When no more readers or writers, set the locked flag */ > do { > - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, > + atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING); > + } while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING, > _QW_LOCKED) != _QW_WAITING); > unlock: > arch_spin_unlock(&lock->wait_lock); I think the original code isn't wrong. The read_acquire provides the acquire barrier for cmpxchg. Because of conditional dependency, the wait_lock unlock won't happen until the cmpxchg succeeds. Without doing a read_acquire, there may be a higher likelihood that the cmpxchg may fail. Anyway, I will let Will or Peter chime in on this as I am not as proficient as them on this topic. Cheers, Longman
On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote: > On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote: > > While this code is executed with the wait_lock held, a reader can > > acquire the lock without holding wait_lock. The writer side loops > > checking the value with the atomic_cond_read_acquire(), but only truly > > acquires the lock when the compare-and-exchange is completed > > successfully which isn’t ordered. The other atomic operations from this > > point are release-ordered and thus reads after the lock acquisition can > > be completed before the lock is truly acquired which violates the > > guarantees the lock should be making. > > Should be per who? We explicitly do not order the lock acquire store. > qspinlock has the exact same issue. > > If you look in the git history surrounding spin_is_locked(), you'll find > 'lovely' things. > > Basically, if you're doing crazy things with spin_is_locked() you get to > keep the pieces. > > > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc") > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > > Cc: stable@vger.kernel.org > > --- > > kernel/locking/qrwlock.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c > > index 4786dd271b45..10770f6ac4d9 100644 > > --- a/kernel/locking/qrwlock.c > > +++ b/kernel/locking/qrwlock.c > > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) > > > > /* When no more readers or writers, set the locked flag */ > > do { > > - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > > - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, > > + atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING); > > + } while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING, > > _QW_LOCKED) != _QW_WAITING); > > unlock: > > arch_spin_unlock(&lock->wait_lock); > > This doesn't make sense, there is no such thing as a store-acquire. What > you're doing here is moving the acquire from one load to the next. A > load we know will load the exact same value. > > Also see Documentation/atomic_t.txt: > > {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE > > > If anything this code wants to be written like so. > > --- > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c > index 4786dd271b45..22aeccc363ca 100644 > --- a/kernel/locking/qrwlock.c > +++ b/kernel/locking/qrwlock.c > @@ -60,6 +60,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath); > */ > void queued_write_lock_slowpath(struct qrwlock *lock) > { > + u32 cnt; > + > /* Put the writer into the wait queue */ > arch_spin_lock(&lock->wait_lock); > > @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) > > /* When no more readers or writers, set the locked flag */ > do { > - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, > - _QW_LOCKED) != _QW_WAITING); > + cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); I think the issue is that >here< a concurrent reader in interrupt context can take the lock and release it again, but we could speculate reads from the critical section up over the later release and up before the control dependency here... > + } while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED)); ... and then this cmpxchg() will succeed, so our speculated stale reads could be used. *HOWEVER* Speculating a read should be fine in the face of a concurrent _reader_, so for this to be an issue it implies that the reader is also doing some (atomic?) updates. Ali? Will
On Thu, Apr 15, 2021 at 04:28:21PM +0100, Will Deacon wrote: > On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote: > > On Thu, Apr 15, 2021 at 02:25:52PM +0000, Ali Saidi wrote: > > > While this code is executed with the wait_lock held, a reader can > > > acquire the lock without holding wait_lock. The writer side loops > > > checking the value with the atomic_cond_read_acquire(), but only truly > > > acquires the lock when the compare-and-exchange is completed > > > successfully which isn’t ordered. The other atomic operations from this > > > point are release-ordered and thus reads after the lock acquisition can > > > be completed before the lock is truly acquired which violates the > > > guarantees the lock should be making. [...] > > > Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc") > > > Signed-off-by: Ali Saidi <alisaidi@amazon.com> > > > Cc: stable@vger.kernel.org > > > --- > > > kernel/locking/qrwlock.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c > > > index 4786dd271b45..10770f6ac4d9 100644 > > > --- a/kernel/locking/qrwlock.c > > > +++ b/kernel/locking/qrwlock.c > > > @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) > > > > > > /* When no more readers or writers, set the locked flag */ > > > do { > > > - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > > > - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, > > > + atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING); > > > + } while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING, > > > _QW_LOCKED) != _QW_WAITING); > > > unlock: > > > arch_spin_unlock(&lock->wait_lock); > > > > This doesn't make sense, there is no such thing as a store-acquire. What > > you're doing here is moving the acquire from one load to the next. A > > load we know will load the exact same value. > > > > Also see Documentation/atomic_t.txt: > > > > {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE > > > > > > If anything this code wants to be written like so. > > > > --- > > > > diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c > > index 4786dd271b45..22aeccc363ca 100644 > > --- a/kernel/locking/qrwlock.c > > +++ b/kernel/locking/qrwlock.c > > @@ -60,6 +60,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath); > > */ > > void queued_write_lock_slowpath(struct qrwlock *lock) > > { > > + u32 cnt; > > + > > /* Put the writer into the wait queue */ > > arch_spin_lock(&lock->wait_lock); > > > > @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) > > > > /* When no more readers or writers, set the locked flag */ > > do { > > - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > > - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, > > - _QW_LOCKED) != _QW_WAITING); > > + cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > > I think the issue is that >here< a concurrent reader in interrupt context > can take the lock and release it again, but we could speculate reads from > the critical section up over the later release and up before the control > dependency here... > > > + } while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED)); > > ... and then this cmpxchg() will succeed, so our speculated stale reads > could be used. > > *HOWEVER* > > Speculating a read should be fine in the face of a concurrent _reader_, > so for this to be an issue it implies that the reader is also doing some > (atomic?) updates. There's at least one such case: see chain_epi_lockless() updating epi->next, called from ep_poll_callback() with a read_lock held. This races with ep_done_scan() which has the write_lock held. I think the authors of the above code interpreted the read_lock as something that multiple threads can own disregarding the _read_ part.
On Thu, Apr 15, 2021 at 04:28:21PM +0100, Will Deacon wrote: > On Thu, Apr 15, 2021 at 05:03:58PM +0200, Peter Zijlstra wrote: > > @@ -73,9 +75,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) > > > > /* When no more readers or writers, set the locked flag */ > > do { > > - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > > - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, > > - _QW_LOCKED) != _QW_WAITING); > > + cnt = atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); > > I think the issue is that >here< a concurrent reader in interrupt context > can take the lock and release it again, but we could speculate reads from > the critical section up over the later release and up before the control > dependency here... OK, so that was totally not clear from the original changelog. > > + } while (!atomic_try_cmpxchg_relaxed(&lock->cnts, &cnt, _QW_LOCKED)); > > ... and then this cmpxchg() will succeed, so our speculated stale reads > could be used. > > *HOWEVER* > > Speculating a read should be fine in the face of a concurrent _reader_, > so for this to be an issue it implies that the reader is also doing some > (atomic?) updates. So we're having something like: CPU0 CPU1 queue_write_lock_slowpath() atomic_cond_read_acquire() == _QW_WAITING queued_read_lock_slowpath() atomic_cond_read_acquire() return; ,--> (before X's store) | X = y; | | queued_read_unlock() | (void)atomic_sub_return_release() | | atomic_cmpxchg_relaxed(.old = _QW_WAITING) | `-- r = X; Which as Will said is a cmpxchg ABA, however for it to be ABA, we have to observe that unlock-store, and won't we then also have to observe the whole critical section? Ah, the issue is yet another load inside our own (CPU0)'s critical section. which isn't ordered against the cmpxchg_relaxed() and can be issued before. So yes, then making the cmpxchg an acquire, will force all our own loads to happen later.
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c index 4786dd271b45..10770f6ac4d9 100644 --- a/kernel/locking/qrwlock.c +++ b/kernel/locking/qrwlock.c @@ -73,8 +73,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock) /* When no more readers or writers, set the locked flag */ do { - atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING); - } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, + atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING); + } while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING, _QW_LOCKED) != _QW_WAITING); unlock: arch_spin_unlock(&lock->wait_lock);
While this code is executed with the wait_lock held, a reader can acquire the lock without holding wait_lock. The writer side loops checking the value with the atomic_cond_read_acquire(), but only truly acquires the lock when the compare-and-exchange is completed successfully which isn’t ordered. The other atomic operations from this point are release-ordered and thus reads after the lock acquisition can be completed before the lock is truly acquired which violates the guarantees the lock should be making. Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwloc") Signed-off-by: Ali Saidi <alisaidi@amazon.com> Cc: stable@vger.kernel.org --- kernel/locking/qrwlock.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)