Message ID: 1481840946.14990.588.camel@redhat.com
State: New
On 12/15/2016 11:29 PM, Torvald Riegel wrote:

> Assume that Thread 1 waits to acquire a robust mutex using futexes to
> block (and thus sets the FUTEX_WAITERS flag), and is unblocked when this
> mutex is released.  If Thread 2 concurrently acquires the lock and is
> killed, Thread 1 can recover from the died owner but fail to restore the
> FUTEX_WAITERS flag.  This can lead to a Thread 3 that also blocked using
> futexes at the same time as Thread 1 to not get woken up because
> FUTEX_WAITERS is not set anymore.
>
> The fix for this is to ensure that we continue to preserve the
> FUTEX_WAITERS flag whenever we may have set it or shared it with another
> thread.  This is the same requirement as in the algorithm for normal
> mutexes, only that the robust mutexes need additional handling for died
> owners and thus preserving the FUTEX_WAITERS flag cannot be done just in
> the futex slowpath code.

Description and change look good to me in general.

>         [BZ #20973]
>         * nptl/pthread_mutex_lock.c (__pthread_mutex_lock_full): Fix lost
>         wake-up in robust mutexes.
>         * nptl/pthread_mutex_timedlock.c (pthread_mutex_timedlock): Likewise.
>
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index bdfa529..01ac75e 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -182,6 +182,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>                        &mutex->__data.__list.__next);
>
>        oldval = mutex->__data.__lock;
> +      /* This is set to FUTEX_WAITERS iff we might have shared the

“iff” doesn't seem to be correct here because it's not an exact
equivalence, “if” is sufficient.

> +         FUTEX_WAITERS flag with other threads, and therefore need to keep it
> +         set to avoid lost wake-ups.  We have the same requirement in the
> +         simple mutex algorithm.  */
> +      unsigned int assume_other_futex_waiters = 0;
>        do
>         {
>         again:
> @@ -190,9 +195,9 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>               /* The previous owner died.  Try locking the mutex.  */
>               int newval = id;
> #ifdef NO_INCR
> -             newval |= FUTEX_WAITERS;
> +             newval |= FUTEX_WAITERS | assume_other_futex_waiters;
> #else
> -             newval |= (oldval & FUTEX_WAITERS);
> +             newval |= (oldval & FUTEX_WAITERS) | assume_other_futex_waiters;
> #endif

The NO_INCR change is quite confusing.  Perhaps drop it and add a comment?

VL, what is the copyright status of your test case?

Thanks,
Florian
On Fri, 2016-12-16 at 15:11 +0100, Florian Weimer wrote:
> On 12/15/2016 11:29 PM, Torvald Riegel wrote:
> > diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> > index bdfa529..01ac75e 100644
> > --- a/nptl/pthread_mutex_lock.c
> > +++ b/nptl/pthread_mutex_lock.c
> > @@ -182,6 +182,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
> >                        &mutex->__data.__list.__next);
> >
> >        oldval = mutex->__data.__lock;
> > +      /* This is set to FUTEX_WAITERS iff we might have shared the
>
> “iff” doesn't seem to be correct here because it's not an exact
> equivalence, “if” is sufficient.

No, I think the iff is correct.  We do only set it if we may have shared
the flag.

> > @@ -190,9 +195,9 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
> >               /* The previous owner died.  Try locking the mutex.  */
> >               int newval = id;
> > #ifdef NO_INCR
> > -             newval |= FUTEX_WAITERS;
> > +             newval |= FUTEX_WAITERS | assume_other_futex_waiters;
> > #else
> > -             newval |= (oldval & FUTEX_WAITERS);
> > +             newval |= (oldval & FUTEX_WAITERS) | assume_other_futex_waiters;
> > #endif
>
> The NO_INCR change is quite confusing.  Perhaps drop it and add a comment?

Yes.

> VL, what is the copyright status of your test case?

So, I'll wait a little to see how the test case question is resolved, and
then commit it with the one change above, either with or without a test
case.  If anyone objects to that, please speak up.
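For context on the "simple mutex algorithm" referred to in the new comment,
here is a minimal sketch.  It is an illustration only, not glibc code: it
assumes Linux and C11 atomics, and futex_wait/futex_wake are local helper
names, not a public API.  It shows the invariant under discussion: once a
thread has advertised waiters (state 2), every later acquisition attempt by
that thread must keep writing 2, because it cannot know whether other
threads blocked in the meantime.

#include <stdatomic.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

static void
futex_wait (atomic_int *uaddr, int val)
{
  syscall (SYS_futex, uaddr, FUTEX_WAIT_PRIVATE, val, NULL, NULL, 0);
}

static void
futex_wake (atomic_int *uaddr, int n)
{
  syscall (SYS_futex, uaddr, FUTEX_WAKE_PRIVATE, n, NULL, NULL, 0);
}

/* Lock states: 0 = unlocked, 1 = locked, 2 = locked, possibly with waiters.  */
void
simple_lock (atomic_int *futex)
{
  int c = 0;
  if (atomic_compare_exchange_strong (futex, &c, 1))
    return;                        /* Uncontended fast path.  */
  /* Slow path: publish "locked, possibly waiters" (2).  From here on we
     may have shared that fact with other blocked threads, so every
     re-acquisition below must write 2 again, never plain 1.  */
  if (c != 2)
    c = atomic_exchange (futex, 2);
  while (c != 0)
    {
      futex_wait (futex, 2);
      c = atomic_exchange (futex, 2);   /* Preserve the waiters state.  */
    }
}

void
simple_unlock (atomic_int *futex)
{
  if (atomic_exchange (futex, 0) == 2)
    futex_wake (futex, 1);         /* Someone may be blocked; wake one.  */
}

The robust-mutex case discussed in this thread has the extra twist that the
owner-died recovery happens outside this inner loop, which is exactly why
assume_other_futex_waiters has to carry the "we may have advertised
waiters" state back up to the caller.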
On 12/17/2016 05:52 AM, Amitay Isaacs wrote:
> On Sat, Dec 17, 2016 at 1:44 AM, Volker Lendecke <vl@samba.org> wrote:
>
>> On Fri, Dec 16, 2016 at 03:11:12PM +0100, Florian Weimer wrote:
>>> VL, what is the copyright status of your test case?
>>
>> This is not really mine, it was written by Amitay Isaacs
>> <amitay@gmail.com>.  Amitay does a lot of Samba and ctdb coding, so I
>> would expect this to be his (C) under GPLv3+.  If you need that as a
>> regression test under something more liberal, while I can't speak for
>> him, I would expect this not to be a problem.
>
> Yes, the test has GPLv3+ license and my copyright.
>
>     Copyright (C) Amitay Isaacs  2016
>
> Is there any particular reason for asking this?

We'd like to include it as a test case in glibc, but that requires a
copyright assignment, which is usually not worth the trouble for such a
change.

Thanks,
Florian
On 12/15/2016 05:29 PM, Torvald Riegel wrote:
> On Thu, 2016-12-15 at 23:27 +0100, Torvald Riegel wrote:
>> See patch for a description.
>>
>> Tested on x86_64-linux with our tests and the test case from the
>> original bug report:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1401665
>>
>> OK?

This looks good to me.  Detailed review below.

> Now with an attached patch...
>
> robust-mutex-lwu.patch
>
> commit 74be3b28d962a5a6d2eeff51b93d61ddf91e2d49
> Author: Torvald Riegel <triegel@redhat.com>
> Date:   Thu Dec 15 16:06:28 2016 +0100
>
>     Robust mutexes: Fix lost wake-up.
>
>     Assume that Thread 1 waits to acquire a robust mutex using futexes to
>     block (and thus sets the FUTEX_WAITERS flag), and is unblocked when this
>     mutex is released.  If Thread 2 concurrently acquires the lock and is
>     killed, Thread 1 can recover from the died owner but fail to restore the
>     FUTEX_WAITERS flag.  This can lead to a Thread 3 that also blocked using
>     futexes at the same time as Thread 1 to not get woken up because
>     FUTEX_WAITERS is not set anymore.

Just to reiterate (s = sleeping in the futex):

  T0       T1       T2       T3
  |        |        |        |
  LOCK     |        |        |
  |        WAIT     |        |
  |        s        |        WAIT
  UNLOCK   s        |        s
  WAKE     AWAKE    |        s
  |        |        LOCK     s
  |        |        DIES     s
  |        LOCK              s
  |        UNLOCK            s
  |        NOWAKE            *

At the point T0 unlocks and wakes T1, we have both T1 and T2 attempting to
acquire the same lock, which has a value of 0 (no waiters, since the
shared waiter flag was cleared).  If T2 takes the lock (no FUTEX_WAITERS
set) and dies, the futex value is going to be reset to FUTEX_OWNER_DIED by
the kernel (note that the kernel preserves FUTEX_WAITERS, but it was not
set in this case because T0 unlocked).

In the case of T1, the following loop is executing:

 27 int
 28 __lll_robust_lock_wait (int *futex, int private)
 29 {
 37   do
 38     {
 39       /* If the owner died, return the present value of the futex.  */
 40       if (__glibc_unlikely (oldval & FUTEX_OWNER_DIED))
 41         return oldval;
 42
 43       /* Try to put the lock into state 'acquired, possibly with waiters'.  */
 44       int newval = oldval | FUTEX_WAITERS;
 45       if (oldval != newval
 46           && atomic_compare_and_exchange_bool_acq (futex, newval, oldval))
 47         continue;
 48
 49       /* If *futex == 2, wait until woken.  */
 50       lll_futex_wait (futex, newval, private);

We have just been woken.  T2 acquires the lock, dies, and the lock is
cleaned up.  Nobody is waiting on the lock.

 52     try:
 53       ;
 54     }
 55   while ((oldval = atomic_compare_and_exchange_val_acq (futex,
 56                                                         tid | FUTEX_WAITERS,
 57                                                         0)) != 0);

This fails because the value is not 0, it's FUTEX_OWNER_DIED.  We loop up
to the top and check FUTEX_OWNER_DIED, which matches, and then we return
to the higher-level code, which has now lost FUTEX_WAITERS.

At this point T3 has lost the wakeup it needs to continue with the lock
because T1 has "forgotten" there were waiters.

>     The fix for this is to ensure that we continue to preserve the
>     FUTEX_WAITERS flag whenever we may have set it or shared it with another
>     thread.  This is the same requirement as in the algorithm for normal
>     mutexes, only that the robust mutexes need additional handling for died
>     owners and thus preserving the FUTEX_WAITERS flag cannot be done just in
>     the futex slowpath code.

Agreed.  Just highlighting from above:

 39       /* If the owner died, return the present value of the futex.  */
 40       if (__glibc_unlikely (oldval & FUTEX_OWNER_DIED))
 41         return oldval;

Here we return the shared oldval with FUTEX_WAITERS lost.
I struggled with the possibility of a simpler fix that just changed the
returned oldval, but you correctly pointed out to me that such a fix
breaks the abstraction and algorithms that consider oldval to be the
previously read value from memory.  Therefore I agree that this needs to
be handled at a higher level, which also has to handle re-inserting
FUTEX_WAITERS as required, as is done here.

With your patch the only two lock paths we have are now covered and
correctly re-add FUTEX_WAITERS if we'd had a slowpath wait that might
potentially have lost the FUTEX_WAITERS flag due to a concurrent
lock/death sequence.  I don't immediately see any other broken paths where
FUTEX_WAITERS can be dropped and the wakeup lost.

>         [BZ #20973]
>         * nptl/pthread_mutex_lock.c (__pthread_mutex_lock_full): Fix lost
>         wake-up in robust mutexes.
>         * nptl/pthread_mutex_timedlock.c (pthread_mutex_timedlock): Likewise.
>
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index bdfa529..01ac75e 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -182,6 +182,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>                        &mutex->__data.__list.__next);
>
>        oldval = mutex->__data.__lock;
> +      /* This is set to FUTEX_WAITERS iff we might have shared the
> +         FUTEX_WAITERS flag with other threads, and therefore need to keep it
> +         set to avoid lost wake-ups.  We have the same requirement in the
> +         simple mutex algorithm.  */
> +      unsigned int assume_other_futex_waiters = 0;

OK.

>        do
>         {
>         again:
> @@ -190,9 +195,9 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>               /* The previous owner died.  Try locking the mutex.  */
>               int newval = id;
> #ifdef NO_INCR
> -             newval |= FUTEX_WAITERS;
> +             newval |= FUTEX_WAITERS | assume_other_futex_waiters;
> #else
> -             newval |= (oldval & FUTEX_WAITERS);
> +             newval |= (oldval & FUTEX_WAITERS) | assume_other_futex_waiters;

OK.

> #endif
>
>               newval
> @@ -253,7 +258,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>             }
>         }
>
> -      oldval = LLL_ROBUST_MUTEX_LOCK (mutex, id);
> +      oldval = LLL_ROBUST_MUTEX_LOCK (mutex,
> +                                      id | assume_other_futex_waiters);
> +      /* See above.  We set FUTEX_WAITERS and might have shared this flag
> +         with other threads; thus, we need to preserve it.  */
> +      assume_other_futex_waiters = FUTEX_WAITERS;

OK.

>
>        if (__builtin_expect (mutex->__data.__owner
>                              == PTHREAD_MUTEX_NOTRECOVERABLE, 0))
> diff --git a/nptl/pthread_mutex_timedlock.c b/nptl/pthread_mutex_timedlock.c
> index 07f0901..21e9eea 100644
> --- a/nptl/pthread_mutex_timedlock.c
> +++ b/nptl/pthread_mutex_timedlock.c
> @@ -142,13 +142,19 @@ pthread_mutex_timedlock (pthread_mutex_t *mutex,
>                        &mutex->__data.__list.__next);
>
>        oldval = mutex->__data.__lock;
> +      /* This is set to FUTEX_WAITERS iff we might have shared the
> +         FUTEX_WAITERS flag with other threads, and therefore need to keep it
> +         set to avoid lost wake-ups.  We have the same requirement in the
> +         simple mutex algorithm.  */
> +      unsigned int assume_other_futex_waiters = 0;

OK.

>        do
>         {
>         again:
>           if ((oldval & FUTEX_OWNER_DIED) != 0)
>             {
>               /* The previous owner died.  Try locking the mutex.  */
> -             int newval = id | (oldval & FUTEX_WAITERS);
> +             int newval = id | (oldval & FUTEX_WAITERS)
> +                 | assume_other_futex_waiters;

OK.
>
>               newval
>                 = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
> @@ -203,8 +209,12 @@ pthread_mutex_timedlock (pthread_mutex_t *mutex,
>             }
>         }
>
> -      result = lll_robust_timedlock (mutex->__data.__lock, abstime, id,
> +      result = lll_robust_timedlock (mutex->__data.__lock, abstime,
> +                                     id | assume_other_futex_waiters,
>                                      PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
> +      /* See above.  We set FUTEX_WAITERS and might have shared this flag
> +         with other threads; thus, we need to preserve it.  */
> +      assume_other_futex_waiters = FUTEX_WAITERS;

OK.

>
>        if (__builtin_expect (mutex->__data.__owner
>                              == PTHREAD_MUTEX_NOTRECOVERABLE, 0))

-- 
Cheers,
Carlos.
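Since the original test case cannot simply be imported into glibc (see the
copyright discussion elsewhere in this thread), here is an independent,
minimal sketch of the owner-died recovery path that the patch is concerned
with.  It is an illustration only, not Amitay's test: a child process dies
while holding a process-shared robust mutex, and the parent then observes
EOWNERDEAD and marks the mutex consistent.  It does not reproduce the
T1/T2/T3 race itself, which additionally needs waiters parked in the futex
slow path at just the right moment.

#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int
main (void)
{
  /* Put the mutex in shared memory so a child process can own it.  */
  pthread_mutex_t *m = mmap (NULL, sizeof *m, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (m == MAP_FAILED)
    return 1;

  pthread_mutexattr_t attr;
  pthread_mutexattr_init (&attr);
  pthread_mutexattr_setpshared (&attr, PTHREAD_PROCESS_SHARED);
  pthread_mutexattr_setrobust (&attr, PTHREAD_MUTEX_ROBUST);
  pthread_mutex_init (m, &attr);

  pid_t pid = fork ();
  if (pid == 0)
    {
      pthread_mutex_lock (m);   /* Child takes the lock ...  */
      _exit (0);                /* ... and dies without unlocking.  */
    }
  waitpid (pid, NULL, 0);

  int ret = pthread_mutex_lock (m);
  if (ret == EOWNERDEAD)
    {
      /* The kernel marked the lock as owned by a dead thread; make the
         protected state consistent before using the mutex further.  */
      pthread_mutex_consistent (m);
      puts ("recovered from dead owner");
    }
  pthread_mutex_unlock (m);
  return 0;
}

Built with gcc -pthread, the parent should print "recovered from dead
owner" rather than blocking forever on the dead child's lock.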
On 12/16/2016 11:13 PM, Torvald Riegel wrote:
> On Fri, 2016-12-16 at 15:11 +0100, Florian Weimer wrote:
>> On 12/15/2016 11:29 PM, Torvald Riegel wrote:
>>> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
>>> index bdfa529..01ac75e 100644
>>> --- a/nptl/pthread_mutex_lock.c
>>> +++ b/nptl/pthread_mutex_lock.c
>>> @@ -182,6 +182,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>>>                        &mutex->__data.__list.__next);
>>>
>>>        oldval = mutex->__data.__lock;
>>> +      /* This is set to FUTEX_WAITERS iff we might have shared the
>>
>> “iff” doesn't seem to be correct here because it's not an exact
>> equivalence, “if” is sufficient.
>
> No, I think the iff is correct.  We do only set it if we may have shared
> the flag.

Then please change it to “This is set to FUTEX_WAITERS iff we have
shared” (i.e. drop the “might”).  Based on the source code, I'm still not
sure if this is an exact equivalence.

The part which confuses me is the unconditional assignment
assume_other_futex_waiters = FUTEX_WAITERS further below.  But I think
lll_robust_lock returns 0 if we did not share FUTEX_WAITERS, and the code
never retries with the assigned assume_other_futex_waiters value,
ensuring the equivalence.  I think it would be clearer if you switched
from a do-while loop to a loop with an exit condition in the middle,
right after the call to lll_robust_lock.

Putting the FUTEX_WAITERS into the ID passed to lll_robust_lock is a
violation of its precondition documented in sysdeps/nptl/lowlevellock.h,
so please update the comment.

Thanks,
Florian
On 12/19/2016 02:47 PM, Florian Weimer wrote:
> On 12/16/2016 11:13 PM, Torvald Riegel wrote:
>> On Fri, 2016-12-16 at 15:11 +0100, Florian Weimer wrote:
>>> On 12/15/2016 11:29 PM, Torvald Riegel wrote:
>>>> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
>>>> index bdfa529..01ac75e 100644
>>>> --- a/nptl/pthread_mutex_lock.c
>>>> +++ b/nptl/pthread_mutex_lock.c
>>>> @@ -182,6 +182,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>>>>                        &mutex->__data.__list.__next);
>>>>
>>>>        oldval = mutex->__data.__lock;
>>>> +      /* This is set to FUTEX_WAITERS iff we might have shared the
>>>
>>> “iff” doesn't seem to be correct here because it's not an exact
>>> equivalence, “if” is sufficient.
>>
>> No, I think the iff is correct.  We do only set it if we may have
>> shared the flag.
>
> Then please change it to “This is set to FUTEX_WAITERS iff we have
> shared” (i.e. drop the “might”).  Based on the source code, I'm still
> not sure if this is an exact equivalence.

OK.  I agree I'm also not sure it's an exact equivalence; I'd also
probably say "This _will_ be set to..." since technically we share
FUTEX_WAITERS in the inner __lll_robust_lock_wait before we come back and
set assume_other_futex_waiters.

> The part which confuses me is the unconditional assignment
> assume_other_futex_waiters = FUTEX_WAITERS further below.

It _must_ be assigned unconditionally exactly because of the problem
outlined in this issue (and expanded in my review).  Because the value of
FUTEX_WAITERS is shared, and because of the recovery semantics of the
mutex, there is a path through __lll_robust_lock_wait which needs to
avoid the loss of FUTEX_WAITERS (and the required wakeup which happens on
unlock).  Because the unlock and wakeup by T0 clears the FUTEX_WAITERS
flag, and you don't know if other threads queued up after you, you must
assume the worst and do the wakeup also.  I don't see a way to avoid the
spurious wakeup.

> But I think lll_robust_lock returns 0 if we did not share FUTEX_WAITERS,
> and the code never retries with the assigned
> assume_other_futex_waiters value, ensuring the equivalence.  I think
> it would be clearer if you switched from a do-while loop to a loop
> with an exit condition in the middle, right after the call to
> lll_robust_lock.

Yes, this is a case where assume_other_futex_waiters is zero and we have
potentially shared FUTEX_WAITERS via the inner __lll_robust_lock_wait
code, so it would appear this example contradicts using "iff".

I don't know about restructuring this loop.  I'd have to see a concrete
example of the cleanup, and I think I'd like to avoid such a cleanup to
keep this patch minimal for now.  Future cleanup could happen in a lot of
ways.

> Putting the FUTEX_WAITERS into the ID passed to lll_robust_lock is a
> violation of its precondition documented in
> sysdeps/nptl/lowlevellock.h, so please update the comment.

The precondition is a bit loose about this, but I agree the language
could be tightened up.

-- 
Cheers,
Carlos.
On Mon, 2016-12-19 at 20:47 +0100, Florian Weimer wrote:
> On 12/16/2016 11:13 PM, Torvald Riegel wrote:
> > On Fri, 2016-12-16 at 15:11 +0100, Florian Weimer wrote:
> >> On 12/15/2016 11:29 PM, Torvald Riegel wrote:
> >>> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> >>> index bdfa529..01ac75e 100644
> >>> --- a/nptl/pthread_mutex_lock.c
> >>> +++ b/nptl/pthread_mutex_lock.c
> >>> @@ -182,6 +182,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
> >>>                        &mutex->__data.__list.__next);
> >>>
> >>>        oldval = mutex->__data.__lock;
> >>> +      /* This is set to FUTEX_WAITERS iff we might have shared the
> >>
> >> “iff” doesn't seem to be correct here because it's not an exact
> >> equivalence, “if” is sufficient.
> >
> > No, I think the iff is correct.  We do only set it if we may have shared
> > the flag.
>
> Then please change it to “This is set to FUTEX_WAITERS iff we have
> shared” (i.e. drop the “might”).  Based on the source code, I'm still
> not sure if this is an exact equivalence.

I still think the original comment is correct.  When we start, we will
not have shared, so assume_... starts as false.  After we have set F_W,
we *might* have shared, so we set assume_... to true.  You're right that
we might have just succeeded to acquire the lock, and then we might not
have shared, but then we're not using assume_... anymore.

> The part which confuses me is the unconditional assignment
> assume_other_futex_waiters = FUTEX_WAITERS further below.  But I think
> lll_robust_lock returns 0 if we did not share FUTEX_WAITERS, and the
> code never retries with the assigned assume_other_futex_waiters value,
> ensuring the equivalence.  I think it would be clearer if you switched
> from a do-while loop to a loop with an exit condition in the middle,
> right after the call to lll_robust_lock.
>
> Putting the FUTEX_WAITERS into the ID passed to lll_robust_lock is a
> violation of its precondition documented in sysdeps/nptl/lowlevellock.h,
> so please update the comment.

I've already committed the patch after what sounded like approval.  As
I've said before, I plan to follow this up with a cleanup of the whole
robust mutex code.  In that cleanup patch we can fine-tune the
documentation.
commit 74be3b28d962a5a6d2eeff51b93d61ddf91e2d49
Author: Torvald Riegel <triegel@redhat.com>
Date:   Thu Dec 15 16:06:28 2016 +0100

    Robust mutexes: Fix lost wake-up.

    Assume that Thread 1 waits to acquire a robust mutex using futexes to
    block (and thus sets the FUTEX_WAITERS flag), and is unblocked when this
    mutex is released.  If Thread 2 concurrently acquires the lock and is
    killed, Thread 1 can recover from the died owner but fail to restore the
    FUTEX_WAITERS flag.  This can lead to a Thread 3 that also blocked using
    futexes at the same time as Thread 1 to not get woken up because
    FUTEX_WAITERS is not set anymore.

    The fix for this is to ensure that we continue to preserve the
    FUTEX_WAITERS flag whenever we may have set it or shared it with another
    thread.  This is the same requirement as in the algorithm for normal
    mutexes, only that the robust mutexes need additional handling for died
    owners and thus preserving the FUTEX_WAITERS flag cannot be done just in
    the futex slowpath code.

        [BZ #20973]
        * nptl/pthread_mutex_lock.c (__pthread_mutex_lock_full): Fix lost
        wake-up in robust mutexes.
        * nptl/pthread_mutex_timedlock.c (pthread_mutex_timedlock): Likewise.

diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index bdfa529..01ac75e 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -182,6 +182,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
                       &mutex->__data.__list.__next);
 
       oldval = mutex->__data.__lock;
+      /* This is set to FUTEX_WAITERS iff we might have shared the
+         FUTEX_WAITERS flag with other threads, and therefore need to keep it
+         set to avoid lost wake-ups.  We have the same requirement in the
+         simple mutex algorithm.  */
+      unsigned int assume_other_futex_waiters = 0;
       do
        {
        again:
@@ -190,9 +195,9 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
              /* The previous owner died.  Try locking the mutex.  */
              int newval = id;
 #ifdef NO_INCR
-             newval |= FUTEX_WAITERS;
+             newval |= FUTEX_WAITERS | assume_other_futex_waiters;
 #else
-             newval |= (oldval & FUTEX_WAITERS);
+             newval |= (oldval & FUTEX_WAITERS) | assume_other_futex_waiters;
 #endif
 
              newval
@@ -253,7 +258,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
            }
        }
 
-      oldval = LLL_ROBUST_MUTEX_LOCK (mutex, id);
+      oldval = LLL_ROBUST_MUTEX_LOCK (mutex,
+                                      id | assume_other_futex_waiters);
+      /* See above.  We set FUTEX_WAITERS and might have shared this flag
+         with other threads; thus, we need to preserve it.  */
+      assume_other_futex_waiters = FUTEX_WAITERS;
 
       if (__builtin_expect (mutex->__data.__owner
                             == PTHREAD_MUTEX_NOTRECOVERABLE, 0))
diff --git a/nptl/pthread_mutex_timedlock.c b/nptl/pthread_mutex_timedlock.c
index 07f0901..21e9eea 100644
--- a/nptl/pthread_mutex_timedlock.c
+++ b/nptl/pthread_mutex_timedlock.c
@@ -142,13 +142,19 @@ pthread_mutex_timedlock (pthread_mutex_t *mutex,
                       &mutex->__data.__list.__next);
 
       oldval = mutex->__data.__lock;
+      /* This is set to FUTEX_WAITERS iff we might have shared the
+         FUTEX_WAITERS flag with other threads, and therefore need to keep it
+         set to avoid lost wake-ups.  We have the same requirement in the
+         simple mutex algorithm.  */
+      unsigned int assume_other_futex_waiters = 0;
       do
        {
        again:
          if ((oldval & FUTEX_OWNER_DIED) != 0)
            {
              /* The previous owner died.  Try locking the mutex.  */
-             int newval = id | (oldval & FUTEX_WAITERS);
+             int newval = id | (oldval & FUTEX_WAITERS)
+                 | assume_other_futex_waiters;
 
              newval
                = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
@@ -203,8 +209,12 @@ pthread_mutex_timedlock (pthread_mutex_t *mutex,
            }
        }
 
-      result = lll_robust_timedlock (mutex->__data.__lock, abstime, id,
+      result = lll_robust_timedlock (mutex->__data.__lock, abstime,
+                                     id | assume_other_futex_waiters,
                                      PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
+      /* See above.  We set FUTEX_WAITERS and might have shared this flag
+         with other threads; thus, we need to preserve it.  */
+      assume_other_futex_waiters = FUTEX_WAITERS;
 
       if (__builtin_expect (mutex->__data.__owner
                             == PTHREAD_MUTEX_NOTRECOVERABLE, 0))