[RFC] locking/mutexes: don't spin on owner when wait list is not NULL.

Message ID 56A08061.2080708@huawei.com
State Superseded
Headers show

Commit Message

Ding Tianhong Jan. 21, 2016, 6:53 a.m. UTC
I built a script that creates several processes calling an ioctl in a loop;
the ioctl calls a kernel function along these lines:
xx_ioctl {
...
rtnl_lock();
function();
rtnl_unlock();
...
}
The function may sleep for several ms but will not halt. At the same time,
another user service may call ifconfig to change the state of an ethernet
device, and after several hours the hung task thread reported this problem:

-- 
2.5.0

Comments

Ding Tianhong Jan. 21, 2016, 9:04 a.m. UTC | #1
On 2016/1/21 15:29, Ingo Molnar wrote:
> 
> Cc:-ed other gents who touched the mutex code recently. Mail quoted below.
> 

Ok, thanks.

Ding

> Thanks,
> 
> 	Ingo
> 
> * Ding Tianhong <dingtianhong@huawei.com> wrote:
> 
>> I build a script to create several process for ioctl loop calling,
>> the ioctl will calling the kernel function just like:
>> xx_ioctl {
>> ...
>> rtnl_lock();
>> function();
>> rtnl_unlock();
>> ...
>> }
>> The function may sleep several ms, but will not halt, at the same time
>> another user service may calling ifconfig to change the state of the
>> ethernet, and after several hours, the hung task thread report this problem:
>>
>> ========================================================================
>> [149738.039038] INFO: task ifconfig:11890 blocked for more than 120 seconds.
>> [149738.040597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [149738.042280] ifconfig D ffff88061ec13680 0 11890 11573 0x00000080
>> [149738.042284] ffff88052449bd40 0000000000000082 ffff88053a33f300 ffff88052449bfd8
>> [149738.042286] ffff88052449bfd8 ffff88052449bfd8 ffff88053a33f300 ffffffff819e6240
>> [149738.042288] ffffffff819e6244 ffff88053a33f300 00000000ffffffff ffffffff819e6248
>> [149738.042290] Call Trace:
>> [149738.042300] [<ffffffff8160d219>] schedule_preempt_disabled+0x29/0x70
>> [149738.042303] [<ffffffff8160af65>] __mutex_lock_slowpath+0xc5/0x1c0
>> [149738.042305] [<ffffffff8160a3cf>] mutex_lock+0x1f/0x2f
>> [149738.042309] [<ffffffff8150d945>] rtnl_lock+0x15/0x20
>> [149738.042311] [<ffffffff81514e3a>] dev_ioctl+0xda/0x590
>> [149738.042314] [<ffffffff816121cc>] ? __do_page_fault+0x21c/0x560
>> [149738.042318] [<ffffffff814e42c5>] sock_do_ioctl+0x45/0x50
>> [149738.042320] [<ffffffff814e49d0>] sock_ioctl+0x1f0/0x2c0
>> [149738.042324] [<ffffffff811dc9b5>] do_vfs_ioctl+0x2e5/0x4c0
>> [149738.042327] [<ffffffff811e6a00>] ? fget_light+0xa0/0xd0
>>
>> ================================ cut here ================================
>>
>> I got the vmcore and found that the ifconfig is already in the wait_list of the
>> rtnl_lock for 120 second, but my process could get and release the rtnl_lock
>> normally several times in one second, so it means that my process jump the
>> queue and the ifconfig couldn't get the rtnl all the time, I check the mutex lock
>> slow path and found that the mutex may spin on owner ignore whether the wait list
>> is empty, it will cause the task in the wait list always be cut in line, so add
>> test for wait list in the mutex_can_spin_on_owner and avoid this problem.
>>
>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>> ---
>>  kernel/locking/mutex.c | 11 ++++++-----
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>> index 0551c21..596b341 100644
>> --- a/kernel/locking/mutex.c
>> +++ b/kernel/locking/mutex.c
>> @@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
>>  	struct task_struct *owner;
>>  	int retval = 1;
>>  
>> -	if (need_resched())
>> +	if (need_resched() || atomic_read(&lock->count) == -1)
>>  		return 0;
>>  
>>  	rcu_read_lock();
>> @@ -283,10 +283,11 @@ static inline bool mutex_try_to_acquire(struct mutex *lock)
>>  /*
>>   * Optimistic spinning.
>>   *
>> - * We try to spin for acquisition when we find that the lock owner
>> - * is currently running on a (different) CPU and while we don't
>> - * need to reschedule. The rationale is that if the lock owner is
>> - * running, it is likely to release the lock soon.
>> + * We try to spin for acquisition when we find that there are no
>> + * pending waiters and the lock owner is currently running on a
>> + * (different) CPU and while we don't need to reschedule. The
>> + * rationale is that if the lock owner is running, it is likely
>> + * to release the lock soon.
>>   *
>>   * Since this needs the lock owner, and this mutex implementation
>>   * doesn't track the owner atomically in the lock field, we need to
>> -- 
>> 2.5.0
>>
>>
> 
> .
>

Patch

========================================================================
[149738.039038] INFO: task ifconfig:11890 blocked for more than 120 seconds.
[149738.040597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[149738.042280] ifconfig D ffff88061ec13680 0 11890 11573 0x00000080
[149738.042284] ffff88052449bd40 0000000000000082 ffff88053a33f300 ffff88052449bfd8
[149738.042286] ffff88052449bfd8 ffff88052449bfd8 ffff88053a33f300 ffffffff819e6240
[149738.042288] ffffffff819e6244 ffff88053a33f300 00000000ffffffff ffffffff819e6248
[149738.042290] Call Trace:
[149738.042300] [<ffffffff8160d219>] schedule_preempt_disabled+0x29/0x70
[149738.042303] [<ffffffff8160af65>] __mutex_lock_slowpath+0xc5/0x1c0
[149738.042305] [<ffffffff8160a3cf>] mutex_lock+0x1f/0x2f
[149738.042309] [<ffffffff8150d945>] rtnl_lock+0x15/0x20
[149738.042311] [<ffffffff81514e3a>] dev_ioctl+0xda/0x590
[149738.042314] [<ffffffff816121cc>] ? __do_page_fault+0x21c/0x560
[149738.042318] [<ffffffff814e42c5>] sock_do_ioctl+0x45/0x50
[149738.042320] [<ffffffff814e49d0>] sock_ioctl+0x1f0/0x2c0
[149738.042324] [<ffffffff811dc9b5>] do_vfs_ioctl+0x2e5/0x4c0
[149738.042327] [<ffffffff811e6a00>] ? fget_light+0xa0/0xd0

================================ cut here ================================

I got the vmcore and found that ifconfig had already been on the wait_list of
the rtnl_lock for 120 seconds, while my processes could still acquire and
release the rtnl_lock normally several times per second. This means my
processes kept jumping the queue and ifconfig could never get the rtnl lock.
I checked the mutex lock slow path and found that the mutex may spin on the
owner regardless of whether the wait list is empty, which lets tasks on the
wait list be cut in line indefinitely, so add a test for pending waiters in
mutex_can_spin_on_owner to avoid this problem.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 kernel/locking/mutex.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 0551c21..596b341 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
 	struct task_struct *owner;
 	int retval = 1;
 
-	if (need_resched())
+	if (need_resched() || atomic_read(&lock->count) == -1)
 		return 0;
 
 	rcu_read_lock();
@@ -283,10 +283,11 @@ static inline bool mutex_try_to_acquire(struct mutex *lock)
 /*
  * Optimistic spinning.
  *
- * We try to spin for acquisition when we find that the lock owner
- * is currently running on a (different) CPU and while we don't
- * need to reschedule. The rationale is that if the lock owner is
- * running, it is likely to release the lock soon.
+ * We try to spin for acquisition when we find that there are no
+ * pending waiters and the lock owner is currently running on a
+ * (different) CPU and while we don't need to reschedule. The
+ * rationale is that if the lock owner is running, it is likely
+ * to release the lock soon.
  *
  * Since this needs the lock owner, and this mutex implementation
  * doesn't track the owner atomically in the lock field, we need to