Message ID | fb5c8294-2a10-4bf5-8f10-3d2b77d2757e@gmail.com |
---|---|
State | New |
Headers | show |
Series | [v2] leds: trigger: netdev: fix RTNL handling to prevent potential deadlock | expand |
On Fri, Dec 01, 2023 at 11:23:22AM +0100, Heiner Kallweit wrote: > When working on LED support for r8169 I got the following lockdep > warning. Easiest way to prevent this scenario seems to be to take > the RTNL lock before the trigger_data lock in set_device_name(). > > Fixes: d5e01266e7f5 ("leds: trigger: netdev: add additional specific link speed mode") > Cc: stable@vger.kernel.org > Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Andrew
On Fri, 1 Dec 2023 11:23:22 +0100 Heiner Kallweit wrote: > When working on LED support for r8169 I got the following lockdep > warning. Easiest way to prevent this scenario seems to be to take > the RTNL lock before the trigger_data lock in set_device_name(). Pavel/Lee, would you like to handle this patch? Or should we ship it to Linus on Thu?
On Mon, 4 Dec 2023 14:48:08 -0800 Jakub Kicinski wrote: > On Fri, 1 Dec 2023 11:23:22 +0100 Heiner Kallweit wrote: > > When working on LED support for r8169 I got the following lockdep > > warning. Easiest way to prevent this scenario seems to be to take > > the RTNL lock before the trigger_data lock in set_device_name(). > > Pavel/Lee, would you like to handle this patch? > Or should we ship it to Linus on Thu? Better safe than step on someone's toes, so I'll drop this version from the netdev PW.
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Fri, 1 Dec 2023 11:23:22 +0100 you wrote: > When working on LED support for r8169 I got the following lockdep > warning. Easiest way to prevent this scenario seems to be to take > the RTNL lock before the trigger_data lock in set_device_name(). > > ====================================================== > WARNING: possible circular locking dependency detected > 6.7.0-rc2-next-20231124+ #2 Not tainted > > [...] Here is the summary with links: - [v2] leds: trigger: netdev: fix RTNL handling to prevent potential deadlock https://git.kernel.org/netdev/net/c/fe2b1226656a You are awesome, thank you!
diff --git a/drivers/leds/trigger/ledtrig-netdev.c b/drivers/leds/trigger/ledtrig-netdev.c index e358e77e4..4dbcb2573 100644 --- a/drivers/leds/trigger/ledtrig-netdev.c +++ b/drivers/leds/trigger/ledtrig-netdev.c @@ -226,6 +226,11 @@ static int set_device_name(struct led_netdev_data *trigger_data, cancel_delayed_work_sync(&trigger_data->work); + /* + * Take RTNL lock before trigger_data lock to prevent potential + * deadlock with netdev notifier registration. + */ + rtnl_lock(); mutex_lock(&trigger_data->lock); if (trigger_data->net_dev) { @@ -245,16 +250,14 @@ static int set_device_name(struct led_netdev_data *trigger_data, trigger_data->carrier_link_up = false; trigger_data->link_speed = SPEED_UNKNOWN; trigger_data->duplex = DUPLEX_UNKNOWN; - if (trigger_data->net_dev != NULL) { - rtnl_lock(); + if (trigger_data->net_dev) get_device_state(trigger_data); - rtnl_unlock(); - } trigger_data->last_activity = 0; set_baseline_state(trigger_data); mutex_unlock(&trigger_data->lock); + rtnl_unlock(); return 0; }
When working on LED support for r8169 I got the following lockdep warning. Easiest way to prevent this scenario seems to be to take the RTNL lock before the trigger_data lock in set_device_name(). ====================================================== WARNING: possible circular locking dependency detected 6.7.0-rc2-next-20231124+ #2 Not tainted ------------------------------------------------------ bash/383 is trying to acquire lock: ffff888103aa1c68 (&trigger_data->lock){+.+.}-{3:3}, at: netdev_trig_notify+0xec/0x190 [ledtrig_netdev] but task is already holding lock: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (rtnl_mutex){+.+.}-{3:3}: __mutex_lock+0x9b/0xb50 mutex_lock_nested+0x16/0x20 rtnl_lock+0x12/0x20 set_device_name+0xa9/0x120 [ledtrig_netdev] netdev_trig_activate+0x1a1/0x230 [ledtrig_netdev] led_trigger_set+0x172/0x2c0 led_trigger_write+0xf1/0x140 sysfs_kf_bin_write+0x5d/0x80 kernfs_fop_write_iter+0x15d/0x210 vfs_write+0x1f0/0x510 ksys_write+0x6c/0xf0 __x64_sys_write+0x14/0x20 do_syscall_64+0x3f/0xf0 entry_SYSCALL_64_after_hwframe+0x6c/0x74 -> #0 (&trigger_data->lock){+.+.}-{3:3}: __lock_acquire+0x1459/0x25a0 lock_acquire+0xc8/0x2d0 __mutex_lock+0x9b/0xb50 mutex_lock_nested+0x16/0x20 netdev_trig_notify+0xec/0x190 [ledtrig_netdev] call_netdevice_register_net_notifiers+0x5a/0x100 register_netdevice_notifier+0x85/0x120 netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev] led_trigger_set+0x172/0x2c0 led_trigger_write+0xf1/0x140 sysfs_kf_bin_write+0x5d/0x80 kernfs_fop_write_iter+0x15d/0x210 vfs_write+0x1f0/0x510 ksys_write+0x6c/0xf0 __x64_sys_write+0x14/0x20 do_syscall_64+0x3f/0xf0 entry_SYSCALL_64_after_hwframe+0x6c/0x74 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock(&trigger_data->lock); lock(rtnl_mutex); lock(&trigger_data->lock); *** DEADLOCK *** 8 locks held by bash/383: #0: ffff888103ff33f0 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0 #1: ffff888103aa1e88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x114/0x210 #2: ffff8881036f1890 (kn->active#82){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x11d/0x210 #3: ffff888108e2c358 (&led_cdev->led_access){+.+.}-{3:3}, at: led_trigger_write+0x30/0x140 #4: ffffffff8cdd9e10 (triggers_list_lock){++++}-{3:3}, at: led_trigger_write+0x75/0x140 #5: ffff888108e2c270 (&led_cdev->trigger_lock){++++}-{3:3}, at: led_trigger_write+0xe3/0x140 #6: ffffffff8cdde3d0 (pernet_ops_rwsem){++++}-{3:3}, at: register_netdevice_notifier+0x1c/0x120 #7: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20 stack backtrace: CPU: 0 PID: 383 Comm: bash Not tainted 6.7.0-rc2-next-20231124+ #2 Hardware name: Default string Default string/Default string, BIOS ADLN.M6.SODIMM.ZB.CY.015 08/08/2023 Call Trace: <TASK> dump_stack_lvl+0x5c/0xd0 dump_stack+0x10/0x20 print_circular_bug+0x2dd/0x410 check_noncircular+0x131/0x150 __lock_acquire+0x1459/0x25a0 lock_acquire+0xc8/0x2d0 ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] __mutex_lock+0x9b/0xb50 ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] ? __this_cpu_preempt_check+0x13/0x20 ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] ? __cancel_work_timer+0x11c/0x1b0 ? __mutex_lock+0x123/0xb50 mutex_lock_nested+0x16/0x20 ? mutex_lock_nested+0x16/0x20 netdev_trig_notify+0xec/0x190 [ledtrig_netdev] call_netdevice_register_net_notifiers+0x5a/0x100 register_netdevice_notifier+0x85/0x120 netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev] led_trigger_set+0x172/0x2c0 ? preempt_count_add+0x49/0xc0 led_trigger_write+0xf1/0x140 sysfs_kf_bin_write+0x5d/0x80 kernfs_fop_write_iter+0x15d/0x210 vfs_write+0x1f0/0x510 ksys_write+0x6c/0xf0 __x64_sys_write+0x14/0x20 do_syscall_64+0x3f/0xf0 entry_SYSCALL_64_after_hwframe+0x6c/0x74 RIP: 0033:0x7f269055d034 Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48 RSP: 002b:00007ffddb7ef748 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f269055d034 RDX: 0000000000000007 RSI: 000055bf5f4af3c0 RDI: 0000000000000001 RBP: 000055bf5f4af3c0 R08: 0000000000000073 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000007 R13: 00007f26906325c0 R14: 00007f269062ff20 R15: 0000000000000000 </TASK> Fixes: d5e01266e7f5 ("leds: trigger: netdev: add additional specific link speed mode") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> --- v2: - fix commit hash in Fixes tag --- drivers/leds/trigger/ledtrig-netdev.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)