Message ID | 20230125-hid-unregister-leds-v3-0-0a52ac225e00@diag.uniroma1.it |
---|---|
Headers | show |
Series | HID: use spinlocks to safely schedule led workers | expand |
On Fri, Feb 10, 2023 at 2:24 PM Hillf Danton <hdanton@sina.com> wrote: > > On Thu, 09 Feb 2023 23:58:55 +0000 Pietro Borrello <borrello@diag.uniroma1.it> > > Use spinlocks to deal with workers introducing a wrapper > > bigben_schedule_work(), and several spinlock checks. > > Otherwise, bigben_set_led() may schedule bigben->worker after the > > structure has been freed, causing a use-after-free. > > > > Fixes: 4eb1b01de5b9 ("HID: hid-bigbenff: fix race condition for scheduled work during removal") > > Given the flag added in 4eb1b01de5b9 and the spinlock added in this > patchset, devm_led_classdev_register() looks to not work for you. Actually, looking at the code now, it is clear that we need that lock. The current code is happily changing the struct bigben_device from multiple contexts, and pulls that without any barrier in the work struct which should produce some interesting results :) And we can probably abuse that lock to prevent scheduling a new work as it is done in hid-playstation.c I'll comment in the patch which parts need to be changed, because it is true that this patch is definitely not mergeable as such and will need another revision. > > How about replacing the advanced devm_ method with the traditional plain > pair of led_classdev_un/register(), with the flag mentioned cut off but > without bothering to add another lock? > As mentioned above, the lock is needed anyway, and will probably need to be added in a separate patch. Reverting to a non devm version of the led class would complexify the driver for the error paths, and is probably not the best move IMO. Cheers, Benjamin
On Sat, Feb 11, 2023 at 3:01 AM Hillf Danton <hdanton@sina.com> wrote: > > On Fri, 10 Feb 2023 15:11:26 +0100 Benjamin Tissoires <benjamin.tissoires@redhat.com> > > On Fri, Feb 10, 2023 at 2:24 PM Hillf Danton <hdanton@sina.com> wrote: > > > On Thu, 09 Feb 2023 23:58:55 +0000 Pietro Borrello <borrello@diag.uniroma1.it> > > > > Use spinlocks to deal with workers introducing a wrapper > > > > bigben_schedule_work(), and several spinlock checks. > > > > Otherwise, bigben_set_led() may schedule bigben->worker after the > > > > structure has been freed, causing a use-after-free. > > > > > > > > Fixes: 4eb1b01de5b9 ("HID: hid-bigbenff: fix race condition for scheduled work during removal") > > > > > > Given the flag added in 4eb1b01de5b9 and the spinlock added in this > > > patchset, devm_led_classdev_register() looks to not work for you. > > > > Actually, looking at the code now, it is clear that we need that lock. > > The current code is happily changing the struct bigben_device from > > multiple contexts, and pulls that without any barrier in the work > > struct which should produce some interesting results :) > > > > And we can probably abuse that lock to prevent scheduling a new work > > as it is done in hid-playstation.c > > > > I'll comment in the patch which parts need to be changed, because it > > is true that this patch is definitely not mergeable as such and will > > need another revision. > > > > > > > > How about replacing the advanced devm_ method with the traditional plain > > > pair of led_classdev_un/register(), with the flag mentioned cut off but > > > without bothering to add another lock? > > > > > > > As mentioned above, the lock is needed anyway, and will probably need > > to be added in a separate patch. > > Reverting to a non devm version of the led class would complexify the > > driver for the error paths, and is probably not the best move IMO. > > After this patch, > > cpu 0 cpu 2 > --- --- > bigben_remove() > spin_lock_irqsave(&bigben->lock, flags); > bigben->removed = true; > spin_unlock_irqrestore(&bigben->lock, flags); > > spin_lock_irqsave(&bigben->lock, flags); > > what makes it safe for cpu2 to acquire lock after the removed flag is true? The remove flag is just a way to prevent any other workqueue from starting. It doesn't mean that the struct bigben has been freed, so acquiring a lock at that point is fine. We then rely on 2 things: - devm_class_led to be released before struct bigben, because it was devm-allocated *after* the struct bigben was devm-allocated - we prevent any new workqueue to start and we guarantee that any running workqueue is terminated before leaving the .remove function. Given that the ledclass is gracefully shutting down all of its potential queues, we don't have any other possibility to have an unsafe call AFAIU. Cheers, Benjamin
On Mon, Feb 13, 2023 at 11:37 AM Hillf Danton <hdanton@sina.com> wrote: > > On Mon, 13 Feb 2023 09:25:37 +0100 Benjamin Tissoires <benjamin.tissoires@redhat.com> > > > > The remove flag is just a way to prevent any other workqueue from > > starting. It doesn't mean that the struct bigben has been freed, so > > acquiring a lock at that point is fine. > > We then rely on 2 things: > > - devm_class_led to be released before struct bigben, because it was > > devm-allocated *after* the struct bigben was devm-allocated > > This is the current behavior and it is intact after this patch. > > > - we prevent any new workqueue to start and we guarantee that any > > running workqueue is terminated before leaving the .remove function. > > If spinlock is added for not scheduling new workqueue then it is not > needed, because the removed flag is set before running workqueue is > terminated. Checking the flag is enough upon queuing new work. > I tend to disagree (based on Pietro's v4: - no worker is running - a led sysfs call is made - the line "if (!bigben->removed)" is true - this gets interrupted/or another CPU kicks in for the next one -> .remove gets called - bigben->removed is set to false - cancel_work_sync() called the led call continues, and schedules the work .removes terminates, and devm kicks in, killing led_class and struct bigben while the workqueue is running. So having a simple spinlocks ensures the atomicity between checking for bigben->removed and scheduling a workqueue. Cheers, Benjamin
I noticed a recurring pattern is present in multiple hid devices in the Linux tree, where the LED controller of a device schedules a work_struct to interact with the hardware. The work_struct is embedded in the device structure and thus, is freed at device removal. The issue is that a LED worker may be scheduled by a timer concurrently with device removal, causing the work_struct to be accessed after having been freed. I was able to trigger the issue in hid-bigbenff.c and hid-asus.c where the work_structs may be scheduled by the LED controller while the device is disconnecting, triggering use-after-frees. I can attach the reproducer, but it's very simple USB configuration, using the /dev/raw-gadget interface with some more USB interactions to manage LEDs configuration and pass checks in asus_kbd_init() and asus_kbd_get_functions() in case of hid-asus.c. I triggered the issue by connecting a device and immediately disconnecting it, so that the remove function runs before the LED one which remains pending. I am attaching multiple patches for asus and bigben drivers. The proposed patches introduce safe wrappers to schedule the workers safely with several spinlocks checks. I attach the (partial for brevity) ODEBUG dumps: ```hid-bigbenff.c [ 37.803135][ T1170] usb 1-1: USB disconnect, device number 2 [ 37.827979][ T1170] ODEBUG: free active (active state 0) object type: work_struct hint: bigben_worker+0x0/0x860 [ 37.829634][ T1170] WARNING: CPU: 0 PID: 1170 at lib/debugobjects.c:505 debug_check_no_obj_freed+0x43a/0x630 [ 37.830904][ T1170] Modules linked in: [ 37.831413][ T1170] CPU: 0 PID: 1170 Comm: kworker/0:3 Not tainted 6.1.0-rc4-dirty #43 [ 37.832465][ T1170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 37.833751][ T1170] Workqueue: usb_hub_wq hub_event [ 37.834409][ T1170] RIP: 0010:debug_check_no_obj_freed+0x43a/0x630 [ 37.835218][ T1170] Code: 48 89 ef e8 28 82 58 ff 49 8b 14 24 4c 8b 45 00 48 c7 c7 40 5f 09 87 48 c7 c6 60 5b 09 87 89 d9 4d 89 f9 31 c0 e8 46 25 ef fe <0f> 0b 4c 8b 64 24 20 48 ba 00 00 00 00 00 fc ff df ff 05 4f 7c 17 [ 37.837667][ T1170] RSP: 0018:ffffc900006fee60 EFLAGS: 00010246 [ 37.838503][ T1170] RAX: 0d2d19ffcded3d00 RBX: 0000000000000000 RCX: ffff888117fc9b00 [ 37.839519][ T1170] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 37.840570][ T1170] RBP: ffffffff86e88380 R08: ffffffff8130793b R09: fffff520000dfd85 [ 37.841618][ T1170] R10: fffff520000dfd85 R11: 0000000000000000 R12: ffffffff87095fb8 [ 37.842649][ T1170] R13: ffff888117770ad8 R14: ffff888117770acc R15: ffffffff852b7420 [ 37.843728][ T1170] FS: 0000000000000000(0000) GS:ffff8881f6600000(0000) knlGS:0000000000000000 [ 37.844877][ T1170] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 37.845749][ T1170] CR2: 00007f992eaab380 CR3: 000000011834b000 CR4: 00000000001006f0 [ 37.846794][ T1170] Call Trace: [ 37.847245][ T1170] <TASK> [ 37.847643][ T1170] slab_free_freelist_hook+0x89/0x160 [ 37.848409][ T1170] ? devres_release_all+0x262/0x350 [ 37.849156][ T1170] __kmem_cache_free+0x71/0x110 [ 37.849829][ T1170] devres_release_all+0x262/0x350 [ 37.850478][ T1170] ? devres_release+0x90/0x90 [ 37.851118][ T1170] device_release_driver_internal+0x5e5/0x8a0 [ 37.851944][ T1170] bus_remove_device+0x2ea/0x400 [ 37.852611][ T1170] device_del+0x64f/0xb40 [ 37.853212][ T1170] ? kill_device+0x150/0x150 [ 37.853831][ T1170] ? print_irqtrace_events+0x1f0/0x1f0 [ 37.854564][ T1170] hid_destroy_device+0x66/0x100 [ 37.855226][ T1170] usbhid_disconnect+0x9a/0xc0 [ 37.855887][ T1170] usb_unbind_interface+0x1e1/0x890 ``` ``` hid-asus.c [ 77.409878][ T1169] usb 1-1: USB disconnect, device number 2 [ 77.423606][ T1169] ODEBUG: free active (active state 0) object type: work_struct hint: asus_kbd_backlight_work+0x0/0x2c0 [ 77.425222][ T1169] WARNING: CPU: 0 PID: 1169 at lib/debugobjects.c:505 debug_check_no_obj_freed+0x43a/0x630 [ 77.426599][ T1169] Modules linked in: [ 77.427322][ T1169] CPU: 0 PID: 1169 Comm: kworker/0:3 Not tainted 6.1.0-rc4-dirty #43 [ 77.428404][ T1169] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 77.429644][ T1169] Workqueue: usb_hub_wq hub_event [ 77.430296][ T1169] RIP: 0010:debug_check_no_obj_freed+0x43a/0x630 [ 77.431142][ T1169] Code: 48 89 ef e8 28 82 58 ff 49 8b 14 24 4c 8b 45 00 48 c7 c7 40 5f 09 87 48 c7 c6 60 5b 09 87 89 d9 4d 89 f9 31 c0 e8 46 25 ef fe <0f> 0b 4c 8b 64 24 20 48 ba 00 00 00 00 00 fc ff df ff 05 4f 7c 17 [ 77.433691][ T1169] RSP: 0018:ffffc9000069ee60 EFLAGS: 00010246 [ 77.434470][ T1169] RAX: b85d2b40c12d7600 RBX: 0000000000000000 RCX: ffff888117a78000 [ 77.435507][ T1169] RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 [ 77.436521][ T1169] RBP: ffffffff86e88380 R08: ffffffff8130793b R09: ffffed103ecc4ed6 [ 77.437582][ T1169] R10: ffffed103ecc4ed6 R11: 0000000000000000 R12: ffffffff87095fb8 [ 77.438593][ T1169] R13: ffff88810e348fe0 R14: ffff88810e348fd4 R15: ffffffff852b5780 [ 77.439667][ T1169] FS: 0000000000000000(0000) GS:ffff8881f6600000(0000) knlGS:0000000000000000 [ 77.440842][ T1169] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 77.441688][ T1169] CR2: 00007ffc05495ff0 CR3: 000000010cdf0000 CR4: 00000000001006f0 [ 77.442720][ T1169] Call Trace: [ 77.443167][ T1169] <TASK> [ 77.443555][ T1169] slab_free_freelist_hook+0x89/0x160 [ 77.444302][ T1169] ? devres_release_all+0x262/0x350 [ 77.444990][ T1169] __kmem_cache_free+0x71/0x110 [ 77.445638][ T1169] devres_release_all+0x262/0x350 [ 77.446309][ T1169] ? devres_release+0x90/0x90 [ 77.446978][ T1169] device_release_driver_internal+0x5e5/0x8a0 [ 77.447748][ T1169] bus_remove_device+0x2ea/0x400 [ 77.448421][ T1169] device_del+0x64f/0xb40 [ 77.448976][ T1169] ? kill_device+0x150/0x150 [ 77.449577][ T1169] ? print_irqtrace_events+0x1f0/0x1f0 [ 77.450307][ T1169] hid_destroy_device+0x66/0x100 [ 77.450938][ T1169] usbhid_disconnect+0x9a/0xc0 ``` Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it> --- Changes in v3: - use spinlocks to prevent workers scheduling - drop patches on sony & playstation hid drivers - Link to v2: https://lore.kernel.org/r/20230125-hid-unregister-leds-v2-0-689cc62fc878@diag.uniroma1.it Changes in v2: - dualshock4: Clarify UAF - dualsense: Clarify UAF - dualsense: Unregister multicolor led controller - Link to v1: https://lore.kernel.org/r/20230125-hid-unregister-leds-v1-0-9a5192dcef16@diag.uniroma1.it --- Pietro Borrello (2): HID: bigben: use spinlock to safely schedule workers HID: asus: use spinlock to safely schedule workers drivers/hid/hid-asus.c | 24 +++++++++++++++++++++--- drivers/hid/hid-bigbenff.c | 34 +++++++++++++++++++++++++++++----- 2 files changed, 50 insertions(+), 8 deletions(-) --- base-commit: 2241ab53cbb5cdb08a6b2d4688feb13971058f65 change-id: 20230125-hid-unregister-leds-4cbf67099e1d Best regards,