Message ID | 20220430025749.2320824-1-junxiao.chang@intel.com |
---|---|
State | New |
Headers | show |
Series | video: fbdev: don't remove firmware fb device if it is busy | expand |
Hi Am 30.04.22 um 04:57 schrieb Junxiao Chang: > When firmware framebuffer is in use, don't remove its platform > device. Or else its fb_info buffer is released by platform remove > hook while device is still opened. This introduces use after free > issue. When exactly does this happen? Do you have a reliable way of reproducing it? Best regards Thomas > > A kernel panic example: > CPU: 2 PID: 3425 Comm: psplash Tainted: G U W 5.18.0-rc3 > Hardware name: Intel Client Platform/ADP-S DDR5 UDIMM CRB > RIP: 0010:native_queued_spin_lock_slowpath+0x1c7/0x210 > RSP: 0018:ffffb3a0c0c2fdb0 EFLAGS: 00010206 > RAX: 002dc074ff5c0988 RBX: ffff92e987a5d818 RCX: ffff92e989ba9f40 > RDX: 0000000000002067 RSI: ffffffff864344f1 RDI: ffffffff8644183c > RBP: ffff92f10f4abd40 R08: 0000000000000001 R09: ffff92e986dc2188 > ... > Call Trace: > <TASK> > _raw_spin_lock+0x2c/0x30 > __mutex_lock.constprop.0+0x175/0x4f0 > ? _raw_spin_unlock+0x15/0x30 > ? list_lru_add+0x124/0x160 > fb_release+0x1b/0x60 > __fput+0x89/0x240 > task_work_run+0x59/0x90 > do_exit+0x343/0xaf0 > do_group_exit+0x2d/0x90 > __x64_sys_exit_group+0x14/0x20 > do_syscall_64+0x40/0x90 > entry_SYSCALL_64_after_hwframe+0x44/0xae > > Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal") > Signed-off-by: Junxiao Chang <junxiao.chang@intel.com> > Signed-off-by: Lili Li <lili.li@intel.com> > --- > drivers/video/fbdev/core/fbmem.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c > index a6bb0e438216..ff9b9830b398 100644 > --- a/drivers/video/fbdev/core/fbmem.c > +++ b/drivers/video/fbdev/core/fbmem.c > @@ -1586,7 +1586,9 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a, > * framebuffer as before without warning. > */ > do_unregister_framebuffer(registered_fb[i]); > - } else if (dev_is_platform(device)) { > + } else if (dev_is_platform(device) && > + refcount_read(®istered_fb[i]->count) == 1) { > + /* Remove platform device if it is not in use. */ > registered_fb[i]->forced_out = true; > platform_device_unregister(to_platform_device(device)); > } else {
We tested Javier's fix, issue couldn't be reproduced anymore. https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u But this fix doesn't cover all FB driver's interface: .get_unmapped_area = get_fb_unmapped_area, .fsync = fb_deferred_io_fsync, file_fb_info(file) NULL checking should be added in these two interface functions(get_fb_unmapped_area and fb_deferred_io_fsync) as well? Regards, Junxiao -----Original Message----- From: Thomas Zimmermann <tzimmermann@suse.de> Sent: Tuesday, May 3, 2022 3:16 PM To: Chang, Junxiao <junxiao.chang@intel.com>; linux-fbdev@vger.kernel.org Cc: lethal@linux-sh.org; patchwork-bot@kernel.org; deller@gmx.de; Li, Lili <lili.li@intel.com>; Javier Martinez Canillas <javierm@redhat.com> Subject: Re: [PATCH] video: fbdev: don't remove firmware fb device if it is busy (cc'ing Javier) Hi Am 03.05.22 um 02:38 schrieb Chang, Junxiao: > Hi Thomas, > > We reproduced this issue with Yocto image + 5.18-rc3 kernel. > If you could try Yocto image and enable psplash, the problem could be reproduced almost every time if psplash is launched before Intel i915 registers 2nd framebuffer. > > During Yocto booting up, it launches psplash which opens EFI firmware framebuffer and plays animation. When it exits, fb_release in kernel might access memory which has been released. > > Overall process looks like: > 1. EFI registers framebuffer in efifb_probe, framebuffer_alloc is > called to allocate "struct fb_info" memory; 2. In userspace, psplash > is started to play boot animation, it opens framebuffer driver. In > fb_open function, it saves fb_info memory to file->private_data; 3. > When Intel i915 driver is probed, it registers 2nd framebuffer, and > will remove conflicting framebuffer; 4. In do_remove_conflicting_framebuffers, it calls "platform_device_unregister" to remove EFI platform framebuffer device; 5. In EFI framebuffer's efifb_remove function, it calls framebuffer_release to release "struct fb_info" memory which is still use and stored in file->private_data; 6. When psplash user space application exits, it calls fb_release function, and accesses to fb_info memory, but it has been released in step 5. Kernel panic happens. Thanks for the information. We only had a similar report about RPi devices, but we never encountered this problem before. > > Our patch is to check whether EFI framebuffer is opened at that time. If it is free(registered_fb[i]->count == 1), go ahead to remove EFI platform device. Or else, just unregister framebuffer to avoid kernel panic. Javier posted a possible fix for the root cause of this problem, but we're still looking for testers. If you have the opportunity, could you please test the patch at [1] and report back on the results. Best regards Thomas [1] https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u > > Thanks, > Junxiao > > -----Original Message----- > From: Thomas Zimmermann <tzimmermann@suse.de> > Sent: Monday, May 2, 2022 3:24 PM > To: Chang, Junxiao <junxiao.chang@intel.com>; > linux-fbdev@vger.kernel.org > Cc: lethal@linux-sh.org; patchwork-bot@kernel.org; deller@gmx.de; Li, > Lili <lili.li@intel.com> > Subject: Re: [PATCH] video: fbdev: don't remove firmware fb device if > it is busy > > Hi > > Am 30.04.22 um 04:57 schrieb Junxiao Chang: >> When firmware framebuffer is in use, don't remove its platform device. >> Or else its fb_info buffer is released by platform remove hook while >> device is still opened. This introduces use after free issue. > > When exactly does this happen? Do you have a reliable way of reproducing it? > > Best regards > Thomas > >> >> A kernel panic example: >> CPU: 2 PID: 3425 Comm: psplash Tainted: G U W 5.18.0-rc3 >> Hardware name: Intel Client Platform/ADP-S DDR5 UDIMM CRB >> RIP: 0010:native_queued_spin_lock_slowpath+0x1c7/0x210 >> RSP: 0018:ffffb3a0c0c2fdb0 EFLAGS: 00010206 >> RAX: 002dc074ff5c0988 RBX: ffff92e987a5d818 RCX: ffff92e989ba9f40 >> RDX: 0000000000002067 RSI: ffffffff864344f1 RDI: ffffffff8644183c >> RBP: ffff92f10f4abd40 R08: 0000000000000001 R09: ffff92e986dc2188 ... >> Call Trace: >> <TASK> >> _raw_spin_lock+0x2c/0x30 >> __mutex_lock.constprop.0+0x175/0x4f0 >> ? _raw_spin_unlock+0x15/0x30 >> ? list_lru_add+0x124/0x160 >> fb_release+0x1b/0x60 >> __fput+0x89/0x240 >> task_work_run+0x59/0x90 >> do_exit+0x343/0xaf0 >> do_group_exit+0x2d/0x90 >> __x64_sys_exit_group+0x14/0x20 >> do_syscall_64+0x40/0x90 >> entry_SYSCALL_64_after_hwframe+0x44/0xae >> >> Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced >> removal") >> Signed-off-by: Junxiao Chang <junxiao.chang@intel.com> >> Signed-off-by: Lili Li <lili.li@intel.com> >> --- >> drivers/video/fbdev/core/fbmem.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/video/fbdev/core/fbmem.c >> b/drivers/video/fbdev/core/fbmem.c >> index a6bb0e438216..ff9b9830b398 100644 >> --- a/drivers/video/fbdev/core/fbmem.c >> +++ b/drivers/video/fbdev/core/fbmem.c >> @@ -1586,7 +1586,9 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a, >> * framebuffer as before without warning. >> */ >> do_unregister_framebuffer(registered_fb[i]); >> - } else if (dev_is_platform(device)) { >> + } else if (dev_is_platform(device) && >> + refcount_read(®istered_fb[i]->count) == 1) { >> + /* Remove platform device if it is not in use. */ >> registered_fb[i]->forced_out = true; >> platform_device_unregister(to_platform_device(device)); >> } else { > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) > Geschäftsführer: Ivo Totev -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev
Hi Am 03.05.22 um 14:29 schrieb Chang, Junxiao: > We tested Javier's fix, issue couldn't be reproduced anymore. > https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u Thank you so much. Best regards Thomas > > But this fix doesn't cover all FB driver's interface: > > .get_unmapped_area = get_fb_unmapped_area, > .fsync = fb_deferred_io_fsync, > > file_fb_info(file) NULL checking should be added in these two interface functions(get_fb_unmapped_area and fb_deferred_io_fsync) as well? > > Regards, > Junxiao > > -----Original Message----- > From: Thomas Zimmermann <tzimmermann@suse.de> > Sent: Tuesday, May 3, 2022 3:16 PM > To: Chang, Junxiao <junxiao.chang@intel.com>; linux-fbdev@vger.kernel.org > Cc: lethal@linux-sh.org; patchwork-bot@kernel.org; deller@gmx.de; Li, Lili <lili.li@intel.com>; Javier Martinez Canillas <javierm@redhat.com> > Subject: Re: [PATCH] video: fbdev: don't remove firmware fb device if it is busy > > (cc'ing Javier) > > Hi > > Am 03.05.22 um 02:38 schrieb Chang, Junxiao: >> Hi Thomas, >> >> We reproduced this issue with Yocto image + 5.18-rc3 kernel. >> If you could try Yocto image and enable psplash, the problem could be reproduced almost every time if psplash is launched before Intel i915 registers 2nd framebuffer. >> >> During Yocto booting up, it launches psplash which opens EFI firmware framebuffer and plays animation. When it exits, fb_release in kernel might access memory which has been released. >> >> Overall process looks like: >> 1. EFI registers framebuffer in efifb_probe, framebuffer_alloc is >> called to allocate "struct fb_info" memory; 2. In userspace, psplash >> is started to play boot animation, it opens framebuffer driver. In >> fb_open function, it saves fb_info memory to file->private_data; 3. >> When Intel i915 driver is probed, it registers 2nd framebuffer, and >> will remove conflicting framebuffer; 4. In do_remove_conflicting_framebuffers, it calls "platform_device_unregister" to remove EFI platform framebuffer device; 5. In EFI framebuffer's efifb_remove function, it calls framebuffer_release to release "struct fb_info" memory which is still use and stored in file->private_data; 6. When psplash user space application exits, it calls fb_release function, and accesses to fb_info memory, but it has been released in step 5. Kernel panic happens. > > Thanks for the information. We only had a similar report about RPi devices, but we never encountered this problem before. > >> >> Our patch is to check whether EFI framebuffer is opened at that time. If it is free(registered_fb[i]->count == 1), go ahead to remove EFI platform device. Or else, just unregister framebuffer to avoid kernel panic. > > Javier posted a possible fix for the root cause of this problem, but we're still looking for testers. If you have the opportunity, could you > please test the patch at [1] and report back on the results. > > Best regards > Thomas > > [1] > https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u > >> >> Thanks, >> Junxiao >> >> -----Original Message----- >> From: Thomas Zimmermann <tzimmermann@suse.de> >> Sent: Monday, May 2, 2022 3:24 PM >> To: Chang, Junxiao <junxiao.chang@intel.com>; >> linux-fbdev@vger.kernel.org >> Cc: lethal@linux-sh.org; patchwork-bot@kernel.org; deller@gmx.de; Li, >> Lili <lili.li@intel.com> >> Subject: Re: [PATCH] video: fbdev: don't remove firmware fb device if >> it is busy >> >> Hi >> >> Am 30.04.22 um 04:57 schrieb Junxiao Chang: >>> When firmware framebuffer is in use, don't remove its platform device. >>> Or else its fb_info buffer is released by platform remove hook while >>> device is still opened. This introduces use after free issue. >> >> When exactly does this happen? Do you have a reliable way of reproducing it? >> >> Best regards >> Thomas >> >>> >>> A kernel panic example: >>> CPU: 2 PID: 3425 Comm: psplash Tainted: G U W 5.18.0-rc3 >>> Hardware name: Intel Client Platform/ADP-S DDR5 UDIMM CRB >>> RIP: 0010:native_queued_spin_lock_slowpath+0x1c7/0x210 >>> RSP: 0018:ffffb3a0c0c2fdb0 EFLAGS: 00010206 >>> RAX: 002dc074ff5c0988 RBX: ffff92e987a5d818 RCX: ffff92e989ba9f40 >>> RDX: 0000000000002067 RSI: ffffffff864344f1 RDI: ffffffff8644183c >>> RBP: ffff92f10f4abd40 R08: 0000000000000001 R09: ffff92e986dc2188 ... >>> Call Trace: >>> <TASK> >>> _raw_spin_lock+0x2c/0x30 >>> __mutex_lock.constprop.0+0x175/0x4f0 >>> ? _raw_spin_unlock+0x15/0x30 >>> ? list_lru_add+0x124/0x160 >>> fb_release+0x1b/0x60 >>> __fput+0x89/0x240 >>> task_work_run+0x59/0x90 >>> do_exit+0x343/0xaf0 >>> do_group_exit+0x2d/0x90 >>> __x64_sys_exit_group+0x14/0x20 >>> do_syscall_64+0x40/0x90 >>> entry_SYSCALL_64_after_hwframe+0x44/0xae >>> >>> Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced >>> removal") >>> Signed-off-by: Junxiao Chang <junxiao.chang@intel.com> >>> Signed-off-by: Lili Li <lili.li@intel.com> >>> --- >>> drivers/video/fbdev/core/fbmem.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/video/fbdev/core/fbmem.c >>> b/drivers/video/fbdev/core/fbmem.c >>> index a6bb0e438216..ff9b9830b398 100644 >>> --- a/drivers/video/fbdev/core/fbmem.c >>> +++ b/drivers/video/fbdev/core/fbmem.c >>> @@ -1586,7 +1586,9 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a, >>> * framebuffer as before without warning. >>> */ >>> do_unregister_framebuffer(registered_fb[i]); >>> - } else if (dev_is_platform(device)) { >>> + } else if (dev_is_platform(device) && >>> + refcount_read(®istered_fb[i]->count) == 1) { >>> + /* Remove platform device if it is not in use. */ >>> registered_fb[i]->forced_out = true; >>> platform_device_unregister(to_platform_device(device)); >>> } else { >> >> -- >> Thomas Zimmermann >> Graphics Driver Developer >> SUSE Software Solutions Germany GmbH >> Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) >> Geschäftsführer: Ivo Totev > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Maxfeldstr. 5, 90409 Nürnberg, Germany > (HRB 36809, AG Nürnberg) > Geschäftsführer: Ivo Totev
Hello Junxiao, On 5/3/22 14:29, Chang, Junxiao wrote: > We tested Javier's fix, issue couldn't be reproduced anymore. > https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u > Thanks for testing! > But this fix doesn't cover all FB driver's interface: > > .get_unmapped_area = get_fb_unmapped_area, > .fsync = fb_deferred_io_fsync, > > file_fb_info(file) NULL checking should be added in these two interface functions(get_fb_unmapped_area and fb_deferred_io_fsync) as well? > Yes, I was about to send a new version adding checks for this but I decided not to. Because by looking at these file ops, I see get_fb_unmapped_area() is only used when: #if defined(HAVE_ARCH_FB_UNMAPPED_AREA) || \ (defined(CONFIG_FB_PROVIDE_GET_FB_UNMAPPED_AREA) && \ !defined(CONFIG_MMU)) .get_unmapped_area = get_fb_unmapped_area, #endif so is not really a common configuration and fb_deferred_io_fsync() is not defined in the same compilation unit, which means that file_fb_info() would have to be exported (and probably renamed to fb_file_fb_info or something), which will make the fix more complex and harder to backport to stable. The fb_release() handler bug in the other hand is quite easy to trigger, so I'll just keep this fix with the minimum change. I plan to fix the other two handlers though in a separate patch. -- Best regards, Javier Martinez Canillas Linux Engineering Red Hat
diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c index a6bb0e438216..ff9b9830b398 100644 --- a/drivers/video/fbdev/core/fbmem.c +++ b/drivers/video/fbdev/core/fbmem.c @@ -1586,7 +1586,9 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a, * framebuffer as before without warning. */ do_unregister_framebuffer(registered_fb[i]); - } else if (dev_is_platform(device)) { + } else if (dev_is_platform(device) && + refcount_read(®istered_fb[i]->count) == 1) { + /* Remove platform device if it is not in use. */ registered_fb[i]->forced_out = true; platform_device_unregister(to_platform_device(device)); } else {