diff mbox series

video: fbdev: don't remove firmware fb device if it is busy

Message ID 20220430025749.2320824-1-junxiao.chang@intel.com
State New
Headers show
Series video: fbdev: don't remove firmware fb device if it is busy | expand

Commit Message

Junxiao Chang April 30, 2022, 2:57 a.m. UTC
When firmware framebuffer is in use, don't remove its platform
device. Or else its fb_info buffer is released by platform remove
hook while device is still opened. This introduces use after free
issue.

A kernel panic example:
CPU: 2 PID: 3425 Comm: psplash Tainted: G     U  W     5.18.0-rc3
Hardware name: Intel Client Platform/ADP-S DDR5 UDIMM CRB
RIP: 0010:native_queued_spin_lock_slowpath+0x1c7/0x210
RSP: 0018:ffffb3a0c0c2fdb0 EFLAGS: 00010206
RAX: 002dc074ff5c0988 RBX: ffff92e987a5d818 RCX: ffff92e989ba9f40
RDX: 0000000000002067 RSI: ffffffff864344f1 RDI: ffffffff8644183c
RBP: ffff92f10f4abd40 R08: 0000000000000001 R09: ffff92e986dc2188
...
Call Trace:
 <TASK>
 _raw_spin_lock+0x2c/0x30
 __mutex_lock.constprop.0+0x175/0x4f0
 ? _raw_spin_unlock+0x15/0x30
 ? list_lru_add+0x124/0x160
 fb_release+0x1b/0x60
 __fput+0x89/0x240
 task_work_run+0x59/0x90
 do_exit+0x343/0xaf0
 do_group_exit+0x2d/0x90
 __x64_sys_exit_group+0x14/0x20
 do_syscall_64+0x40/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal")
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Signed-off-by: Lili Li <lili.li@intel.com>
---
 drivers/video/fbdev/core/fbmem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Thomas Zimmermann May 2, 2022, 7:24 a.m. UTC | #1
Hi

Am 30.04.22 um 04:57 schrieb Junxiao Chang:
> When firmware framebuffer is in use, don't remove its platform
> device. Or else its fb_info buffer is released by platform remove
> hook while device is still opened. This introduces use after free
> issue.

When exactly does this happen? Do you have a reliable way of reproducing it?

Best regards
Thomas

> 
> A kernel panic example:
> CPU: 2 PID: 3425 Comm: psplash Tainted: G     U  W     5.18.0-rc3
> Hardware name: Intel Client Platform/ADP-S DDR5 UDIMM CRB
> RIP: 0010:native_queued_spin_lock_slowpath+0x1c7/0x210
> RSP: 0018:ffffb3a0c0c2fdb0 EFLAGS: 00010206
> RAX: 002dc074ff5c0988 RBX: ffff92e987a5d818 RCX: ffff92e989ba9f40
> RDX: 0000000000002067 RSI: ffffffff864344f1 RDI: ffffffff8644183c
> RBP: ffff92f10f4abd40 R08: 0000000000000001 R09: ffff92e986dc2188
> ...
> Call Trace:
>   <TASK>
>   _raw_spin_lock+0x2c/0x30
>   __mutex_lock.constprop.0+0x175/0x4f0
>   ? _raw_spin_unlock+0x15/0x30
>   ? list_lru_add+0x124/0x160
>   fb_release+0x1b/0x60
>   __fput+0x89/0x240
>   task_work_run+0x59/0x90
>   do_exit+0x343/0xaf0
>   do_group_exit+0x2d/0x90
>   __x64_sys_exit_group+0x14/0x20
>   do_syscall_64+0x40/0x90
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal")
> Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
> Signed-off-by: Lili Li <lili.li@intel.com>
> ---
>   drivers/video/fbdev/core/fbmem.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> index a6bb0e438216..ff9b9830b398 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -1586,7 +1586,9 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
>   				 * framebuffer as before without warning.
>   				 */
>   				do_unregister_framebuffer(registered_fb[i]);
> -			} else if (dev_is_platform(device)) {
> +			} else if (dev_is_platform(device) &&
> +					refcount_read(&registered_fb[i]->count) == 1) {
> +				/* Remove platform device if it is not in use. */
>   				registered_fb[i]->forced_out = true;
>   				platform_device_unregister(to_platform_device(device));
>   			} else {
Junxiao Chang May 3, 2022, 12:29 p.m. UTC | #2
We tested Javier's fix, issue couldn't be reproduced anymore.
https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u

But this fix doesn't cover all FB driver's interface:

	.get_unmapped_area = get_fb_unmapped_area,
	.fsync =	fb_deferred_io_fsync,

file_fb_info(file) NULL checking should be added in these two interface functions(get_fb_unmapped_area and fb_deferred_io_fsync) as well?

Regards,
Junxiao

-----Original Message-----
From: Thomas Zimmermann <tzimmermann@suse.de> 
Sent: Tuesday, May 3, 2022 3:16 PM
To: Chang, Junxiao <junxiao.chang@intel.com>; linux-fbdev@vger.kernel.org
Cc: lethal@linux-sh.org; patchwork-bot@kernel.org; deller@gmx.de; Li, Lili <lili.li@intel.com>; Javier Martinez Canillas <javierm@redhat.com>
Subject: Re: [PATCH] video: fbdev: don't remove firmware fb device if it is busy

(cc'ing Javier)

Hi

Am 03.05.22 um 02:38 schrieb Chang, Junxiao:
> Hi Thomas,
> 
> We reproduced this issue with Yocto image + 5.18-rc3 kernel.
> If you could try Yocto image and enable psplash, the problem could be reproduced almost every time if psplash is launched before Intel i915 registers 2nd framebuffer.
> 
> During Yocto booting up, it launches psplash which opens EFI firmware framebuffer and plays animation. When it exits, fb_release in kernel might access memory which has been released.
> 
> Overall process looks like:
> 1. EFI registers framebuffer in efifb_probe, framebuffer_alloc is 
> called to allocate "struct fb_info" memory; 2. In userspace, psplash 
> is started to play boot animation, it opens framebuffer driver. In 
> fb_open function, it saves fb_info memory to file->private_data; 3. 
> When Intel i915 driver is probed, it registers 2nd framebuffer, and 
> will remove conflicting framebuffer; 4. In do_remove_conflicting_framebuffers, it calls "platform_device_unregister" to remove EFI platform framebuffer device; 5. In EFI framebuffer's efifb_remove function, it calls framebuffer_release to release "struct fb_info" memory which is still use and stored in file->private_data; 6. When psplash user space application exits, it calls fb_release function, and accesses to fb_info memory, but it has been released in step 5. Kernel panic happens.

Thanks for the information. We only had a similar report about RPi devices, but we never encountered this problem before.

> 
> Our patch is to check whether EFI framebuffer is opened at that time. If it is free(registered_fb[i]->count == 1), go ahead to remove EFI platform device. Or else, just unregister framebuffer to avoid kernel panic.

Javier posted a possible fix for the root cause of this problem, but we're still looking for testers. If you have the opportunity, could you
  please test the patch at [1] and report back on the results.

Best regards
Thomas

[1]
https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u

> 
> Thanks,
> Junxiao
> 
> -----Original Message-----
> From: Thomas Zimmermann <tzimmermann@suse.de>
> Sent: Monday, May 2, 2022 3:24 PM
> To: Chang, Junxiao <junxiao.chang@intel.com>; 
> linux-fbdev@vger.kernel.org
> Cc: lethal@linux-sh.org; patchwork-bot@kernel.org; deller@gmx.de; Li, 
> Lili <lili.li@intel.com>
> Subject: Re: [PATCH] video: fbdev: don't remove firmware fb device if 
> it is busy
> 
> Hi
> 
> Am 30.04.22 um 04:57 schrieb Junxiao Chang:
>> When firmware framebuffer is in use, don't remove its platform device.
>> Or else its fb_info buffer is released by platform remove hook while 
>> device is still opened. This introduces use after free issue.
> 
> When exactly does this happen? Do you have a reliable way of reproducing it?
> 
> Best regards
> Thomas
> 
>>
>> A kernel panic example:
>> CPU: 2 PID: 3425 Comm: psplash Tainted: G     U  W     5.18.0-rc3
>> Hardware name: Intel Client Platform/ADP-S DDR5 UDIMM CRB
>> RIP: 0010:native_queued_spin_lock_slowpath+0x1c7/0x210
>> RSP: 0018:ffffb3a0c0c2fdb0 EFLAGS: 00010206
>> RAX: 002dc074ff5c0988 RBX: ffff92e987a5d818 RCX: ffff92e989ba9f40
>> RDX: 0000000000002067 RSI: ffffffff864344f1 RDI: ffffffff8644183c
>> RBP: ffff92f10f4abd40 R08: 0000000000000001 R09: ffff92e986dc2188 ...
>> Call Trace:
>>    <TASK>
>>    _raw_spin_lock+0x2c/0x30
>>    __mutex_lock.constprop.0+0x175/0x4f0
>>    ? _raw_spin_unlock+0x15/0x30
>>    ? list_lru_add+0x124/0x160
>>    fb_release+0x1b/0x60
>>    __fput+0x89/0x240
>>    task_work_run+0x59/0x90
>>    do_exit+0x343/0xaf0
>>    do_group_exit+0x2d/0x90
>>    __x64_sys_exit_group+0x14/0x20
>>    do_syscall_64+0x40/0x90
>>    entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced
>> removal")
>> Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
>> Signed-off-by: Lili Li <lili.li@intel.com>
>> ---
>>    drivers/video/fbdev/core/fbmem.c | 4 +++-
>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/video/fbdev/core/fbmem.c
>> b/drivers/video/fbdev/core/fbmem.c
>> index a6bb0e438216..ff9b9830b398 100644
>> --- a/drivers/video/fbdev/core/fbmem.c
>> +++ b/drivers/video/fbdev/core/fbmem.c
>> @@ -1586,7 +1586,9 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
>>    				 * framebuffer as before without warning.
>>    				 */
>>    				do_unregister_framebuffer(registered_fb[i]);
>> -			} else if (dev_is_platform(device)) {
>> +			} else if (dev_is_platform(device) &&
>> +					refcount_read(&registered_fb[i]->count) == 1) {
>> +				/* Remove platform device if it is not in use. */
>>    				registered_fb[i]->forced_out = true;
>>    				platform_device_unregister(to_platform_device(device));
>>    			} else {
> 
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg)
> Geschäftsführer: Ivo Totev

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
Thomas Zimmermann May 3, 2022, 1:05 p.m. UTC | #3
Hi

Am 03.05.22 um 14:29 schrieb Chang, Junxiao:
> We tested Javier's fix, issue couldn't be reproduced anymore.
> https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u

Thank you so much.

Best regards
Thomas

> 
> But this fix doesn't cover all FB driver's interface:
> 
> 	.get_unmapped_area = get_fb_unmapped_area,
> 	.fsync =	fb_deferred_io_fsync,
> 
> file_fb_info(file) NULL checking should be added in these two interface functions(get_fb_unmapped_area and fb_deferred_io_fsync) as well?
> 
> Regards,
> Junxiao
> 
> -----Original Message-----
> From: Thomas Zimmermann <tzimmermann@suse.de>
> Sent: Tuesday, May 3, 2022 3:16 PM
> To: Chang, Junxiao <junxiao.chang@intel.com>; linux-fbdev@vger.kernel.org
> Cc: lethal@linux-sh.org; patchwork-bot@kernel.org; deller@gmx.de; Li, Lili <lili.li@intel.com>; Javier Martinez Canillas <javierm@redhat.com>
> Subject: Re: [PATCH] video: fbdev: don't remove firmware fb device if it is busy
> 
> (cc'ing Javier)
> 
> Hi
> 
> Am 03.05.22 um 02:38 schrieb Chang, Junxiao:
>> Hi Thomas,
>>
>> We reproduced this issue with Yocto image + 5.18-rc3 kernel.
>> If you could try Yocto image and enable psplash, the problem could be reproduced almost every time if psplash is launched before Intel i915 registers 2nd framebuffer.
>>
>> During Yocto booting up, it launches psplash which opens EFI firmware framebuffer and plays animation. When it exits, fb_release in kernel might access memory which has been released.
>>
>> Overall process looks like:
>> 1. EFI registers framebuffer in efifb_probe, framebuffer_alloc is
>> called to allocate "struct fb_info" memory; 2. In userspace, psplash
>> is started to play boot animation, it opens framebuffer driver. In
>> fb_open function, it saves fb_info memory to file->private_data; 3.
>> When Intel i915 driver is probed, it registers 2nd framebuffer, and
>> will remove conflicting framebuffer; 4. In do_remove_conflicting_framebuffers, it calls "platform_device_unregister" to remove EFI platform framebuffer device; 5. In EFI framebuffer's efifb_remove function, it calls framebuffer_release to release "struct fb_info" memory which is still use and stored in file->private_data; 6. When psplash user space application exits, it calls fb_release function, and accesses to fb_info memory, but it has been released in step 5. Kernel panic happens.
> 
> Thanks for the information. We only had a similar report about RPi devices, but we never encountered this problem before.
> 
>>
>> Our patch is to check whether EFI framebuffer is opened at that time. If it is free(registered_fb[i]->count == 1), go ahead to remove EFI platform device. Or else, just unregister framebuffer to avoid kernel panic.
> 
> Javier posted a possible fix for the root cause of this problem, but we're still looking for testers. If you have the opportunity, could you
>    please test the patch at [1] and report back on the results.
> 
> Best regards
> Thomas
> 
> [1]
> https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u
> 
>>
>> Thanks,
>> Junxiao
>>
>> -----Original Message-----
>> From: Thomas Zimmermann <tzimmermann@suse.de>
>> Sent: Monday, May 2, 2022 3:24 PM
>> To: Chang, Junxiao <junxiao.chang@intel.com>;
>> linux-fbdev@vger.kernel.org
>> Cc: lethal@linux-sh.org; patchwork-bot@kernel.org; deller@gmx.de; Li,
>> Lili <lili.li@intel.com>
>> Subject: Re: [PATCH] video: fbdev: don't remove firmware fb device if
>> it is busy
>>
>> Hi
>>
>> Am 30.04.22 um 04:57 schrieb Junxiao Chang:
>>> When firmware framebuffer is in use, don't remove its platform device.
>>> Or else its fb_info buffer is released by platform remove hook while
>>> device is still opened. This introduces use after free issue.
>>
>> When exactly does this happen? Do you have a reliable way of reproducing it?
>>
>> Best regards
>> Thomas
>>
>>>
>>> A kernel panic example:
>>> CPU: 2 PID: 3425 Comm: psplash Tainted: G     U  W     5.18.0-rc3
>>> Hardware name: Intel Client Platform/ADP-S DDR5 UDIMM CRB
>>> RIP: 0010:native_queued_spin_lock_slowpath+0x1c7/0x210
>>> RSP: 0018:ffffb3a0c0c2fdb0 EFLAGS: 00010206
>>> RAX: 002dc074ff5c0988 RBX: ffff92e987a5d818 RCX: ffff92e989ba9f40
>>> RDX: 0000000000002067 RSI: ffffffff864344f1 RDI: ffffffff8644183c
>>> RBP: ffff92f10f4abd40 R08: 0000000000000001 R09: ffff92e986dc2188 ...
>>> Call Trace:
>>>     <TASK>
>>>     _raw_spin_lock+0x2c/0x30
>>>     __mutex_lock.constprop.0+0x175/0x4f0
>>>     ? _raw_spin_unlock+0x15/0x30
>>>     ? list_lru_add+0x124/0x160
>>>     fb_release+0x1b/0x60
>>>     __fput+0x89/0x240
>>>     task_work_run+0x59/0x90
>>>     do_exit+0x343/0xaf0
>>>     do_group_exit+0x2d/0x90
>>>     __x64_sys_exit_group+0x14/0x20
>>>     do_syscall_64+0x40/0x90
>>>     entry_SYSCALL_64_after_hwframe+0x44/0xae
>>>
>>> Fixes: 27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced
>>> removal")
>>> Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
>>> Signed-off-by: Lili Li <lili.li@intel.com>
>>> ---
>>>     drivers/video/fbdev/core/fbmem.c | 4 +++-
>>>     1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/video/fbdev/core/fbmem.c
>>> b/drivers/video/fbdev/core/fbmem.c
>>> index a6bb0e438216..ff9b9830b398 100644
>>> --- a/drivers/video/fbdev/core/fbmem.c
>>> +++ b/drivers/video/fbdev/core/fbmem.c
>>> @@ -1586,7 +1586,9 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
>>>     				 * framebuffer as before without warning.
>>>     				 */
>>>     				do_unregister_framebuffer(registered_fb[i]);
>>> -			} else if (dev_is_platform(device)) {
>>> +			} else if (dev_is_platform(device) &&
>>> +					refcount_read(&registered_fb[i]->count) == 1) {
>>> +				/* Remove platform device if it is not in use. */
>>>     				registered_fb[i]->forced_out = true;
>>>     				platform_device_unregister(to_platform_device(device));
>>>     			} else {
>>
>> --
>> Thomas Zimmermann
>> Graphics Driver Developer
>> SUSE Software Solutions Germany GmbH
>> Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg)
>> Geschäftsführer: Ivo Totev
> 
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Ivo Totev
Javier Martinez Canillas May 3, 2022, 3:15 p.m. UTC | #4
Hello Junxiao,

On 5/3/22 14:29, Chang, Junxiao wrote:
> We tested Javier's fix, issue couldn't be reproduced anymore.
> https://lore.kernel.org/dri-devel/20220502135014.377945-1-javierm@redhat.com/T/#u
> 

Thanks for testing!

> But this fix doesn't cover all FB driver's interface:
> 
> 	.get_unmapped_area = get_fb_unmapped_area,
> 	.fsync =	fb_deferred_io_fsync,
> 
> file_fb_info(file) NULL checking should be added in these two interface functions(get_fb_unmapped_area and fb_deferred_io_fsync) as well?
>

Yes, I was about to send a new version adding checks for this but I decided
not to. Because by looking at these file ops, I see get_fb_unmapped_area()
is only used when:

#if defined(HAVE_ARCH_FB_UNMAPPED_AREA) || \
	(defined(CONFIG_FB_PROVIDE_GET_FB_UNMAPPED_AREA) && \
	 !defined(CONFIG_MMU))
	.get_unmapped_area = get_fb_unmapped_area,
#endif

so is not really a common configuration and fb_deferred_io_fsync() is not
defined in the same compilation unit, which means that file_fb_info() would
have to be exported (and probably renamed to fb_file_fb_info or something),
which will make the fix more complex and harder to backport to stable.

The fb_release() handler bug in the other hand is quite easy to trigger,
so I'll just keep this fix with the minimum change.

I plan to fix the other two handlers though in a separate patch.
 -- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat
diff mbox series

Patch

diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
index a6bb0e438216..ff9b9830b398 100644
--- a/drivers/video/fbdev/core/fbmem.c
+++ b/drivers/video/fbdev/core/fbmem.c
@@ -1586,7 +1586,9 @@  static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
 				 * framebuffer as before without warning.
 				 */
 				do_unregister_framebuffer(registered_fb[i]);
-			} else if (dev_is_platform(device)) {
+			} else if (dev_is_platform(device) &&
+					refcount_read(&registered_fb[i]->count) == 1) {
+				/* Remove platform device if it is not in use. */
 				registered_fb[i]->forced_out = true;
 				platform_device_unregister(to_platform_device(device));
 			} else {