diff mbox series

[2/2] padata: fix UAF in padata_reorder

Message ID 20241123080509.2573987-3-chenridong@huaweicloud.com
State New
Series padata: fix UAF in padata_reorder

Commit Message

Chen Ridong Nov. 23, 2024, 8:05 a.m. UTC
From: Chen Ridong <chenridong@huawei.com>

A bug was found when running the LTP test:

BUG: KASAN: slab-use-after-free in padata_find_next+0x29/0x1a0
Read of size 4 at addr ffff88bbfe003524 by task kworker/u113:2/3039206

CPU: 0 PID: 3039206 Comm: kworker/u113:2 Kdump: loaded Not tainted 6.6.0+
Workqueue: pdecrypt_parallel padata_parallel_worker
Call Trace:
<TASK>
dump_stack_lvl+0x32/0x50
print_address_description.constprop.0+0x6b/0x3d0
print_report+0xdd/0x2c0
kasan_report+0xa5/0xd0
padata_find_next+0x29/0x1a0
padata_reorder+0x131/0x220
padata_parallel_worker+0x3d/0xc0
process_one_work+0x2ec/0x5a0

If 'mdelay(10)' is added before calling 'padata_find_next' in the
'padata_reorder' function, this issue can be reproduced easily with the
LTP test (pcrypt_aead01).

This can be explained as below:

pcrypt_aead_encrypt
...
padata_do_parallel
refcount_inc(&pd->refcnt); // add refcnt
...
padata_do_serial
padata_reorder // pd
while (1) {
padata_find_next(pd, true); // using pd
queue_work_on
...
padata_serial_worker				crypto_del_alg
padata_put_pd_cnt // sub refcnt
						padata_free_shell
						padata_put_pd(ps->pd);
						// pd is freed
// loop again, but pd is freed
// call padata_find_next, UAF
}

In the padata_reorder function, while it loops in the 'while' statement,
if the alg is deleted, the refcnt may be decreased to 0 before
'padata_find_next' is entered, which leads to a UAF. To fix this issue,
take a reference on pd in padata_reorder to avoid the UAF.

Fixes: b128a3040935 ("padata: allocate workqueue internally")
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Qu Zicheng <quzicheng@huawei.com>
---
 kernel/padata.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Daniel Jordan Dec. 5, 2024, 11:01 p.m. UTC | #1
Hello Ridong,

On Sat, Nov 23, 2024 at 08:05:09AM +0000, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
> 
> A bug was found when run ltp test:
...snip...
> This can be explained as bellow:
> 
> pcrypt_aead_encrypt
> ...
> padata_do_parallel
> refcount_inc(&pd->refcnt); // add refcnt
> ...
> padata_do_serial
> padata_reorder // pd
> while (1) {
> padata_find_next(pd, true); // using pd
> queue_work_on
> ...
> padata_serial_worker				crypto_del_alg
> padata_put_pd_cnt // sub refcnt
> 						padata_free_shell
> 						padata_put_pd(ps->pd);
> 						// pd is freed
> // loop again, but pd is freed
> // call padata_find_next, UAF
> }

Thanks for the fix and clear explanation.

> diff --git a/kernel/padata.c b/kernel/padata.c
> index 5d8e18cdcb25..627014825266 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -319,6 +319,7 @@ static void padata_reorder(struct parallel_data *pd)
>  	if (!spin_trylock_bh(&pd->lock))
>  		return;
>  
> +	padata_get_pd(pd);
>  	while (1) {
>  		padata = padata_find_next(pd, true);
>  
> @@ -355,6 +356,7 @@ static void padata_reorder(struct parallel_data *pd)
>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
>  		queue_work(pinst->serial_wq, &pd->reorder_work);
> +	padata_put_pd(pd);

Putting the ref unconditionally here doesn't cover the case where reorder_work
is queued and accesses the freed pd.

The review of patches 3-5 from this series has a potential solution for
this that also keeps some of these refcount operations out of the fast
path:

    https://lore.kernel.org/all/20221019083708.27138-1-nstange@suse.de/
chenridong Dec. 6, 2024, 3:48 a.m. UTC | #2
On 2024/12/6 7:01, Daniel Jordan wrote:
> Hello Ridong,
> 
> On Sat, Nov 23, 2024 at 08:05:09AM +0000, Chen Ridong wrote:
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> A bug was found when run ltp test:
> ...snip...
>> This can be explained as bellow:
>>
>> pcrypt_aead_encrypt
>> ...
>> padata_do_parallel
>> refcount_inc(&pd->refcnt); // add refcnt
>> ...
>> padata_do_serial
>> padata_reorder // pd
>> while (1) {
>> padata_find_next(pd, true); // using pd
>> queue_work_on
>> ...
>> padata_serial_worker				crypto_del_alg
>> padata_put_pd_cnt // sub refcnt
>> 						padata_free_shell
>> 						padata_put_pd(ps->pd);
>> 						// pd is freed
>> // loop again, but pd is freed
>> // call padata_find_next, UAF
>> }
> 
> Thanks for the fix and clear explanation.
> 
>> diff --git a/kernel/padata.c b/kernel/padata.c
>> index 5d8e18cdcb25..627014825266 100644
>> --- a/kernel/padata.c
>> +++ b/kernel/padata.c
>> @@ -319,6 +319,7 @@ static void padata_reorder(struct parallel_data *pd)
>>  	if (!spin_trylock_bh(&pd->lock))
>>  		return;
>>  
>> +	padata_get_pd(pd);
>>  	while (1) {
>>  		padata = padata_find_next(pd, true);
>>  
>> @@ -355,6 +356,7 @@ static void padata_reorder(struct parallel_data *pd)
>>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
>>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
>>  		queue_work(pinst->serial_wq, &pd->reorder_work);
>> +	padata_put_pd(pd);
> 
> Putting the ref unconditionally here doesn't cover the case where reorder_work
> is queued and accesses the freed pd.
> 
> The review of patches 3-5 from this series has a potential solution for
> this that also keeps some of these refcount operations out of the fast
> path:
> 
>     https://lore.kernel.org/all/20221019083708.27138-1-nstange@suse.de/
> 

Thank you for your review.

IIUC, patches 3-5 from this series aim to fix two issues:
1. Avoid UAF for pd (patch 3).
2. Avoid UAF for ps (patches 4-5).
What my patch 2 intends to fix is issue 1.

Let's focus on issue 1.
As shown below, if reorder_work is queued, the refcnt must be greater
than 0, since its serial work has not finished yet. Do you agree with that?

pcrypt_aead_encrypt/pcrypt_aead_decrypt
padata_do_parallel 			// refcount_inc(&pd->refcnt);
padata_parallel_worker	
padata->parallel(padata);
padata_do_serial(padata);		
// pd->reorder_list 			// enqueue on reorder_list
padata_reorder
 - case1: squeue->work
	padata_serial_worker		// sub refcnt
 - case2: pd->reorder_work		// reorder->list is not empty
	invoke_padata_reorder 		// this means refcnt > 0
	...
	padata_serial_worker

I think patch 3 (from Nicolai Stange) can also avoid the UAF for pd, but
it's complicated. IIUC, issue 1 can only occur in the scenario that I
mentioned in my commit message. I fix issue 1 by taking and putting the
refcnt in the padata_reorder function, which is simple and clear.

If I have misunderstood something, please let me know.

As for issue 2, I have not encountered it, but it exists in theory.

Thanks,
Ridong
Chen Ridong Dec. 10, 2024, 9:38 a.m. UTC | #3
On 2024/12/6 11:48, chenridong wrote:
> 
> 
> On 2024/12/6 7:01, Daniel Jordan wrote:
>> Hello Ridong,
>>
>> On Sat, Nov 23, 2024 at 08:05:09AM +0000, Chen Ridong wrote:
>>> From: Chen Ridong <chenridong@huawei.com>
>>>
>>> A bug was found when run ltp test:
>> ...snip...
>>> This can be explained as bellow:
>>>
>>> pcrypt_aead_encrypt
>>> ...
>>> padata_do_parallel
>>> refcount_inc(&pd->refcnt); // add refcnt
>>> ...
>>> padata_do_serial
>>> padata_reorder // pd
>>> while (1) {
>>> padata_find_next(pd, true); // using pd
>>> queue_work_on
>>> ...
>>> padata_serial_worker				crypto_del_alg
>>> padata_put_pd_cnt // sub refcnt
>>> 						padata_free_shell
>>> 						padata_put_pd(ps->pd);
>>> 						// pd is freed
>>> // loop again, but pd is freed
>>> // call padata_find_next, UAF
>>> }
>>
>> Thanks for the fix and clear explanation.
>>
>>> diff --git a/kernel/padata.c b/kernel/padata.c
>>> index 5d8e18cdcb25..627014825266 100644
>>> --- a/kernel/padata.c
>>> +++ b/kernel/padata.c
>>> @@ -319,6 +319,7 @@ static void padata_reorder(struct parallel_data *pd)
>>>  	if (!spin_trylock_bh(&pd->lock))
>>>  		return;
>>>  
>>> +	padata_get_pd(pd);
>>>  	while (1) {
>>>  		padata = padata_find_next(pd, true);
>>>  
>>> @@ -355,6 +356,7 @@ static void padata_reorder(struct parallel_data *pd)
>>>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
>>>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
>>>  		queue_work(pinst->serial_wq, &pd->reorder_work);
>>> +	padata_put_pd(pd);
>>
>> Putting the ref unconditionally here doesn't cover the case where reorder_work
>> is queued and accesses the freed pd.
>>
>> The review of patches 3-5 from this series has a potential solution for
>> this that also keeps some of these refcount operations out of the fast
>> path:
>>
>>     https://lore.kernel.org/all/20221019083708.27138-1-nstange@suse.de/
>>
> 
> Thank you for your review.
> 
> IIUC, patches 3-5 from this series aim to fix two issue.
> 1. Avoid UAF for pd(the patch 3).
> 2. Avoid UAF for ps(the patch 4-5).
> What my patch 2 intends to fix is the issue 1.
> 
> Let's focus on issue 1.
> As shown bellow, if reorder_work is queued, the refcnt must greater than
> 0, since its serial work have not be finished yet. Do your agree with that?
> 
> pcrypt_aead_encrypt/pcrypt_aead_decrypt
> padata_do_parallel 			// refcount_inc(&pd->refcnt);
> padata_parallel_worker	
> padata->parallel(padata);
> padata_do_serial(padata);		
> // pd->reorder_list 			// enque reorder_list
> padata_reorder
>  - case1:squeue->work
> 	padata_serial_worker		// sub refcnt cnt
>  - case2:pd->reorder_work		// reorder->list is not empty
> 	invoke_padata_reorder 		// this means refcnt > 0
> 	...
> 	padata_serial_worker
> 
> I think the patch 3(from Nicolai Stange) can also avoid UAF for pd, but
> it's complicated. IIUC, the issue 1 can only occur in the scenario what
> I mentioned is my commit message. How I fix issue 1 is by adding and
> putting the refcnt in the padata_reorder function, which is simple and
> clear.
> 
> If understand something uncorrectly, please let me know.
> 
> As the issue 2, I have not encountered it, but it exists in theory.
> 
> Thanks,
> Ridong

Hi Daniel, I am trying to reproduce issue 2; however, I have failed so far.
I added 'mdelay' as a helper:

 static void padata_reorder(struct parallel_data *pd)
 {
+       mdelay(10);
        struct padata_instance *pinst = pd->ps->pinst;
        int cb_cpu;
        struct padata_priv *padata;

I believe this increases the probability of issue 2, but after testing
with pcrypt_aead01, issue 2 could not be reproduced, and I don't know
whether it still exists.

Looking forward to your reply.

Best regards,
Ridong
Daniel Jordan Dec. 10, 2024, 7:12 p.m. UTC | #4
Hi Ridong,

On Fri, Dec 06, 2024 at 11:48:36AM +0800, chenridong wrote:
> On 2024/12/6 7:01, Daniel Jordan wrote:
> > On Sat, Nov 23, 2024 at 08:05:09AM +0000, Chen Ridong wrote:
> >> diff --git a/kernel/padata.c b/kernel/padata.c
> >> index 5d8e18cdcb25..627014825266 100644
> >> --- a/kernel/padata.c
> >> +++ b/kernel/padata.c
> >> @@ -319,6 +319,7 @@ static void padata_reorder(struct parallel_data *pd)
> >>  	if (!spin_trylock_bh(&pd->lock))
> >>  		return;
> >>  
> >> +	padata_get_pd(pd);
> >>  	while (1) {
> >>  		padata = padata_find_next(pd, true);
> >>  
> >> @@ -355,6 +356,7 @@ static void padata_reorder(struct parallel_data *pd)
> >>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
> >>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
> >>  		queue_work(pinst->serial_wq, &pd->reorder_work);
> >> +	padata_put_pd(pd);
> > 
> > Putting the ref unconditionally here doesn't cover the case where reorder_work
> > is queued and accesses the freed pd.
> > 
> > The review of patches 3-5 from this series has a potential solution for
> > this that also keeps some of these refcount operations out of the fast
> > path:
> > 
> >     https://lore.kernel.org/all/20221019083708.27138-1-nstange@suse.de/
> > 
> 
> Thank you for your review.
> 
> IIUC, patches 3-5 from this series aim to fix two issue.
> 1. Avoid UAF for pd(the patch 3).
> 2. Avoid UAF for ps(the patch 4-5).
> What my patch 2 intends to fix is the issue 1.
> 
> Let's focus on issue 1.
> As shown bellow, if reorder_work is queued, the refcnt must greater than
> 0, since its serial work have not be finished yet. Do your agree with that?

I think it's possible for reorder_work to be queued even though all
serial works have finished:

 - padata_reorder finds the reorder list nonempty and sees an object from
   padata_find_next, then gets preempted
 - the serial work finishes in another context
 - back in padata_reorder, reorder_work is queued

Not sure this race could actually happen in practice though.

But, I also think reorder_work can be queued when there's an unfinished
serial work, as you say, but with UAF still happening:

padata_do_serial
  ...
  padata_reorder
    // processes all remaining
    // requests then breaks
    while (1) {
      if (!padata)
        break;
      ...
    }
  
                                  padata_do_serial
                                    // new request added
                                    list_add
    // sees the new request
    queue_work(reorder_work)
                                    padata_reorder
                                      queue_work_on(squeue->work)


   
                                  <kworker context>
                                  padata_serial_worker
                                    // completes new request,
                                    // no more outstanding
                                    // requests
                                                                      crypto_del_alg
                                                                        // free pd
<kworker context>
invoke_padata_reorder
  // UAF of pd

> pcrypt_aead_encrypt/pcrypt_aead_decrypt
> padata_do_parallel 			// refcount_inc(&pd->refcnt);
> padata_parallel_worker	
> padata->parallel(padata);
> padata_do_serial(padata);		
> // pd->reorder_list 			// enque reorder_list
> padata_reorder
>  - case1:squeue->work
> 	padata_serial_worker		// sub refcnt cnt
>  - case2:pd->reorder_work		// reorder->list is not empty
> 	invoke_padata_reorder 		// this means refcnt > 0
> 	...
> 	padata_serial_worker

In other words, in case2 above, reorder_work could be queued, another
context could complete the request in padata_serial_worker, and then
invoke_padata_reorder could run and UAF when there aren't any remaining
serial works.

> I think the patch 3(from Nicolai Stange) can also avoid UAF for pd, but
> it's complicated.

For fixing the issue you describe, without regard for the reorder work,
I think the synchronize_rcu from near the end of the patch 3 thread is
enough.  A synchronize_rcu in the slow path seems better than two
atomics in the fast path.
Daniel Jordan Dec. 10, 2024, 7:16 p.m. UTC | #5
On Tue, Dec 10, 2024 at 05:38:51PM +0800, Chen Ridong wrote:
> On 2024/12/6 11:48, chenridong wrote:
> > On 2024/12/6 7:01, Daniel Jordan wrote:
> >> On Sat, Nov 23, 2024 at 08:05:09AM +0000, Chen Ridong wrote:
> > IIUC, patches 3-5 from this series aim to fix two issue.
> > 1. Avoid UAF for pd(the patch 3).
> > 2. Avoid UAF for ps(the patch 4-5).
> > What my patch 2 intends to fix is the issue 1.
...
> Hi, Daniel, I am trying to produce the issue 2. However,I failed.

Thanks for trying!

> I added 'mdelay' as helper.
> 
>  static void padata_reorder(struct parallel_data *pd)
>  {
> +       mdelay(10);
>         struct padata_instance *pinst = pd->ps->pinst;
>         int cb_cpu;
>         struct padata_priv *padata;
> 
> I believe this can increase the probability of issue 2. But after
> testing with pcrypt_aead01, issue 2 cannot be reproduced.
> And I don't know whether it exists now.

Yeah, no reports of issue 2 that I've seen.
Chen Ridong Dec. 23, 2024, 9 a.m. UTC | #6
On 2024/12/11 3:12, Daniel Jordan wrote:
> Hi Ridong,
> 
> On Fri, Dec 06, 2024 at 11:48:36AM +0800, chenridong wrote:
>> On 2024/12/6 7:01, Daniel Jordan wrote:
>>> On Sat, Nov 23, 2024 at 08:05:09AM +0000, Chen Ridong wrote:
>>>> diff --git a/kernel/padata.c b/kernel/padata.c
>>>> index 5d8e18cdcb25..627014825266 100644
>>>> --- a/kernel/padata.c
>>>> +++ b/kernel/padata.c
>>>> @@ -319,6 +319,7 @@ static void padata_reorder(struct parallel_data *pd)
>>>>  	if (!spin_trylock_bh(&pd->lock))
>>>>  		return;
>>>>  
>>>> +	padata_get_pd(pd);
>>>>  	while (1) {
>>>>  		padata = padata_find_next(pd, true);
>>>>  
>>>> @@ -355,6 +356,7 @@ static void padata_reorder(struct parallel_data *pd)
>>>>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
>>>>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
>>>>  		queue_work(pinst->serial_wq, &pd->reorder_work);
>>>> +	padata_put_pd(pd);
>>>
>>> Putting the ref unconditionally here doesn't cover the case where reorder_work
>>> is queued and accesses the freed pd.
>>>
>>> The review of patches 3-5 from this series has a potential solution for
>>> this that also keeps some of these refcount operations out of the fast
>>> path:
>>>
>>>     https://lore.kernel.org/all/20221019083708.27138-1-nstange@suse.de/
>>>
>>
>> Thank you for your review.
>>
>> IIUC, patches 3-5 from this series aim to fix two issue.
>> 1. Avoid UAF for pd(the patch 3).
>> 2. Avoid UAF for ps(the patch 4-5).
>> What my patch 2 intends to fix is the issue 1.
>>
>> Let's focus on issue 1.
>> As shown bellow, if reorder_work is queued, the refcnt must greater than
>> 0, since its serial work have not be finished yet. Do your agree with that?
> 
> I think it's possible for reorder_work to be queued even though all
> serial works have finished:
> 
>  - padata_reorder finds the reorder list nonempty and sees an object from
>    padata_find_next, then gets preempted
>  - the serial work finishes in another context
>  - back in padata_reorder, reorder_work is queued
> 
> Not sure this race could actually happen in practice though.
> 
> But, I also think reorder_work can be queued when there's an unfinished
> serial work, as you say, but with UAF still happening:
> 
> padata_do_serial
>   ...
>   padata_reorder
>     // processes all remaining
>     // requests then breaks
>     while (1) {
>       if (!padata)
>         break;
>       ...
>     }
>   
>                                   padata_do_serial
>                                     // new request added
>                                     list_add
>     // sees the new request
>     queue_work(reorder_work)
>                                     padata_reorder
>                                       queue_work_on(squeue->work)
> 
> 
>    
>                                   <kworker context>
>                                   padata_serial_worker
>                                     // completes new request,
>                                     // no more outstanding
>                                     // requests
>                                                                       crypto_del_alg
>                                                                         // free pd
> <kworker context>
> invoke_padata_reorder
>   // UAF of pd
> 

Sorry for being busy with other work for a while.
Thank you for your patience.
In theory, it does exist. Although I was unable to reproduce it (I added
a delay helper as below), I noticed that Herbert has reported a UAF issue
occurring in the padata_parallel_worker function. Therefore, it would be
better to fix it with Nicolai's approach.

static void padata_parallel_worker(struct work_struct *parallel_work)
 {
+       mdelay(10);
+

Hi Nicolai, would you resend patch 3 to fix this issue?
I noticed you sent the patch 2 years ago, but this series has not been
merged.

Or may I send a patch that aligns with your approach to resolve it?
Looking forward to your feedback.


>> pcrypt_aead_encrypt/pcrypt_aead_decrypt
>> padata_do_parallel 			// refcount_inc(&pd->refcnt);
>> padata_parallel_worker	
>> padata->parallel(padata);
>> padata_do_serial(padata);		
>> // pd->reorder_list 			// enque reorder_list
>> padata_reorder
>>  - case1:squeue->work
>> 	padata_serial_worker		// sub refcnt cnt
>>  - case2:pd->reorder_work		// reorder->list is not empty
>> 	invoke_padata_reorder 		// this means refcnt > 0
>> 	...
>> 	padata_serial_worker
> 
> In other words, in case2 above, reorder_work could be queued, another
> context could complete the request in padata_serial_worker, and then
> invoke_padata_reorder could run and UAF when there aren't any remaining
> serial works.
> 
>> I think the patch 3(from Nicolai Stange) can also avoid UAF for pd, but
>> it's complicated.
> 
> For fixing the issue you describe, without regard for the reorder work,
> I think the synchronize_rcu from near the end of the patch 3 thread is
> enough.  A synchronize_rcu in the slow path seems better than two
> atomics in the fast path.

Thank you. I tested with 'synchronize_rcu', and it fixes the issue I
encountered. As I mentioned, Herbert has provided another stack trace,
which indicates that case 2 exists. I think it would be better to fix it
as patch 3 did.

Thanks,
Ridong

Patch

diff --git a/kernel/padata.c b/kernel/padata.c
index 5d8e18cdcb25..627014825266 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -319,6 +319,7 @@  static void padata_reorder(struct parallel_data *pd)
 	if (!spin_trylock_bh(&pd->lock))
 		return;
 
+	padata_get_pd(pd);
 	while (1) {
 		padata = padata_find_next(pd, true);
 
@@ -355,6 +356,7 @@  static void padata_reorder(struct parallel_data *pd)
 	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
 	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
 		queue_work(pinst->serial_wq, &pd->reorder_work);
+	padata_put_pd(pd);
 }
 
 static void invoke_padata_reorder(struct work_struct *work)