diff mbox series

[3/5] padata: grab parallel_data refcnt for reorder

Message ID 20221019083708.27138-4-nstange@suse.de
State New
Headers show
Series padata: fix lifetime issues after ->serial() has completed

Commit Message

Nicolai Stange Oct. 19, 2022, 8:37 a.m. UTC
On entry of padata_do_serial(), the in-flight padata_priv owns a reference
to the associated parallel_data instance.

However, as soon as the padata_priv has been enqueued on the reorder list,
it can be completed from a different context, causing the reference to be
released in the process.

This would potentially cause UAFs from the subsequent padata_reorder()
operations invoked from the enqueueing padata_do_serial() or from the
reorder work.

Note that this is a purely theoretical concern; the problem has never
actually been observed -- it would require multiple pcrypt request
submissions racing against each other and, ultimately, a pcrypt instance
destruction (DELALG) shortly after the request completions, as well as
unfortunate timing.

However, for the sake of correctness, it is still worth fixing.
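
An illustrative interleaving of the race described above (the task labels
and timing are hypothetical, for exposition only):

	/* task A: padata_do_serial(padata)
	 *           enqueues padata on the reorder list
	 * task B: padata_reorder() dequeues padata, completes it via
	 *           ->serial(), and the padata_priv's reference on pd gets
	 *           dropped; with a racing instance destruction (DELALG),
	 *           this can be the last reference and pd gets freed
	 * task A: padata_reorder(pd)   <-- potential use-after-free
	 */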

Make padata_do_serial() grab a reference count on the parallel_data for
the subsequent reorder operation(s). As long as the padata_priv has not
been enqueued, this is safe because, as mentioned above, in-flight
padata_privs already own a reference.

Note that padata_reorder() might schedule another padata_reorder() work
item and thus, care must be taken not to prematurely release that "reorder
refcount" from padata_do_serial() again in case that has happened.
Make padata_reorder() return a bool indicating whether or not a reorder
work item has been scheduled. Let padata_do_serial() drop its refcount only
if this is not the case. Accordingly, make the reorder work handler,
invoke_padata_reorder(), drop it as appropriate.
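
For reference, a minimal sketch (not part of this patch) of the refcount
helpers relied upon here; padata_get_pd()/padata_put_pd() come from earlier
patches in this series and are assumed to wrap parallel_data::refcnt roughly
like this:

	/* Hypothetical sketch: assumes parallel_data carries a refcount_t
	 * refcnt and that padata_free_pd() releases the parallel_data once
	 * the last reference is dropped.
	 */
	static inline void padata_get_pd(struct parallel_data *pd)
	{
		refcount_inc(&pd->refcnt);
	}

	static inline void padata_put_pd(struct parallel_data *pd)
	{
		if (refcount_dec_and_test(&pd->refcnt))
			padata_free_pd(pd);
	}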

A remark on the commit chosen for the Fixes tag reference below: before
commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock by using per-instance
padata queues"), the padata_parallel lifetime had been tied to the
padata_instance. The padata_free() resp. padata_stop() issued a
synchronize_rcu() before padata_free_pd() from the instance destruction
path, rendering UAFs from the padata_do_serial()=>padata_reorder()
invocations with BHs disabled impossible AFAICS. With that, the
padata_reorder() work remains to be considered. Before
commit b128a3040935 ("padata: allocate workqueue internally"), the
workqueue got destroyed (from pcrypt), hence drained, before the padata
instance destruction, but this change moved that to after the
padata_free_pd() invocation from __padata_free(). So, while the Fixes
reference below is most likely technically correct, I would still like
to reiterate that this problem is probably hard to trigger in practice,
even more so before commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock
by using per-instance padata queues").

Fixes: b128a3040935 ("padata: allocate workqueue internally")
Signed-off-by: Nicolai Stange <nstange@suse.de>
---
 kernel/padata.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

Comments

Daniel Jordan Oct. 28, 2022, 4:04 p.m. UTC | #1
On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
> to the associated parallel_data instance.
> 
> However, as soon as the padata_priv got enqueued on the reorder list, it
> can be completed from a different context, causing the reference to get
> released in the course.
> 
> This would potentially cause UAFs from the subsequent padata_reorder()
> operations invoked from the enqueueing padata_do_serial() or from the
> reorder work.
> 
> Note that this is a purely theroretical concern, the problem has never been
> actually observed -- it would require multiple pcrypt request submissions
> racing against each other, ultimately a pcrypt instance destruction
> (DELALG) short after request completions as well as unfortunate timing.
> 
> However, for the sake of correctness, it is still worth fixing.
> 
> Make padata_do_serial() grab a reference count on the parallel_data for
> the subsequent reorder operation(s). As long as the padata_priv has not
> been enqueued, this is safe, because as mentioned above, in-flight
> pdata_privs own a reference already.
> 
> Note that padata_reorder() might schedule another padata_reorder() work
> and thus, care must be taken to not prematurely release that "reorder
> refcount" from padata_do_serial() again in case that has happened.
> Make padata_reorder() return a bool for indicating whether or not a
> reorder work has been scheduled. Let padata_do_serial() drop its refcount
> only if this is not the case. Accordingly, make the reorder work handler,
> invoke_padata_reorder(), drop it then as appropriate.
> 
> A remark on the commit chosen for the Fixes tag reference below: before
> commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock by using per-instance
> padata queues"), the padata_parallel lifetime had been tied to the
> padata_instance. The padata_free() resp. padata_stop() issued a
> synchronize_rcu() before padata_free_pd() from the instance destruction
> path, rendering UAFs from the padata_do_serial()=>padata_reorder()
> invocations with BHs disabled impossible AFAICS. With that, the
> padata_reorder() work remains to be considered. Before
> commit b128a3040935 ("padata: allocate workqueue internally"), the
> workqueue got destroyed (from pcrypt), hence drained, before the padata
> instance destruction, but this change moved that to after the
> padata_free_pd() invocation from __padata_free(). So, while the Fixes
> reference from below is most likely technically correct, I would still like
> to reiterate that this problem is probably hard to trigger in practice,
> even more so before commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock
> by using per-instance padata queues").
> 
> Fixes: b128a3040935 ("padata: allocate workqueue internally")
> Signed-off-by: Nicolai Stange <nstange@suse.de>
> ---
>  kernel/padata.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 0bf8c80dad5a..b79226727ef7 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -275,7 +275,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
>  	return padata;
>  }
>  
> -static void padata_reorder(struct parallel_data *pd)
> +static bool padata_reorder(struct parallel_data *pd)
>  {
>  	struct padata_instance *pinst = pd->ps->pinst;
>  	int cb_cpu;
> @@ -294,7 +294,7 @@ static void padata_reorder(struct parallel_data *pd)
>  	 * care for all the objects enqueued during the holdtime of the lock.
>  	 */
>  	if (!spin_trylock_bh(&pd->lock))
> -		return;
> +		return false;
>  
>  	while (1) {
>  		padata = padata_find_next(pd, true);
> @@ -331,17 +331,23 @@ static void padata_reorder(struct parallel_data *pd)
>  
>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
> -		queue_work(pinst->serial_wq, &pd->reorder_work);
> +		return queue_work(pinst->serial_wq, &pd->reorder_work);
> +
> +	return false;
>  }
>  
>  static void invoke_padata_reorder(struct work_struct *work)
>  {
>  	struct parallel_data *pd;
> +	bool keep_refcnt;
>  
>  	local_bh_disable();
>  	pd = container_of(work, struct parallel_data, reorder_work);
> -	padata_reorder(pd);
> +	keep_refcnt = padata_reorder(pd);
>  	local_bh_enable();
> +
> +	if (!keep_refcnt)
> +		padata_put_pd(pd);
>  }
>  
>  static void padata_serial_worker(struct work_struct *serial_work)
> @@ -392,6 +398,15 @@ void padata_do_serial(struct padata_priv *padata)
>  	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
>  	struct padata_priv *cur;
>  
> +	/*
> +	 * The in-flight padata owns a reference on pd. However, as
> +	 * soon as it's been enqueued on the reorder list, another
> +	 * task can dequeue and complete it, thereby dropping the
> +	 * reference. Grab another reference here, it will eventually
> +	 * be released from a reorder work, if any, or below.
> +	 */
> +	padata_get_pd(pd);
> +
>  	spin_lock(&reorder->lock);
>  	/* Sort in ascending order of sequence number. */
>  	list_for_each_entry_reverse(cur, &reorder->list, list)
> @@ -407,7 +422,8 @@ void padata_do_serial(struct padata_priv *padata)
>  	 */
>  	smp_mb();
>  
> -	padata_reorder(pd);
> +	if (!padata_reorder(pd))
> +		padata_put_pd(pd);

do_serial is supposed to be called with BHs disabled and will be in
every case after a fix for a separate issue that I plan to send this
cycle.  Given that it (will soon...) always happen under RCU protection,
part of this issue could be addressed like this, which puts the expense
of dealing with this rare problem in the slow path:

diff --git a/kernel/padata.c b/kernel/padata.c
index 0bf8c80dad5a..cd6740ae6629 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -1110,6 +1110,12 @@ void padata_free_shell(struct padata_shell *ps)
 	if (!ps)
 		return;
 
+	/*
+	 * Wait for all _do_serial calls to finish to avoid touching freed pd's
+	 * and ps's.
+	 */
+	synchronize_rcu();
+
 	mutex_lock(&ps->pinst->lock);
 	list_del(&ps->list);
 	padata_put_pd(rcu_dereference_protected(ps->pd, 1));

pcrypt calls padata_free_shell() when all outstanding transforms (and
thus requests, I think) have been freed/completed, so no new task can
come into padata_reorder.  synchronize_rcu() then flushes out any
remaining _reorder calls.
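
For illustration only (not from the patch): since padata_do_serial() is, or
soon will be, entered with BHs disabled, each such call forms an RCU
read-side critical section, which is what the synchronize_rcu() above waits
for:

	/* Illustrative caller path, assuming the planned fix that makes all
	 * padata_do_serial() calls run with BHs disabled:
	 */
	local_bh_disable();		/* acts as an RCU read-side section */
	padata_do_serial(padata);	/* may dereference pd and ps */
	local_bh_enable();
	/* Only once all such sections have finished can the
	 * synchronize_rcu() in padata_free_shell() return and pd/ps be
	 * freed.
	 */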

This doesn't deal with pending reorder_work items, but we can fix it
later in the series.

What do you think?
Nicolai Stange Nov. 9, 2022, 1:03 p.m. UTC | #2
Daniel Jordan <daniel.m.jordan@oracle.com> writes:

> On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
>> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
>> to the associated parallel_data instance.
>> 
>> However, as soon as the padata_priv got enqueued on the reorder list, it
>> can be completed from a different context, causing the reference to get
>> released in the course.
>> 
>> This would potentially cause UAFs from the subsequent padata_reorder()
>> operations invoked from the enqueueing padata_do_serial() or from the
>> reorder work.
>> 

<snip>

>> ---
>>  kernel/padata.c | 26 +++++++++++++++++++++-----
>>  1 file changed, 21 insertions(+), 5 deletions(-)
>> 
>> diff --git a/kernel/padata.c b/kernel/padata.c
>> index 0bf8c80dad5a..b79226727ef7 100644
>> --- a/kernel/padata.c
>> +++ b/kernel/padata.c
>> @@ -275,7 +275,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
>>  	return padata;
>>  }
>>  
>> -static void padata_reorder(struct parallel_data *pd)
>> +static bool padata_reorder(struct parallel_data *pd)
>>  {
>>  	struct padata_instance *pinst = pd->ps->pinst;
>>  	int cb_cpu;
>> @@ -294,7 +294,7 @@ static void padata_reorder(struct parallel_data *pd)
>>  	 * care for all the objects enqueued during the holdtime of the lock.
>>  	 */
>>  	if (!spin_trylock_bh(&pd->lock))
>> -		return;
>> +		return false;
>>  
>>  	while (1) {
>>  		padata = padata_find_next(pd, true);
>> @@ -331,17 +331,23 @@ static void padata_reorder(struct parallel_data *pd)
>>  
>>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
>>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
>> -		queue_work(pinst->serial_wq, &pd->reorder_work);
>> +		return queue_work(pinst->serial_wq, &pd->reorder_work);
>> +
>> +	return false;
>>  }
>>  
>>  static void invoke_padata_reorder(struct work_struct *work)
>>  {
>>  	struct parallel_data *pd;
>> +	bool keep_refcnt;
>>  
>>  	local_bh_disable();
>>  	pd = container_of(work, struct parallel_data, reorder_work);
>> -	padata_reorder(pd);
>> +	keep_refcnt = padata_reorder(pd);
>>  	local_bh_enable();
>> +
>> +	if (!keep_refcnt)
>> +		padata_put_pd(pd);
>>  }
>>  
>>  static void padata_serial_worker(struct work_struct *serial_work)
>> @@ -392,6 +398,15 @@ void padata_do_serial(struct padata_priv *padata)
>>  	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
>>  	struct padata_priv *cur;
>>  
>> +	/*
>> +	 * The in-flight padata owns a reference on pd. However, as
>> +	 * soon as it's been enqueued on the reorder list, another
>> +	 * task can dequeue and complete it, thereby dropping the
>> +	 * reference. Grab another reference here, it will eventually
>> +	 * be released from a reorder work, if any, or below.
>> +	 */
>> +	padata_get_pd(pd);
>> +
>>  	spin_lock(&reorder->lock);
>>  	/* Sort in ascending order of sequence number. */
>>  	list_for_each_entry_reverse(cur, &reorder->list, list)
>> @@ -407,7 +422,8 @@ void padata_do_serial(struct padata_priv *padata)
>>  	 */
>>  	smp_mb();
>>  
>> -	padata_reorder(pd);
>> +	if (!padata_reorder(pd))
>> +		padata_put_pd(pd);
>
> do_serial is supposed to be called with BHs disabled and will be in
> every case after a fix for a separate issue that I plan to send this
> cycle.  Given that it (will soon...) always happen under RCU protection,
> part of this issue could be addressed like this, which puts the expense
> of dealing with this rare problem in the slow path:
>
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 0bf8c80dad5a..cd6740ae6629 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -1110,6 +1110,12 @@ void padata_free_shell(struct padata_shell *ps)
>  	if (!ps)
>  		return;
>  
> +	/*
> +	 * Wait for all _do_serial calls to finish to avoid touching freed pd's
> +	 * and ps's.
> +	 */
> +	synchronize_rcu();
> +
>  	mutex_lock(&ps->pinst->lock);
>  	list_del(&ps->list);
>  	padata_put_pd(rcu_dereference_protected(ps->pd, 1));
>
> pcrypt calls padata_free_shell() when all outstanding transforms (and
> thus requests, I think) have been freed/completed, so no new task can
> come into padata_reorder.  synchronize_rcu() then flushes out any
> remaining _reorder calls.
>
> This doesn't deal with pending reorder_work items, but we can fix it
> later in the series.
>
> What do you think?

Yes, I think that would work. Will you handle it alongside that fix for
a separate issue you mentioned above? Or shall I once this fix has
landed?

Thanks!

Nicolai
Daniel Jordan Nov. 10, 2022, 11:16 p.m. UTC | #3
On Wed, Nov 09, 2022 at 02:03:15PM +0100, Nicolai Stange wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
> 
> > On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
> >> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
> >> to the associated parallel_data instance.
> >> 
> >> However, as soon as the padata_priv got enqueued on the reorder list, it
> >> can be completed from a different context, causing the reference to get
> >> released in the course.
> >> 
> >> This would potentially cause UAFs from the subsequent padata_reorder()
> >> operations invoked from the enqueueing padata_do_serial() or from the
> >> reorder work.
> >> 
> 
> <snip>
> 
> >> ---
> >>  kernel/padata.c | 26 +++++++++++++++++++++-----
> >>  1 file changed, 21 insertions(+), 5 deletions(-)
> >> 
> >> diff --git a/kernel/padata.c b/kernel/padata.c
> >> index 0bf8c80dad5a..b79226727ef7 100644
> >> --- a/kernel/padata.c
> >> +++ b/kernel/padata.c
> >> @@ -275,7 +275,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
> >>  	return padata;
> >>  }
> >>  
> >> -static void padata_reorder(struct parallel_data *pd)
> >> +static bool padata_reorder(struct parallel_data *pd)
> >>  {
> >>  	struct padata_instance *pinst = pd->ps->pinst;
> >>  	int cb_cpu;
> >> @@ -294,7 +294,7 @@ static void padata_reorder(struct parallel_data *pd)
> >>  	 * care for all the objects enqueued during the holdtime of the lock.
> >>  	 */
> >>  	if (!spin_trylock_bh(&pd->lock))
> >> -		return;
> >> +		return false;
> >>  
> >>  	while (1) {
> >>  		padata = padata_find_next(pd, true);
> >> @@ -331,17 +331,23 @@ static void padata_reorder(struct parallel_data *pd)
> >>  
> >>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
> >>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
> >> -		queue_work(pinst->serial_wq, &pd->reorder_work);
> >> +		return queue_work(pinst->serial_wq, &pd->reorder_work);
> >> +
> >> +	return false;
> >>  }
> >>  
> >>  static void invoke_padata_reorder(struct work_struct *work)
> >>  {
> >>  	struct parallel_data *pd;
> >> +	bool keep_refcnt;
> >>  
> >>  	local_bh_disable();
> >>  	pd = container_of(work, struct parallel_data, reorder_work);
> >> -	padata_reorder(pd);
> >> +	keep_refcnt = padata_reorder(pd);
> >>  	local_bh_enable();
> >> +
> >> +	if (!keep_refcnt)
> >> +		padata_put_pd(pd);
> >>  }
> >>  
> >>  static void padata_serial_worker(struct work_struct *serial_work)
> >> @@ -392,6 +398,15 @@ void padata_do_serial(struct padata_priv *padata)
> >>  	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
> >>  	struct padata_priv *cur;
> >>  
> >> +	/*
> >> +	 * The in-flight padata owns a reference on pd. However, as
> >> +	 * soon as it's been enqueued on the reorder list, another
> >> +	 * task can dequeue and complete it, thereby dropping the
> >> +	 * reference. Grab another reference here, it will eventually
> >> +	 * be released from a reorder work, if any, or below.
> >> +	 */
> >> +	padata_get_pd(pd);
> >> +
> >>  	spin_lock(&reorder->lock);
> >>  	/* Sort in ascending order of sequence number. */
> >>  	list_for_each_entry_reverse(cur, &reorder->list, list)
> >> @@ -407,7 +422,8 @@ void padata_do_serial(struct padata_priv *padata)
> >>  	 */
> >>  	smp_mb();
> >>  
> >> -	padata_reorder(pd);
> >> +	if (!padata_reorder(pd))
> >> +		padata_put_pd(pd);
> >
> > do_serial is supposed to be called with BHs disabled and will be in
> > every case after a fix for a separate issue that I plan to send this
> > cycle.  Given that it (will soon...) always happen under RCU protection,
> > part of this issue could be addressed like this, which puts the expense
> > of dealing with this rare problem in the slow path:
> >
> > diff --git a/kernel/padata.c b/kernel/padata.c
> > index 0bf8c80dad5a..cd6740ae6629 100644
> > --- a/kernel/padata.c
> > +++ b/kernel/padata.c
> > @@ -1110,6 +1110,12 @@ void padata_free_shell(struct padata_shell *ps)
> >  	if (!ps)
> >  		return;
> >  
> > +	/*
> > +	 * Wait for all _do_serial calls to finish to avoid touching freed pd's
> > +	 * and ps's.
> > +	 */
> > +	synchronize_rcu();
> > +
> >  	mutex_lock(&ps->pinst->lock);
> >  	list_del(&ps->list);
> >  	padata_put_pd(rcu_dereference_protected(ps->pd, 1));
> >
> > pcrypt calls padata_free_shell() when all outstanding transforms (and
> > thus requests, I think) have been freed/completed, so no new task can
> > come into padata_reorder.  synchronize_rcu() then flushes out any
> > remaining _reorder calls.
> >
> > This doesn't deal with pending reorder_work items, but we can fix it
> > later in the series.
> >
> > What do you think?
> 
> Yes, I think that would work. Will you handle it alongside that fix for
> a separate issue you mentioned above? Or shall I once this fix has
> landed?

Please go ahead and do it yourself.  I'll send mine soon, I think it
should be ok for them to go in in the same cycle.
Herbert Xu Sept. 11, 2024, 10:29 a.m. UTC | #4
On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
> to the associated parallel_data instance.
> 
> However, as soon as the padata_priv got enqueued on the reorder list, it
> can be completed from a different context, causing the reference to get
> released in the course.
> 
> This would potentially cause UAFs from the subsequent padata_reorder()
> operations invoked from the enqueueing padata_do_serial() or from the
> reorder work.

I've just received yet another report of a UAF in padata on s390
while running the pcrypt_aead01 LTP test.  The kernel in question
contains a version of padata identical to upstream's.  Perhaps it's
time to revisit the lifetime issues in padata?

[ 9463.217163] ==================================================================
 [ 9463.217339] BUG: KASAN: slab-use-after-free in work_is_static_object+0x2a/0x68
 [ 9463.217678] Read of size 8 at addr 0000000098917c38 by task kworker/u4:2/59
 [ 9463.217798]
 [ 9463.217804] CPU: 0 PID: 59 Comm: kworker/u4:2 Kdump: loaded Not tainted 5.14.0-454.el9.s390x+debug #1
 [ 9463.218051] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
 [ 9463.218058] Workqueue: pdecrypt_parallel padata_parallel_worker
 [ 9463.218119] Call Trace:
 [ 9463.218123]  [<0000000001cbe122>] dump_stack_lvl+0xfa/0x150
 [ 9463.218226]  [<0000000001cada10>] print_address_description.constprop.0+0x180/0x430
 [ 9463.218236]  [<0000000001caddc2>] print_report+0x102/0x1f0
 [ 9463.218243]  [<00000000009c075a>] kasan_report+0xb2/0x1f8
 [ 9463.218251]  [<00000000009c1af2>] kasan_check_range+0x17a/0x1c0
 [ 9463.218302]  [<0000000000235b62>] work_is_static_object+0x2a/0x68
 [ 9463.218308]  [<0000000001002036>] lookup_object_or_alloc.part.0+0x12e/0x1a8
 [ 9463.218373]  [<0000000001002da2>] debug_object_activate+0x20a/0x450
 [ 9463.218381]  [<0000000000240456>] __queue_work+0x646/0x1128
 [ 9463.218443]  [<0000000000241080>] queue_work_on+0x148/0x150
 [ 9463.218506]  [<00000000007d707e>] padata_parallel_worker+0x76/0xc8
 [ 9463.218988]  [<0000000000242040>] process_one_work+0x978/0x1748
 [ 9463.218994]  [<00000000002433b4>] worker_thread+0x5a4/0xf58
 [ 9463.219000]  [<000000000025d60c>] kthread+0x2a4/0x358
 [ 9463.219008]  [<000000000010601a>] __ret_from_fork+0x8a/0xe8
 [ 9463.219016]  [<0000000001cf343a>] ret_from_fork+0xa/0x30
 [ 9463.219086] 5 locks held by kworker/u4:2/59:
 [ 9463.219143]  #0: 00000000ef82a948 ((wq_completion)pdecrypt_parallel){+.+.}-{0:0}, at: process_one_work+0x830/0x1748
 [ 9463.219204]  #1: 001bff8000b0fcf8 ((work_completion)(&pw->pw_work)){+.+.}-{0:0}, at: process_one_work+0x830/0x1748
 [ 9463.219215]  #2: 0000000002a93160 (rcu_read_lock){....}-{1:2}, at: __queue_work+0xa0/0x1128
 [ 9463.219337]  #3: 000000017fa0f618 (&pool->lock){-.-.}-{2:2}, at: __queue_work+0x314/0x1128
 [ 9463.219399]  #4: 0000000005d7a0c0 (&obj_hash[i].lock){-.-.}-{2:2}, at: debug_object_activate+0x166/0x450
 [ 9463.219446]
 [ 9463.219448] Allocated by task 397970:
 [ 9463.219452]  kasan_save_stack+0x34/0x58
 [ 9463.219457]  kasan_set_track+0x36/0x48
 [ 9463.219460]  __kasan_kmalloc+0xaa/0xc0
 [ 9463.219463]  padata_alloc_pd+0x70/0xbc8
 [ 9463.219466]  padata_alloc_shell+0x8a/0x208
 [ 9463.219469]  pcrypt_create_aead.isra.0+0xba/0x698 [pcrypt]
 [ 9463.219535]  cryptomgr_probe+0x9e/0x1d8
 [ 9463.219578]  kthread+0x2a4/0x358
 [ 9463.219582]  __ret_from_fork+0x8a/0xe8
 [ 9463.219586]  ret_from_fork+0xa/0x30
 [ 9463.219589]
 [ 9463.219591] Freed by task 392422:
 [ 9463.219594]  kasan_save_stack+0x34/0x58
 [ 9463.219598]  kasan_set_track+0x36/0x48
 [ 9463.219645]  kasan_save_free_info+0x46/0x68
 [ 9463.219651]  __kasan_slab_free+0x102/0x178
 [ 9463.219656]  slab_free_freelist_hook+0x148/0x228
 [ 9463.219685]  __kmem_cache_free+0x92/0x2b8
 [ 9463.219690]  padata_free_shell+0x242/0x2f8
 [ 9463.219697]  pcrypt_free+0x50/0xa0 [pcrypt]
 [ 9463.219703]  crypto_destroy_instance+0x86/0xe0
 [ 9463.219709]  crypto_mod_put+0xa4/0xf0
 [ 9463.219714]  crypto_del_alg+0xc8/0x138
 [ 9463.219720]  crypto_user_rcv_msg+0x39a/0x490
 [ 9463.219725]  netlink_rcv_skb+0x21a/0x4a0
 [ 9463.219773]  crypto_netlink_rcv+0x40/0x58
 [ 9463.219781]  netlink_unicast+0x46a/0x6e0
 [ 9463.219787]  netlink_sendmsg+0x75c/0xbe8
 [ 9463.219835]  __sock_sendmsg+0xca/0x170
 [ 9463.219843]  ____sys_sendmsg+0x5d8/0x730
 [ 9463.219848]  ___sys_sendmsg+0x11e/0x178
 [ 9463.219855]  __sys_sendmsg+0xda/0x150
 [ 9463.219915]  __do_sys_socketcall+0x3e2/0x480
 [ 9463.220092]  do_syscall+0x22c/0x328
 [ 9463.220097]  __do_syscall+0xce/0xf0
 [ 9463.220103]  system_call+0x70/0x98
 [ 9463.220109]
 [ 9463.220148] Last potentially related work creation:
 [ 9463.220152]  kasan_save_stack+0x34/0x58
 [ 9463.220157]  __kasan_record_aux_stack+0xba/0xd0
 [ 9463.221000]  __call_rcu_common.constprop.0+0x104/0x998
 [ 9463.221070]  __key_link+0x124/0x320
 [ 9463.221380]  construct_alloc_key+0x53a/0x618
 [ 9463.221386]  construct_key_and_link+0x196/0x3a8
 [ 9463.221506]  request_key_and_link+0x38a/0x3e0
 [ 9463.221512]  __do_sys_request_key+0x164/0x268
 [ 9463.221518]  do_syscall+0x22c/0x328
 [ 9463.221617]  __do_syscall+0xce/0xf0
 [ 9463.221622]  system_call+0x70/0x98
 [ 9463.221627]
 [ 9463.221630] Second to last potentially related work creation:
 [ 9463.221634]  kasan_save_stack+0x34/0x58
 [ 9463.221638]  __kasan_record_aux_stack+0xba/0xd0
 [ 9463.221642]  __call_rcu_common.constprop.0+0x104/0x998
 [ 9463.221743]  assoc_array_gc+0xde6/0x10f8
 [ 9463.221747]  keyring_gc+0x170/0x210
 [ 9463.221751]  key_garbage_collector+0x598/0x760
 [ 9463.221756]  process_one_work+0x978/0x1748
 [ 9463.221759]  worker_thread+0x71a/0xf58
 [ 9463.221763]  kthread+0x2a4/0x358
 [ 9463.221767]  __ret_from_fork+0x8a/0xe8
 [ 9463.221770]  ret_from_fork+0xa/0x30
 [ 9463.221774]
 [ 9463.221836] The buggy address belongs to the object at 0000000098917c00
 [ 9463.221836]  which belongs to the cache kmalloc-512 of size 512
 [ 9463.221887] The buggy address is located 56 bytes inside of
 [ 9463.221887]  freed 512-byte region [0000000098917c00, 0000000098917e00)
 [ 9463.221891]
 [ 9463.221894] The buggy address belongs to the physical page:
 [ 9463.221898] page:00000000e2f31a4b refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x98914
 [ 9463.222571] head:00000000e2f31a4b order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0
 [ 9463.222577] flags: 0x2000000000010200(slab|head|node=0|zone=1)
 [ 9463.222585] page_type: 0xffffffff()
 [ 9463.222592] raw: 2000000000010200 0000000080083400 0000000000000000 0000000100000122
 [ 9463.222599] raw: 0000000000000000 0010002000000000 ffffffff00000001 0000000000000000
 [ 9463.222604] page dumped because: kasan: bad access detected
 [ 9463.222648]
 [ 9463.222651] Memory state around the buggy address:
 [ 9463.222660]  0000000098917b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [ 9463.222665]  0000000098917b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 [ 9463.222670] >0000000098917c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 [ 9463.222795]                                         ^
 [ 9463.222838]  0000000098917c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 [ 9463.222843]  0000000098917d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 [ 9463.222849] ==================================================================
 [ 9463.222853] Disabling lock debugging due to kernel taint
 [ 9463.222858] ------------[ cut here ]------------
 [ 9463.222862] ODEBUG: activate not available (active state 0) object: 00000000990ad955 object type: work_struct hint: invoke_padata_reorder+0x0/0x58
 [ 9463.223373] WARNING: CPU: 0 PID: 59 at lib/debugobjects.c:514 debug_print_object+0x1be/0x248
 [ 9463.223549] Modules linked in: pcrypt zstd zstd_compress zsmalloc exfat vfat fat overlay ext4 mbcache jbd2 loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc vfio_ccw mdev vfio_iommu_type1 vfio iommufd drm i2c_core drm_panel_orientation_quirks fuse xfs libcrc32c ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390 virtio_net net_failover failover virtio_blk sha3_256_s390 dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: zram]
 [ 9463.223743] CPU: 0 PID: 59 Comm: kworker/u4:2 Kdump: loaded Tainted: G    B             -------  ---  5.14.0-454.el9.s390x+debug #1
 [ 9463.223747] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
 [ 9463.223748] Workqueue: pdecrypt_parallel padata_parallel_worker
 [ 9463.223751] Krnl PSW : 0404e00180000000 00000000010004aa (debug_print_object+0x1c2/0x248)
 [ 9463.223756]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
 [ 9463.223789] Krnl GPRS: 0000000000000001 001c000000000000 0000000000000086 0000000000000004
 [ 9463.223791]            0000000000000001 00000000006289f8 0000000098917c38 0000000001fccca0
 [ 9463.223793]            00000000007d7dc8 00000000029243e0 0000000001fcc780 001bff8000b0f9c8
 [ 9463.223795]            00000000028e3eb0 0000000000000000 00000000010004a6 001bff8000b0f868
 [ 9463.223804] Krnl Code: 000000000100049a: c020007e5e13       larl    %r2,0000000001fcc0c0
 [ 9463.223804]            00000000010004a0: c0e5ff8ee004       brasl   %r14,00000000001dc4a8
 [ 9463.223804]           #00000000010004a6: af000000           mc      0,0
 [ 9463.223804]           >00000000010004aa: c41d00c06113       lrl     %r1,000000000280c6d0
 [ 9463.223804]            00000000010004b0: a71a0001           ahi     %r1,1
 [ 9463.223804]            00000000010004b4: c41f00c0610e       strl    %r1,000000000280c6d0
 [ 9463.223804]            00000000010004ba: eb6ff0b00004       lmg     %r6,%r15,176(%r15)
 [ 9463.223804]            00000000010004c0: 07fe               bcr     15,%r14
 [ 9463.223818] Call Trace:
 [ 9463.223819]  [<00000000010004aa>] debug_print_object+0x1c2/0x248
 [ 9463.223823] ([<00000000010004a6>] debug_print_object+0x1be/0x248)
 [ 9463.223852]  [<0000000001002f5e>] debug_object_activate+0x3c6/0x450
 [ 9463.223854]  [<0000000000240456>] __queue_work+0x646/0x1128
 [ 9463.223856]  [<0000000000241080>] queue_work_on+0x148/0x150
 [ 9463.223858]  [<00000000007d707e>] padata_parallel_worker+0x76/0xc8
 [ 9463.223861]  [<0000000000242040>] process_one_work+0x978/0x1748
 [ 9463.223864]  [<00000000002433b4>] worker_thread+0x5a4/0xf58
 [ 9463.223866]  [<000000000025d60c>] kthread+0x2a4/0x358
 [ 9463.223868]  [<000000000010601a>] __ret_from_fork+0x8a/0xe8
 [ 9463.223871]  [<0000000001cf343a>] ret_from_fork+0xa/0x30
 [ 9463.223873] INFO: lockdep is turned off.
 [ 9463.223874] Last Breaking-Event-Address:
 [ 9463.223875]  [<00000000001dc64a>] __warn_printk+0x1a2/0x1a8
 [ 9463.223878] irq event stamp: 24602747
 [ 9463.223879] hardirqs last  enabled at (24602746): [<00000000001f4ac0>] __local_bh_enable_ip+0x188/0x348
 [ 9463.223884] hardirqs last disabled at (24602747): [<0000000000241042>] queue_work_on+0x10a/0x150
 [ 9463.223886] softirqs last  enabled at (24602712): [<00000000007d7072>] padata_parallel_worker+0x6a/0xc8
 [ 9463.223917] softirqs last disabled at (24602720): [<00000000007d7042>] padata_parallel_worker+0x3a/0xc8
 [ 9463.223920] ---[ end trace 0000000000000000 ]---
 [ 9463.224330] ------------[ cut here ]------------
 [ 9463.224394] ODEBUG: deactivate not available (active state 0) object: 00000000990ad955 object type: work_struct hint: invoke_padata_reorder+0x0/0x58
 [ 9463.224520] WARNING: CPU: 0 PID: 306462 at lib/debugobjects.c:514 debug_print_object+0x1be/0x248
 [ 9463.224526] Modules linked in: pcrypt zstd zstd_compress zsmalloc exfat vfat fat overlay ext4 mbcache jbd2 loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc vfio_ccw mdev vfio_iommu_type1 vfio iommufd drm i2c_core drm_panel_orientation_quirks fuse xfs libcrc32c ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390 virtio_net net_failover failover virtio_blk sha3_256_s390 dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: zram]
 [ 9463.224781] CPU: 0 PID: 306462 Comm: kworker/0:0 Kdump: loaded Tainted: G    B   W         -------  ---  5.14.0-454.el9.s390x+debug #1
 [ 9463.224897] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
 [ 9463.224898] Workqueue:  0x0 (events)
 [ 9463.224907] Krnl PSW : 0404e00180000000 00000000010004aa (debug_print_object+0x1c2/0x248)
 [ 9463.224949]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
 [ 9463.224953] Krnl GPRS: 0000000000000001 001c000000000000 0000000000000088 0000000000000004
 [ 9463.224956]            0000000000000001 00000000006289f8 0000000098917c38 0000000001fccca0
 [ 9463.224959]            00000000007d7dc8 00000000029243e0 0000000001fcc6c0 001bff8005b6fb58
 [ 9463.224961]            00000000028e3eb0 0000000000000000 00000000010004a6 001bff8005b6fa10
 [ 9463.224968] Krnl Code: 000000000100049a: c020007e5e13       larl    %r2,0000000001fcc0c0
 [ 9463.224968]            00000000010004a0: c0e5ff8ee004       brasl   %r14,00000000001dc4a8
 [ 9463.224968]           #00000000010004a6: af000000           mc      0,0
 [ 9463.224968]           >00000000010004aa: c41d00c06113       lrl     %r1,000000000280c6d0
 [ 9463.224968]            00000000010004b0: a71a0001           ahi     %r1,1
 [ 9463.224968]            00000000010004b4: c41f00c0610e       strl    %r1,000000000280c6d0
 [ 9463.224968]            00000000010004ba: eb6ff0b00004       lmg     %r6,%r15,176(%r15)
 [ 9463.224968]            00000000010004c0: 07fe               bcr     15,%r14
 [ 9463.225011] Call Trace:
 [ 9463.225012]  [<00000000010004aa>] debug_print_object+0x1c2/0x248
 [ 9463.225022] ([<00000000010004a6>] debug_print_object+0x1be/0x248)
 [ 9463.225025]  [<0000000001002546>] debug_object_deactivate+0x1c6/0x2f8
 [ 9463.225029]  [<0000000000241a86>] process_one_work+0x3be/0x1748
 [ 9463.225032]  [<00000000002433b4>] worker_thread+0x5a4/0xf58
 [ 9463.225035]  [<000000000025d60c>] kthread+0x2a4/0x358
 [ 9463.225038]  [<000000000010601a>] __ret_from_fork+0x8a/0xe8
 [ 9463.225101]  [<0000000001cf343a>] ret_from_fork+0xa/0x30
 [ 9463.225294] INFO: lockdep is turned off.
 [ 9463.225357] Last Breaking-Event-Address:
 [ 9463.225357]  [<00000000001dc64a>] __warn_printk+0x1a2/0x1a8
 [ 9463.225543] irq event stamp: 2298862
 [ 9463.225544] hardirqs last  enabled at (2298861): [<0000000001cf2cc2>] _raw_spin_unlock_irq+0x92/0xb0
 [ 9463.225549] hardirqs last disabled at (2298862): [<0000000001cd7e60>] __schedule+0xb20/0x1478
 [ 9463.225585] softirqs last  enabled at (2292304): [<0000000001cf5c12>] __do_softirq+0xaaa/0x1040
 [ 9463.225589] softirqs last disabled at (2291215): [<00000000001f3bf2>] __irq_exit_rcu+0x2fa/0x3b8
 [ 9463.225593] ---[ end trace 0000000000000000 ]---
 [ 9463.225606] Unable to handle kernel pointer dereference in virtual kernel address space
 [ 9463.225847] Failing address: 0005fce7876cc000 TEID: 0005fce7876cc403
 [ 9463.225853] Fault in home space mode while using kernel ASCE.
 [ 9463.225863] AS:0000000003b2800b R2:0000000000000028
 [ 9463.226167] Oops: 003a ilc:2 [#1] SMP
 [ 9463.226174] Modules linked in: pcrypt zstd zstd_compress zsmalloc exfat vfat fat overlay ext4 mbcache jbd2 loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc vfio_ccw mdev vfio_iommu_type1 vfio iommufd drm i2c_core drm_panel_orientation_quirks fuse xfs libcrc32c ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390 virtio_net net_failover failover virtio_blk sha3_256_s390 dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt [last unloaded: zram]
 [ 9463.226373] CPU: 0 PID: 306462 Comm: kworker/0:0 Kdump: loaded Tainted: G    B   W         -------  ---  5.14.0-454.el9.s390x+debug #1
 [ 9463.226376] Hardware name: IBM 8561 LT1 400 (KVM/Linux)
 [ 9463.226378] Workqueue: pdecrypt_serial invoke_padata_reorder
 [ 9463.226381] Krnl PSW : 0704c00180000000 0000000000361e96 (do_raw_spin_trylock+0x46/0x148)
 [ 9463.226427]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
 [ 9463.226430] Krnl GPRS: 0000000000000001 001c000000000000 0005fce700000001 001c000000000000
 [ 9463.226433]            0000000000000000 00000000003583a8 0000000000000001 0005fce7876cc850
 [ 9463.226435]            0000000098917c24 0005fce7876cc868 0005fce7876cc850 0005fce7876cc850
 [ 9463.226438]            00000000028e3eb0 0000000013122f84 001bff8005b6f9b0 001bff8005b6f970
 [ 9463.226445] Krnl Code: 0000000000361e8a: a7740065           brc     7,0000000000361f54
 [ 9463.226445]            0000000000361e8e: 582003ac           l       %r2,940
 [ 9463.226445]           #0000000000361e92: a7180000           lhi     %r1,0
 [ 9463.226445]           >0000000000361e96: ba12b000           cs      %r1,%r2,0(%r11)
 [ 9463.226445]            0000000000361e9a: a7a80001           lhi     %r10,1
 [ 9463.226445]            0000000000361e9e: ec160050007e       cij     %r1,0,6,0000000000361f3e
 [ 9463.226445]            0000000000361ea4: a51c001c           llihh   %r1,28
 [ 9463.226445]            KernelAddressSanitizer initialized

Thanks,
diff mbox series

Patch

diff --git a/kernel/padata.c b/kernel/padata.c
index 0bf8c80dad5a..b79226727ef7 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -275,7 +275,7 @@  static struct padata_priv *padata_find_next(struct parallel_data *pd,
 	return padata;
 }
 
-static void padata_reorder(struct parallel_data *pd)
+static bool padata_reorder(struct parallel_data *pd)
 {
 	struct padata_instance *pinst = pd->ps->pinst;
 	int cb_cpu;
@@ -294,7 +294,7 @@  static void padata_reorder(struct parallel_data *pd)
 	 * care for all the objects enqueued during the holdtime of the lock.
 	 */
 	if (!spin_trylock_bh(&pd->lock))
-		return;
+		return false;
 
 	while (1) {
 		padata = padata_find_next(pd, true);
@@ -331,17 +331,23 @@  static void padata_reorder(struct parallel_data *pd)
 
 	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
 	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
-		queue_work(pinst->serial_wq, &pd->reorder_work);
+		return queue_work(pinst->serial_wq, &pd->reorder_work);
+
+	return false;
 }
 
 static void invoke_padata_reorder(struct work_struct *work)
 {
 	struct parallel_data *pd;
+	bool keep_refcnt;
 
 	local_bh_disable();
 	pd = container_of(work, struct parallel_data, reorder_work);
-	padata_reorder(pd);
+	keep_refcnt = padata_reorder(pd);
 	local_bh_enable();
+
+	if (!keep_refcnt)
+		padata_put_pd(pd);
 }
 
 static void padata_serial_worker(struct work_struct *serial_work)
@@ -392,6 +398,15 @@  void padata_do_serial(struct padata_priv *padata)
 	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
 	struct padata_priv *cur;
 
+	/*
+	 * The in-flight padata owns a reference on pd. However, as
+	 * soon as it's been enqueued on the reorder list, another
+	 * task can dequeue and complete it, thereby dropping the
+	 * reference. Grab another reference here, it will eventually
+	 * be released from a reorder work, if any, or below.
+	 */
+	padata_get_pd(pd);
+
 	spin_lock(&reorder->lock);
 	/* Sort in ascending order of sequence number. */
 	list_for_each_entry_reverse(cur, &reorder->list, list)
@@ -407,7 +422,8 @@  void padata_do_serial(struct padata_priv *padata)
 	 */
 	smp_mb();
 
-	padata_reorder(pd);
+	if (!padata_reorder(pd))
+		padata_put_pd(pd);
 }
 EXPORT_SYMBOL(padata_do_serial);