Message ID | 20221019083708.27138-3-nstange@suse.de |
---|---|
State | New |
Series | padata: fix liftime issues after ->serial() has completed |
On Wed, Oct 19, 2022 at 10:37:05AM +0200, Nicolai Stange wrote:
> On a PREEMPT kernel, the following has been observed while running
> pcrypt_aead01 from LTP:
>
>   [ ] general protection fault: 0000 [#1] PREEMPT_RT SMP PTI
>   <...>
>   [ ] Workqueue: pdecrypt_parallel padata_parallel_worker
>   [ ] RIP: 0010:padata_reorder+0x19/0x120
>   <...>
>   [ ] Call Trace:
>   [ ] padata_parallel_worker+0xa3/0xf0
>   [ ] process_one_work+0x1db/0x4a0
>   [ ] worker_thread+0x2d/0x3c0
>   [ ] ? process_one_work+0x4a0/0x4a0
>   [ ] kthread+0x159/0x180
>   [ ] ? kthread_park+0xb0/0xb0
>   [ ] ret_from_fork+0x35/0x40
>
> The pcrypt_aead01 testcase basically runs a NEWALG/DELALG sequence for some
> fixed pcrypt instance in a loop, back to back.
>
> The problem is that once the last ->serial() in padata_serial_worker() is
> getting invoked, the pcrypt requests from the selftests would signal
> completion, and pcrypt_aead01 can move on and subsequently issue a DELALG.
> Upon pcrypt instance deregistration, the associated padata_shell would get
> destroyed, which in turn would unconditionally free the associated
> parallel_data instance.
>
> If padata_serial_worker() now resumes operation after e.g. having
> previously been preempted upon the return from the last of those ->serial()
> callbacks, its subsequent accesses to pd for managing the ->refcnt will
> all be UAFs. In particular, if the memory backing pd has meanwhile been
> reused for some new parallel_data allocation, e.g in the course of
> processing another subsequent NEWALG request, the padata_serial_worker()
> might find the initial ->refcnt of one and free pd from under that NEWALG
> or the associated selftests respectively, leading to "secondary" UAFs such
> as in the Oops above.
>
> Note that as it currently stands, a padata_shell owns a reference on its
> associated parallel_data already. So fix the UAF in padata_serial_worker()
> by making padata_free_shell() to not unconditionally free the shell's
> associated parallel_data, but to properly drop that reference via
> padata_put_pd() instead.
>
> Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")

It looks like this issue goes back to the first padata commit. For
instance, pd->refcnt goes to zero after the last _priv is serialized,
padata_free is called in another task, and a particularly sluggish
padata_reorder call touches pd after.

So wouldn't it be

Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface")

?

Otherwise,

Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>

> Signed-off-by: Nicolai Stange <nstange@suse.de>
> ---
>  kernel/padata.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 3bd1e23f089b..0bf8c80dad5a 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -1112,7 +1112,7 @@ void padata_free_shell(struct padata_shell *ps)
>
>  	mutex_lock(&ps->pinst->lock);
>  	list_del(&ps->list);
> -	padata_free_pd(rcu_dereference_protected(ps->pd, 1));
> +	padata_put_pd(rcu_dereference_protected(ps->pd, 1));
>  	mutex_unlock(&ps->pinst->lock);
>
>  	kfree(ps);
> --
> 2.37.3
>
On Fri, Oct 28, 2022 at 10:35:49AM -0400, Daniel Jordan wrote:
> On Wed, Oct 19, 2022 at 10:37:05AM +0200, Nicolai Stange wrote:
> > Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")
>
> It looks like this issue goes back to the first padata commit. For
> instance, pd->refcnt goes to zero after the last _priv is serialized,
> padata_free is called in another task, and a particularly sluggish
> padata_reorder call touches pd after.
>
> So wouldn't it be
>
> Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface")
>
> ?

I guess not, by my own logic from 2/5. I think it might be

Fixes: d46a5ac7a7e2 ("padata: Use a timer to handle remaining objects in the reorder queues")
Daniel Jordan <daniel.m.jordan@oracle.com> writes:

> On Wed, Oct 19, 2022 at 10:37:05AM +0200, Nicolai Stange wrote:
>>
>> Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")
>
> It looks like this issue goes back to the first padata commit. For
> instance, pd->refcnt goes to zero after the last _priv is serialized,
> padata_free is called in another task, and a particularly sluggish
> padata_reorder call touches pd after.
>
> So wouldn't it be
>
> Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface")

I chose 07928d9bfc81 ("padata: Remove broken queue flushing"), because
that one reads like it fixed a couple of much more severe padata
lifetime issues, it only missed the relatively minor one addressed here,
in a sense.

Or to put it the other way around: if one were to backport this patch
here, 07928d9bfc81 should probably get picked first, I think.

But I'd be fine with any Fixes tag, of course, I don't have a strong
opinion on this matter.

Thanks!

Nicolai

>
> ?
>
> Otherwise,
>
> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
>
>> Signed-off-by: Nicolai Stange <nstange@suse.de>
>> ---
On Wed, Nov 09, 2022 at 02:02:37PM +0100, Nicolai Stange wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
>
> > On Wed, Oct 19, 2022 at 10:37:05AM +0200, Nicolai Stange wrote:
> >>
> >> Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")
> >
> > It looks like this issue goes back to the first padata commit. For
> > instance, pd->refcnt goes to zero after the last _priv is serialized,
> > padata_free is called in another task, and a particularly sluggish
> > padata_reorder call touches pd after.
> >
> > So wouldn't it be
> >
> > Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface")
>
> I chose 07928d9bfc81 ("padata: Remove broken queue flushing"), because
> that one reads like it fixed a couple of much more severe padata
> lifetime issues, it only missed the relatively minor one addressed here,
> in a sense.
>
> Or to put it the other way around: if one were to backport this patch
> here, 07928d9bfc81 should probably get picked first, I think.
>
> But I'd be fine with any Fixes tag, of course, I don't have a strong
> opinion on this matter.

Ok, makes sense, your Fixes is fine then. Please put that justification
in the changelog for the next version.
diff --git a/kernel/padata.c b/kernel/padata.c
index 3bd1e23f089b..0bf8c80dad5a 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -1112,7 +1112,7 @@ void padata_free_shell(struct padata_shell *ps)
 
 	mutex_lock(&ps->pinst->lock);
 	list_del(&ps->list);
-	padata_free_pd(rcu_dereference_protected(ps->pd, 1));
+	padata_put_pd(rcu_dereference_protected(ps->pd, 1));
 	mutex_unlock(&ps->pinst->lock);
 
 	kfree(ps);
On a PREEMPT kernel, the following has been observed while running
pcrypt_aead01 from LTP:

  [ ] general protection fault: 0000 [#1] PREEMPT_RT SMP PTI
  <...>
  [ ] Workqueue: pdecrypt_parallel padata_parallel_worker
  [ ] RIP: 0010:padata_reorder+0x19/0x120
  <...>
  [ ] Call Trace:
  [ ] padata_parallel_worker+0xa3/0xf0
  [ ] process_one_work+0x1db/0x4a0
  [ ] worker_thread+0x2d/0x3c0
  [ ] ? process_one_work+0x4a0/0x4a0
  [ ] kthread+0x159/0x180
  [ ] ? kthread_park+0xb0/0xb0
  [ ] ret_from_fork+0x35/0x40

The pcrypt_aead01 testcase basically runs a NEWALG/DELALG sequence for some
fixed pcrypt instance in a loop, back to back.

The problem is that once the last ->serial() in padata_serial_worker() gets
invoked, the pcrypt requests from the selftests signal completion, and
pcrypt_aead01 can move on and subsequently issue a DELALG. Upon pcrypt
instance deregistration, the associated padata_shell gets destroyed, which
in turn unconditionally frees the associated parallel_data instance.

If padata_serial_worker() now resumes operation after, e.g., having been
preempted upon the return from the last of those ->serial() callbacks, its
subsequent accesses to pd for managing the ->refcnt will all be
use-after-frees (UAFs). In particular, if the memory backing pd has
meanwhile been reused for some new parallel_data allocation, e.g. in the
course of processing another subsequent NEWALG request, then
padata_serial_worker() might find the initial ->refcnt of one and free pd
from under that NEWALG or the associated selftests, respectively, leading
to "secondary" UAFs such as in the Oops above.

Note that as it currently stands, a padata_shell already owns a reference
on its associated parallel_data. So fix the UAF in padata_serial_worker()
by making padata_free_shell() not free the shell's associated parallel_data
unconditionally, but properly drop that reference via padata_put_pd()
instead.

Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")
Signed-off-by: Nicolai Stange <nstange@suse.de>
---
 kernel/padata.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
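
The difference between freeing unconditionally and dropping a reference can be sketched in plain userspace C. This is only an illustrative model, not the kernel code: struct pd_model, pd_model_alloc(), pd_model_get(), pd_model_put() and shell_teardown_frees_pd_early() are hypothetical names, and C11 atomics stand in for the kernel's refcount_t. It assumes, as the changelog describes, that the shell owns the initial reference and that in-flight work holds further references.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct parallel_data; only the refcount matters. */
struct pd_model {
	atomic_int refcnt;
	bool *freed;	/* set on release so the caller can observe it */
};

static struct pd_model *pd_model_alloc(bool *freed_flag)
{
	struct pd_model *pd = malloc(sizeof(*pd));

	if (!pd)
		return NULL;
	atomic_init(&pd->refcnt, 1);	/* initial reference, owned by the shell */
	pd->freed = freed_flag;
	*freed_flag = false;
	return pd;
}

/* Take an extra reference, e.g. on behalf of an in-flight request. */
static void pd_model_get(struct pd_model *pd)
{
	atomic_fetch_add(&pd->refcnt, 1);
}

/* Drop one reference; release the object only when the last one goes away. */
static bool pd_model_put(struct pd_model *pd)
{
	if (atomic_fetch_sub(&pd->refcnt, 1) == 1) {
		*pd->freed = true;
		free(pd);
		return true;
	}
	return false;
}

/*
 * Model of the fixed teardown: the shell drops its own reference while a
 * worker still holds one. Returns whether the object was released before
 * the worker was done with it (expected: false).
 */
static bool shell_teardown_frees_pd_early(void)
{
	bool freed;
	struct pd_model *pd = pd_model_alloc(&freed);
	bool freed_early;

	assert(pd);
	pd_model_get(pd);	/* worker's reference for in-flight work */
	pd_model_put(pd);	/* shell teardown: put, not an unconditional free */
	freed_early = freed;	/* worker may still touch pd at this point */
	pd_model_put(pd);	/* worker finishes and drops the last reference */
	assert(freed);
	return freed_early;
}
```

With an unconditional free in place of the first pd_model_put(), the second call would operate on released memory, which is the class of bug the patch addresses.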