Message ID | 20221019083708.27138-6-nstange@suse.de |
---|---|
State | New |
Series | padata: fix liftime issues after ->serial() has completed |
On Wed, Oct 19, 2022 at 10:37:08AM +0200, Nicolai Stange wrote:
> Even though the parallel_data "pd" instance passed to padata_reorder() is
> guaranteed to exist as per the reference held by its callers, the same is
> not true for the associated padata_shell, pd->ps. More specifically, once
> the last padata_priv request has been completed, either at entry from
> padata_reorder() or concurrently to it, the padata API users are well
> within their right to free the padata_shell instance.

The synchronize_rcu change seems to make padata_reorder safe from freed
ps's with the exception of a straggler reorder_work. For that, I think
something like this hybrid of your code and mine is enough to plug the
hole. It's on top of 1-2 and my hunk from 3. It has to take an extra
ref on pd, but only in the rare case where the reorder work is used.
Thoughts?

diff --git a/kernel/padata.c b/kernel/padata.c
index cd6740ae6629..f14c256a0ee3 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -277,7 +277,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
 
 static void padata_reorder(struct parallel_data *pd)
 {
-	struct padata_instance *pinst = pd->ps->pinst;
+	struct padata_instance *pinst;
 	int cb_cpu;
 	struct padata_priv *padata;
 	struct padata_serial_queue *squeue;
@@ -314,7 +314,7 @@ static void padata_reorder(struct parallel_data *pd)
 		list_add_tail(&padata->list, &squeue->serial.list);
 		spin_unlock(&squeue->serial.lock);
 
-		queue_work_on(cb_cpu, pinst->serial_wq, &squeue->work);
+		queue_work_on(cb_cpu, pd->ps->pinst->serial_wq, &squeue->work);
 	}
 
 	spin_unlock_bh(&pd->lock);
@@ -330,8 +330,10 @@ static void padata_reorder(struct parallel_data *pd)
 	smp_mb();
 
 	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
-	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
-		queue_work(pinst->serial_wq, &pd->reorder_work);
+	if (!list_empty(&reorder->list) && padata_find_next(pd, false)) {
+		if (queue_work(pd->ps->pinst->serial_wq, &pd->reorder_work))
+			padata_get_pd(pd);
+	}
 }
 
 static void invoke_padata_reorder(struct work_struct *work)
@@ -342,6 +344,7 @@ static void invoke_padata_reorder(struct work_struct *work)
 	pd = container_of(work, struct parallel_data, reorder_work);
 	padata_reorder(pd);
 	local_bh_enable();
+	padata_put_pd(pd);
 }
 
 static void padata_serial_worker(struct work_struct *serial_work)
Daniel Jordan <daniel.m.jordan@oracle.com> writes:

> On Wed, Oct 19, 2022 at 10:37:08AM +0200, Nicolai Stange wrote:
>> Even though the parallel_data "pd" instance passed to padata_reorder() is
>> guaranteed to exist as per the reference held by its callers, the same is
>> not true for the associated padata_shell, pd->ps. More specifically, once
>> the last padata_priv request has been completed, either at entry from
>> padata_reorder() or concurrently to it, the padata API users are well
>> within their right to free the padata_shell instance.
>
> The synchronize_rcu change seems to make padata_reorder safe from freed
> ps's with the exception of a straggler reorder_work. For that, I think
> something like this hybrid of your code and mine is enough to plug the
> hole. It's on top of 1-2 and my hunk from 3. It has to take an extra
> ref on pd, but only in the rare case where the reorder work is used.
> Thoughts?
>
> diff --git a/kernel/padata.c b/kernel/padata.c
> index cd6740ae6629..f14c256a0ee3 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -277,7 +277,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
>  
>  static void padata_reorder(struct parallel_data *pd)
>  {
> -	struct padata_instance *pinst = pd->ps->pinst;
> +	struct padata_instance *pinst;
>  	int cb_cpu;
>  	struct padata_priv *padata;
>  	struct padata_serial_queue *squeue;
> @@ -314,7 +314,7 @@ static void padata_reorder(struct parallel_data *pd)
>  		list_add_tail(&padata->list, &squeue->serial.list);
>  		spin_unlock(&squeue->serial.lock);
>  
> -		queue_work_on(cb_cpu, pinst->serial_wq, &squeue->work);
> +		queue_work_on(cb_cpu, pd->ps->pinst->serial_wq, &squeue->work);
>  	}
>  
>  	spin_unlock_bh(&pd->lock);
> @@ -330,8 +330,10 @@ static void padata_reorder(struct parallel_data *pd)
>  	smp_mb();
>  
>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
> -	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
> -		queue_work(pinst->serial_wq, &pd->reorder_work);
> +	if (!list_empty(&reorder->list) && padata_find_next(pd, false)) {
> +		if (queue_work(pd->ps->pinst->serial_wq, &pd->reorder_work))
> +			padata_get_pd(pd);

As the reorder_work can start running immediately after having been
submitted, wouldn't it be more correct to do something like

  padata_get_pd(pd);
  if (!queue_work(pd->ps->pinst->serial_wq, &pd->reorder_work))
  	padata_put_pd(pd);

?

Otherwise the patch looks good to me.

Thanks!

Nicolai

> +	}
>  }
>  
>  static void invoke_padata_reorder(struct work_struct *work)
> @@ -342,6 +344,7 @@ static void invoke_padata_reorder(struct work_struct *work)
>  	pd = container_of(work, struct parallel_data, reorder_work);
>  	padata_reorder(pd);
>  	local_bh_enable();
> +	padata_put_pd(pd);
>  }
>  
>  static void padata_serial_worker(struct work_struct *serial_work)
>
On Wed, Nov 09, 2022 at 02:03:29PM +0100, Nicolai Stange wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
>
> > On Wed, Oct 19, 2022 at 10:37:08AM +0200, Nicolai Stange wrote:
> >> Even though the parallel_data "pd" instance passed to padata_reorder() is
> >> guaranteed to exist as per the reference held by its callers, the same is
> >> not true for the associated padata_shell, pd->ps. More specifically, once
> >> the last padata_priv request has been completed, either at entry from
> >> padata_reorder() or concurrently to it, the padata API users are well
> >> within their right to free the padata_shell instance.
> >
> > The synchronize_rcu change seems to make padata_reorder safe from freed
> > ps's with the exception of a straggler reorder_work. For that, I think
> > something like this hybrid of your code and mine is enough to plug the
> > hole. It's on top of 1-2 and my hunk from 3. It has to take an extra
> > ref on pd, but only in the rare case where the reorder work is used.
> > Thoughts?
> >
> > diff --git a/kernel/padata.c b/kernel/padata.c
> > index cd6740ae6629..f14c256a0ee3 100644
> > --- a/kernel/padata.c
> > +++ b/kernel/padata.c
> > @@ -277,7 +277,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
> >  
> >  static void padata_reorder(struct parallel_data *pd)
> >  {
> > -	struct padata_instance *pinst = pd->ps->pinst;
> > +	struct padata_instance *pinst;
> >  	int cb_cpu;
> >  	struct padata_priv *padata;
> >  	struct padata_serial_queue *squeue;
> > @@ -314,7 +314,7 @@ static void padata_reorder(struct parallel_data *pd)
> >  		list_add_tail(&padata->list, &squeue->serial.list);
> >  		spin_unlock(&squeue->serial.lock);
> >  
> > -		queue_work_on(cb_cpu, pinst->serial_wq, &squeue->work);
> > +		queue_work_on(cb_cpu, pd->ps->pinst->serial_wq, &squeue->work);
> >  	}
> >  
> >  	spin_unlock_bh(&pd->lock);
> > @@ -330,8 +330,10 @@ static void padata_reorder(struct parallel_data *pd)
> >  	smp_mb();
> >  
> >  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
> > -	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
> > -		queue_work(pinst->serial_wq, &pd->reorder_work);
> > +	if (!list_empty(&reorder->list) && padata_find_next(pd, false)) {
> > +		if (queue_work(pd->ps->pinst->serial_wq, &pd->reorder_work))
> > +			padata_get_pd(pd);
>
> As the reorder_work can start running immediately after having been
> submitted, wouldn't it be more correct to do something like
>
>   padata_get_pd(pd);
>   if (!queue_work(pd->ps->pinst->serial_wq, &pd->reorder_work))
>   	padata_put_pd(pd);
>
> ?

Yes, that's better, and all the above can go in your next version too.
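
The ordering question discussed above can be tried out in user space. What
follows is only a minimal sketch, not kernel code: toy_pd, toy_get(),
toy_put() and toy_queue_work() are hypothetical stand-ins for parallel_data,
padata_get_pd(), padata_put_pd() and queue_work(). The toy queue runs the
work synchronously, which models the worst case where reorder_work starts
and finishes right after it has been submitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct parallel_data: just a refcount. */
struct toy_pd {
	int refcnt;
	bool freed;
	bool work_pending;
};

static void toy_get(struct toy_pd *pd)
{
	pd->refcnt++;
}

static void toy_put(struct toy_pd *pd)
{
	assert(pd->refcnt > 0);
	if (--pd->refcnt == 0)
		pd->freed = true;	/* the kernel would free pd here */
}

/* Models invoke_padata_reorder(): drops the reference owned by the work. */
static void toy_work_fn(struct toy_pd *pd)
{
	pd->work_pending = false;
	toy_put(pd);
}

/*
 * Models queue_work() in the worst case: the work runs to completion
 * before the call returns. Returns false if the work was already pending.
 */
static bool toy_queue_work(struct toy_pd *pd)
{
	if (pd->work_pending)
		return false;
	pd->work_pending = true;
	toy_work_fn(pd);
	return true;
}

/* Get after queueing: reports whether pd was freed before the get. */
bool toy_get_after_queue_frees_early(void)
{
	struct toy_pd pd = { .refcnt = 1 };	/* the caller's own reference */
	bool queued = toy_queue_work(&pd);
	bool freed_early = pd.freed;

	if (queued)
		toy_get(&pd);	/* too late: the work already dropped a ref */
	return freed_early;
}

/* Get before queueing, put back if nothing was queued. */
bool toy_get_before_queue_frees_early(void)
{
	struct toy_pd pd = { .refcnt = 1 };

	toy_get(&pd);
	if (!toy_queue_work(&pd))
		toy_put(&pd);
	return pd.freed;
}

/* Work already pending: the speculative reference is handed back. */
int toy_already_pending_final_refcnt(void)
{
	struct toy_pd pd = { .refcnt = 1, .work_pending = true };

	toy_get(&pd);
	if (!toy_queue_work(&pd))
		toy_put(&pd);
	return pd.refcnt;
}
```

In this model only the get-before-queue variant keeps the caller's reference
intact when the work runs immediately, and the already-pending case leaves
the count where it started.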
diff --git a/kernel/padata.c b/kernel/padata.c
index e9eab3e94cfc..fa4818b81eca 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -286,7 +286,6 @@ static struct padata_priv *padata_dequeue_next(struct parallel_data *pd)
 
 static bool padata_reorder(struct parallel_data *pd)
 {
-	struct padata_instance *pinst = pd->ps->pinst;
 	int cb_cpu;
 	struct padata_priv *padata;
 	struct padata_serial_queue *squeue;
@@ -323,7 +322,11 @@ static bool padata_reorder(struct parallel_data *pd)
 		list_add_tail(&padata->list, &squeue->serial.list);
 		spin_unlock(&squeue->serial.lock);
 
-		queue_work_on(cb_cpu, pinst->serial_wq, &squeue->work);
+		/*
+		 * Note: as long as there are requests in-flight,
+		 * pd->ps is guaranteed to exist.
+		 */
+		queue_work_on(cb_cpu, pd->ps->pinst->serial_wq, &squeue->work);
 	}
 
 	spin_unlock_bh(&pd->lock);
@@ -340,14 +343,23 @@ static bool padata_reorder(struct parallel_data *pd)
 
 	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
 	if (!list_empty(&reorder->list)) {
+		bool reenqueued;
+
 		spin_lock(&reorder->lock);
 		if (!__padata_find_next(pd, reorder)) {
 			spin_unlock(&reorder->lock);
 			return false;
 		}
+
+		/*
+		 * Note: as long as there are requests in-flight,
+		 * pd->ps is guaranteed to exist.
+		 */
+		reenqueued = queue_work(pd->ps->pinst->serial_wq,
+					&pd->reorder_work);
 		spin_unlock(&reorder->lock);
 
-		return queue_work(pinst->serial_wq, &pd->reorder_work);
+		return reenqueued;
 	}
 
 	return false;
Even though the parallel_data "pd" instance passed to padata_reorder() is
guaranteed to exist as per the reference held by its callers, the same is
not true for the associated padata_shell, pd->ps. More specifically, once
the last padata_priv request has been completed, either at entry from
padata_reorder() or concurrently to it, the padata API users are well
within their right to free the padata_shell instance.

Note that this is a purely theoretical issue; it has not actually been
observed. Yet it's worth fixing for the sake of robustness.

Exploit the fact that as long as there are any not yet completed
padata_priv's around on any of the percpu reorder queues, pd->ps is
guaranteed to exist. Make padata_reorder() load from pd->ps only when
it's known that there is at least one in-flight padata_priv object to
reorder. Note that this involves moving the pd->ps accesses under the
reorder->lock as appropriate, so that the found padata_priv object won't
get dequeued and completed concurrently from a different context.

Fixes: bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock by using per-instance padata queues")
Signed-off-by: Nicolai Stange <nstange@suse.de>
---
 kernel/padata.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)
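
The access rule described in the commit message can be pictured with a small
user-space model. This is only a sketch under stated assumptions: sketch_pd,
sketch_shell, sketch_reorder and sketch_requeue() are hypothetical names that
do not exist in kernel/padata.c, and the boolean "lock" merely marks where
reorder->lock would be held. It shows the order of operations only: check for
an in-flight request first, and read pd->ps only if one was found, while
still inside the locked region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for padata_shell. */
struct sketch_shell {
	int wq_id;
};

/* Hypothetical stand-in for a percpu reorder queue. */
struct sketch_reorder {
	bool locked;		/* marks where reorder->lock is held */
	int nr_pending;		/* in-flight requests on this queue */
};

/* Hypothetical stand-in for parallel_data. */
struct sketch_pd {
	struct sketch_shell *ps;	/* may be gone once nothing is in flight */
	struct sketch_reorder reorder;
	int last_wq_used;
};

static void sketch_lock(struct sketch_reorder *r)
{
	assert(!r->locked);
	r->locked = true;
}

static void sketch_unlock(struct sketch_reorder *r)
{
	assert(r->locked);
	r->locked = false;
}

/*
 * Returns true if a requeue was issued. pd->ps is read only after an
 * in-flight request has been found, and only inside the locked region.
 * In the kernel patch the lock keeps that request from being dequeued
 * and completed concurrently; this single-threaded toy only models the
 * order of the check and the access.
 */
bool sketch_requeue(struct sketch_pd *pd)
{
	bool requeued = false;

	sketch_lock(&pd->reorder);
	if (pd->reorder.nr_pending > 0) {
		pd->last_wq_used = pd->ps->wq_id;
		requeued = true;
	}
	sketch_unlock(&pd->reorder);

	return requeued;
}
```

With an empty queue the function returns without ever touching pd->ps, so a
shell that has already been released is never dereferenced in this model.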