Message ID | 4064EB40-C84F-42E8-82F7-3940901C09D2@purestorage.com |
---|---|
State | New |
Headers | show |
Series | [1/1] scsi: scsi_dh_alua: remove the list entry before assigning the pointer and sdev to NULL | expand |
To me just removing the h->sdev = NULL seems strange because then this looks strange to me: pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h->pg_lock)); rcu_assign_pointer(h->pg, NULL); Then saying If (pg) Since we just assigned that pointer, h->pg to NULL. Thanks, Brian Brian Bunker SW Eng brian@purestorage.com > On Sep 11, 2020, at 6:44 AM, Ewan D. Milne <emilne@redhat.com> wrote: > > On Thu, 2020-09-10 at 14:22 -0700, Brian Bunker wrote: >> A race exists where the BUG_ON(!h->sdev) will fire if the detach >> device handler >> from one thread runs removing a list entry while another thread is >> trying to >> evaluate the target portal group state. >> >> The order of the detach operation is now changed to delete the list >> entry >> before modifying the pointer and setting h->sdev to NULL. >> >> Signed-off-by: Brian Bunker <brian@purestorage.com> >> Acked-by: Krishna Kant <krishna.kant@purestorage.com> >> ___ >> diff -Naur a/scsi/drivers/scsi/device_handler/scsi_dh_alua.c >> b/scsi/drivers/scsi/device_handler/scsi_dh_alua.c >> --- a/scsi/drivers/scsi/device_handler/scsi_dh_alua.c 2020-09-10 >> 12:29:03.000000000 -0700 >> +++ b/scsi/drivers/scsi/device_handler/scsi_dh_alua.c 2020-09-10 >> 12:41:34.000000000 -0700 >> @@ -1146,16 +1146,18 @@ >> >> spin_lock(&h->pg_lock); >> pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h- >>> pg_lock)); >> - rcu_assign_pointer(h->pg, NULL); >> - h->sdev = NULL; >> - spin_unlock(&h->pg_lock); >> if (pg) { >> spin_lock_irq(&pg->lock); >> list_del_rcu(&h->node); >> spin_unlock_irq(&pg->lock); >> - kref_put(&pg->kref, release_port_group); >> } >> + rcu_assign_pointer(h->pg, NULL); >> + h->sdev = NULL; >> + spin_unlock(&h->pg_lock); >> sdev->handler_data = NULL; >> + if (pg) { >> + kref_put(&pg->kref, release_port_group); >> + } >> kfree(h); >> } >> > > Good catch. > > This makes the code hold the h->pg_lock while holding the pg->lock > though. > > It seems like all that is needed is to remove the h->sdev = NULL > assignment, since it has to remain valid until the alua_ah_data is > removed from the pg->dh_list, and the object is going to be freed > right afterwards anyway? > > Might as well also remove the BUG_ON(!h->sdev) in 2 places since the > kernel will crash when h is dereferenced anyway if it is NULL. > > -Ewan > >
On 9/11/20 5:47 PM, Brian Bunker wrote: > To me just removing the h->sdev = NULL seems strange because then this > looks strange to me: > > pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h->pg_lock)); > rcu_assign_pointer(h->pg, NULL); > > Then saying > If (pg) > > Since we just assigned that pointer, h->pg to NULL. > True, but the 'rcu_dereference()' call is just ensuring 'alua_dh_data' is valid, not the contents of which. So to be absolutely correctly we would need to take 'h->lock' when evaluating 'h->sdev'; but then this is an optimisation anyway we might as well kill the BUG_ON() and replace it by a simple 'if' condition. Cheers, Hannes
diff -Naur a/scsi/drivers/scsi/device_handler/scsi_dh_alua.c b/scsi/drivers/scsi/device_handler/scsi_dh_alua.c --- a/scsi/drivers/scsi/device_handler/scsi_dh_alua.c 2020-09-10 12:29:03.000000000 -0700 +++ b/scsi/drivers/scsi/device_handler/scsi_dh_alua.c 2020-09-10 12:41:34.000000000 -0700 @@ -1146,16 +1146,18 @@ spin_lock(&h->pg_lock); pg = rcu_dereference_protected(h->pg, lockdep_is_held(&h->pg_lock)); - rcu_assign_pointer(h->pg, NULL); - h->sdev = NULL; - spin_unlock(&h->pg_lock); if (pg) { spin_lock_irq(&pg->lock); list_del_rcu(&h->node); spin_unlock_irq(&pg->lock); - kref_put(&pg->kref, release_port_group); } + rcu_assign_pointer(h->pg, NULL); + h->sdev = NULL; + spin_unlock(&h->pg_lock); sdev->handler_data = NULL; + if (pg) { + kref_put(&pg->kref, release_port_group); + } kfree(h); }