Message ID | 20230915081507.761711-3-dlemoal@kernel.org |
---|---|
State | Superseded |
Headers | show |
Series | Fix libata suspend/resume handling and code cleanup | expand |
On Fri, Sep 15, 2023 at 05:14:46PM +0900, Damien Le Moal wrote: > Whenever an ATA adapter driver is removed (e.g. rmmod), > ata_port_detach() is called repeatedly for all the adapter ports to > remove (unload) the devices attached to the port and delete the port > device itself. Removing of devices is done using libata EH with the > ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute > ata_eh_unload() which disables all devices attached to the port. > > ata_port_detach() finishes by calling scsi_remove_host() to remove the > scsi host associated with the port. This function will trigger the > removal of all scsi devices attached to the host and in the case of > disks, calls to sd_shutdown() which will flush the device write cache > and stop the device. However, given that the devices were already > disabled by ata_eh_unload(), the synchronize write cache command and > start stop unit commands fail. E.g. running "rmmod ahci" with first > removing sd_mod results in error messages like: > > ata13.00: disable device > sd 0:0:0:0: [sda] Synchronizing SCSI cache > sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > sd 0:0:0:0: [sda] Stopping disk > sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK > > Fix this by removing all scsi devices of the ata devices connected to > the port before scheduling libata EH to disable the ATA devices. > > Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") > Cc: stable@vger.kernel.org > Signed-off-by: Damien Le Moal <dlemoal@kernel.org> > Reviewed-by: Hannes Reinecke <hare@suse.de> > Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> > --- > drivers/ata/libata-core.c | 21 ++++++++++++++++++++- > 1 file changed, 20 insertions(+), 1 deletion(-) > > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > index c4898483d716..693cb3cd70cd 100644 > --- a/drivers/ata/libata-core.c > +++ b/drivers/ata/libata-core.c > @@ -5952,11 +5952,30 @@ static void ata_port_detach(struct ata_port *ap) > struct ata_link *link; > struct ata_device *dev; > > - /* tell EH we're leaving & flush EH */ > + /* Wait for any ongoing EH */ > + ata_port_wait_eh(ap); > + > + mutex_lock(&ap->scsi_scan_mutex); > spin_lock_irqsave(ap->lock, flags); > + > + /* Remove scsi devices */ > + ata_for_each_link(link, ap, HOST_FIRST) { > + ata_for_each_dev(dev, link, ALL) { > + if (dev->sdev) { > + spin_unlock_irqrestore(ap->lock, flags); > + scsi_remove_device(dev->sdev); > + spin_lock_irqsave(ap->lock, flags); > + dev->sdev = NULL; > + } > + } > + } > + > + /* Tell EH to disable all devices */ > ap->pflags |= ATA_PFLAG_UNLOADING; > ata_port_schedule_eh(ap); > + > spin_unlock_irqrestore(ap->lock, flags); > + mutex_unlock(&ap->scsi_scan_mutex); > > /* wait till EH commits suicide */ > ata_port_wait_eh(ap); > -- > 2.41.0 > Looks good: Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Could you perhaps also fix the WARN_ON(!(ap->pflags & ATA_PFLAG_UNLOADED)); in this very same function. The pflag is checked without holding the lock. Kind regards, Niklas
On 2023/09/19 6:21, Niklas Cassel wrote: > On Fri, Sep 15, 2023 at 05:14:46PM +0900, Damien Le Moal wrote: >> Whenever an ATA adapter driver is removed (e.g. rmmod), >> ata_port_detach() is called repeatedly for all the adapter ports to >> remove (unload) the devices attached to the port and delete the port >> device itself. Removing of devices is done using libata EH with the >> ATA_PFLAG_UNLOADING port flag set. This causes libata EH to execute >> ata_eh_unload() which disables all devices attached to the port. >> >> ata_port_detach() finishes by calling scsi_remove_host() to remove the >> scsi host associated with the port. This function will trigger the >> removal of all scsi devices attached to the host and in the case of >> disks, calls to sd_shutdown() which will flush the device write cache >> and stop the device. However, given that the devices were already >> disabled by ata_eh_unload(), the synchronize write cache command and >> start stop unit commands fail. E.g. running "rmmod ahci" with first >> removing sd_mod results in error messages like: >> >> ata13.00: disable device >> sd 0:0:0:0: [sda] Synchronizing SCSI cache >> sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> sd 0:0:0:0: [sda] Stopping disk >> sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK >> >> Fix this by removing all scsi devices of the ata devices connected to >> the port before scheduling libata EH to disable the ATA devices. >> >> Fixes: 720ba12620ee ("[PATCH] libata-hp: update unload-unplug") >> Cc: stable@vger.kernel.org >> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> >> Reviewed-by: Hannes Reinecke <hare@suse.de> >> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> >> --- >> drivers/ata/libata-core.c | 21 ++++++++++++++++++++- >> 1 file changed, 20 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c >> index c4898483d716..693cb3cd70cd 100644 >> --- a/drivers/ata/libata-core.c >> +++ b/drivers/ata/libata-core.c >> @@ -5952,11 +5952,30 @@ static void ata_port_detach(struct ata_port *ap) >> struct ata_link *link; >> struct ata_device *dev; >> >> - /* tell EH we're leaving & flush EH */ >> + /* Wait for any ongoing EH */ >> + ata_port_wait_eh(ap); >> + >> + mutex_lock(&ap->scsi_scan_mutex); >> spin_lock_irqsave(ap->lock, flags); >> + >> + /* Remove scsi devices */ >> + ata_for_each_link(link, ap, HOST_FIRST) { >> + ata_for_each_dev(dev, link, ALL) { >> + if (dev->sdev) { >> + spin_unlock_irqrestore(ap->lock, flags); >> + scsi_remove_device(dev->sdev); >> + spin_lock_irqsave(ap->lock, flags); >> + dev->sdev = NULL; >> + } >> + } >> + } >> + >> + /* Tell EH to disable all devices */ >> ap->pflags |= ATA_PFLAG_UNLOADING; >> ata_port_schedule_eh(ap); >> + >> spin_unlock_irqrestore(ap->lock, flags); >> + mutex_unlock(&ap->scsi_scan_mutex); >> >> /* wait till EH commits suicide */ >> ata_port_wait_eh(ap); >> -- >> 2.41.0 >> > > Looks good: > Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> > > Could you perhaps also fix the WARN_ON(!(ap->pflags & ATA_PFLAG_UNLOADED)); > in this very same function. > > The pflag is checked without holding the lock. It does not matter much at that point since EH is dead, or supposed to be, which is what the WARN_ON() checks. > > > Kind regards, > Niklas
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index c4898483d716..693cb3cd70cd 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -5952,11 +5952,30 @@ static void ata_port_detach(struct ata_port *ap) struct ata_link *link; struct ata_device *dev; - /* tell EH we're leaving & flush EH */ + /* Wait for any ongoing EH */ + ata_port_wait_eh(ap); + + mutex_lock(&ap->scsi_scan_mutex); spin_lock_irqsave(ap->lock, flags); + + /* Remove scsi devices */ + ata_for_each_link(link, ap, HOST_FIRST) { + ata_for_each_dev(dev, link, ALL) { + if (dev->sdev) { + spin_unlock_irqrestore(ap->lock, flags); + scsi_remove_device(dev->sdev); + spin_lock_irqsave(ap->lock, flags); + dev->sdev = NULL; + } + } + } + + /* Tell EH to disable all devices */ ap->pflags |= ATA_PFLAG_UNLOADING; ata_port_schedule_eh(ap); + spin_unlock_irqrestore(ap->lock, flags); + mutex_unlock(&ap->scsi_scan_mutex); /* wait till EH commits suicide */ ata_port_wait_eh(ap);