Message ID | 20210401091105.8046-1-mwilck@suse.com |
---|---|
State | New |
Series | scsi: scsi_transport_srp: don't block target in SRP_PORT_LOST state |
On 4/1/21 2:11 AM, mwilck@suse.com wrote:
> rport_dev_loss_timedout() sets the rport state to SRP_PORT_LOST and
> the SCSI target state to SDEV_TRANSPORT_OFFLINE. If this races with
> srp_reconnect_work(), a warning is printed:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

On Fri, 2021-04-02 at 12:38 -0700, Bart Van Assche wrote:
> On 4/1/21 2:11 AM, mwilck@suse.com wrote:
> > rport_dev_loss_timedout() sets the rport state to SRP_PORT_LOST and
> > the SCSI target state to SDEV_TRANSPORT_OFFLINE. If this races with
> > srp_reconnect_work(), a warning is printed:
>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
>

Indeed I have seen this while running rapid resets in my lab.
Was not sure if it was something I was doing or a real bug.

For example this script will bring it out if I lower the delay

#!/bin/bash
#on ibclient server in /sys/class/srp_remote_ports, using echo 1 > delete for the particular port will simulate a port reset.
#/sys/class/srp_remote_ports
#[root@ibclient srp_remote_ports]# ls
#port-1:1 port-2:1
for d in /sys/class/srp_remote_ports/*
do
    echo 1 > $d/delete
    sleep 60
done

Looks correct, and anyway Bart agrees so:
Reviewed-by: Laurence Oberman <loberman@redhat.com>

On Thu, 1 Apr 2021 11:11:05 +0200, mwilck@suse.com wrote:
> rport_dev_loss_timedout() sets the rport state to SRP_PORT_LOST and
> the SCSI target state to SDEV_TRANSPORT_OFFLINE. If this races with
> srp_reconnect_work(), a warning is printed:
>
> Mar 27 18:48:07 ictm1604s01h4 kernel: dev_loss_tmo expired for SRP port-18:1 / host18.
> Mar 27 18:48:07 ictm1604s01h4 kernel: ------------[ cut here ]------------
> Mar 27 18:48:07 ictm1604s01h4 kernel: scsi_internal_device_block(18:0:0:100) failed: ret = -22
> Mar 27 18:48:07 ictm1604s01h4 kernel: Call Trace:
> Mar 27 18:48:07 ictm1604s01h4 kernel:  ? scsi_target_unblock+0x50/0x50 [scsi_mod]
> Mar 27 18:48:07 ictm1604s01h4 kernel:  starget_for_each_device+0x80/0xb0 [scsi_mod]
> Mar 27 18:48:07 ictm1604s01h4 kernel:  target_block+0x24/0x30 [scsi_mod]
> Mar 27 18:48:07 ictm1604s01h4 kernel:  device_for_each_child+0x57/0x90
> Mar 27 18:48:07 ictm1604s01h4 kernel:  srp_reconnect_rport+0xe4/0x230 [scsi_transport_srp]
> Mar 27 18:48:07 ictm1604s01h4 kernel:  srp_reconnect_work+0x40/0xc0 [scsi_transport_srp]
>
> [...]

Applied to 5.12/scsi-fixes, thanks!

[1/1] scsi: scsi_transport_srp: don't block target in SRP_PORT_LOST state
      https://git.kernel.org/mkp/scsi/c/5cd0f6f57639

-- 
Martin K. Petersen	Oracle Linux Engineering

diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 1e939a2a387f..98a34ed10f1a 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -541,7 +541,7 @@ int srp_reconnect_rport(struct srp_rport *rport)
 	res = mutex_lock_interruptible(&rport->mutex);
 	if (res)
 		goto out;
-	if (rport->state != SRP_RPORT_FAIL_FAST)
+	if (rport->state != SRP_RPORT_FAIL_FAST && rport->state != SRP_RPORT_LOST)
 		/*
 		 * sdev state must be SDEV_TRANSPORT_OFFLINE, transition
 		 * to SDEV_BLOCK is illegal. Calling scsi_target_unblock()
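
For readers who want to see the effect of the changed condition in isolation, here is a minimal userspace sketch. It is not kernel code: every `model_`/`MODEL_` name is hypothetical, the state enums are simplified stand-ins for the kernel's, and the only behaviour taken from the thread is that blocking a device in the transport-offline state fails with -22 while the patched check skips the block for a lost rport.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the rport states named in the patch. */
enum model_rport_state {
	MODEL_RPORT_RUNNING,
	MODEL_RPORT_BLOCKED,
	MODEL_RPORT_FAIL_FAST,
	MODEL_RPORT_LOST,
};

/* Hypothetical, simplified stand-ins for the SCSI device states involved. */
enum model_sdev_state {
	MODEL_SDEV_RUNNING,
	MODEL_SDEV_BLOCK,
	MODEL_SDEV_TRANSPORT_OFFLINE,
};

/* Matches the "ret = -22" seen in the quoted warning. */
#define MODEL_EINVAL 22

/*
 * Model of the block step: the transition from transport-offline to
 * blocked is treated as illegal, as the comment in the diff states.
 */
int model_device_block(enum model_sdev_state *sdev)
{
	if (*sdev == MODEL_SDEV_TRANSPORT_OFFLINE)
		return -MODEL_EINVAL;
	*sdev = MODEL_SDEV_BLOCK;
	return 0;
}

/*
 * Model of the guard in front of the block step.
 * patched == false: only FAIL_FAST skips the block (old condition).
 * patched == true:  FAIL_FAST and LOST both skip it (new condition).
 */
int model_reconnect_block(enum model_rport_state rport,
			  enum model_sdev_state *sdev, bool patched)
{
	bool skip = rport == MODEL_RPORT_FAIL_FAST ||
		    (patched && rport == MODEL_RPORT_LOST);

	if (skip)
		return 0;
	return model_device_block(sdev);
}
```

With the device in the transport-offline state and the rport lost, `model_reconnect_block(..., false)` returns -22, mirroring the warning in the log, while `model_reconnect_block(..., true)` returns 0 and leaves the device state untouched.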