Message ID | 20240117213620.132880-1-brking@linux.vnet.ibm.com |
---|---|
State | New |
Series | scsi: Update max_hw_sectors on rescan |
On 17/01/2024 21:36, Brian King wrote:
> This addresses an issue discovered on ibmvfc LUNs. For this driver,
> max_sectors is negotiated with the VIOS. This gets done at initialization
> time, then LUNs get scanned and things generally work fine. However,
> this attribute can be changed on the VIOS, either due to a sysadmin
> change or potentially a VIOS code level change. If this decreases
> to a smaller value, due to one of these reasons, the next time the
> ibmvfc driver performs an NPIV login, it will only be able to use
> the smaller value. In the case of a VIOS reboot, when the VIOS goes
> down, all paths through that VIOS will go to devloss state. When
> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
> be able to get the smaller value and it will update shost->max_sectors.

Are you saying that the driver will manually update shost->max_sectors after adding the scsi host? I didn't think that was permitted.

Thanks,
John

> However, when LUNs are scanned, the devloss paths will be found
> and brought back online, still using the old max_hw_sectors. This
> change ensures that max_hw_sectors gets updated.
On 18/01/2024 17:22, Brian King wrote:
> On 1/18/24 9:44 AM, John Garry wrote:
>> On 17/01/2024 21:36, Brian King wrote:
>>> This addresses an issue discovered on ibmvfc LUNs. For this driver,
>>> max_sectors is negotiated with the VIOS. This gets done at initialization
>>> time, then LUNs get scanned and things generally work fine. However,
>>> this attribute can be changed on the VIOS, either due to a sysadmin
>>> change or potentially a VIOS code level change. If this decreases
>>> to a smaller value, due to one of these reasons, the next time the
>>> ibmvfc driver performs an NPIV login, it will only be able to use
>>> the smaller value. In the case of a VIOS reboot, when the VIOS goes
>>> down, all paths through that VIOS will go to devloss state. When
>>> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
>>> be able to get the smaller value and it will update shost->max_sectors.
>>
>> Are you saying that the driver will manually update shost->max_sectors after adding the scsi host? I didn't think that was permitted.
>
> That is what happens. The characteristics of the underlying hardware can change across
> a virtual adapter reset.

That's unfortunate.

I don't think that it's a good idea to change shost->max_sectors after adding the scsi host or to add core code to condone doing it. Indeed, there is code there to limit shost->max_sectors from DMA mapping constraints in scsi_add_host() path, which should not be ignored.

Would it be possible to initially set shost->max_sectors for this adapter at the lowest anticipated value for that adapter and don't change thereafter?

Thanks,
John

>
> Thanks,
>
> Brian
>
>>
>> Thanks,
>> John
>>
>>> However, when LUNs are scanned, the devloss paths will be found
>>> and brought back online, still using the old max_hw_sectors. This
>>> change ensures that max_hw_sectors gets updated.
On 1/19/24 3:02 AM, John Garry wrote:
> On 18/01/2024 17:22, Brian King wrote:
>> On 1/18/24 9:44 AM, John Garry wrote:
>>> On 17/01/2024 21:36, Brian King wrote:
>>>> This addresses an issue discovered on ibmvfc LUNs. For this driver,
>>>> max_sectors is negotiated with the VIOS. This gets done at initialization
>>>> time, then LUNs get scanned and things generally work fine. However,
>>>> this attribute can be changed on the VIOS, either due to a sysadmin
>>>> change or potentially a VIOS code level change. If this decreases
>>>> to a smaller value, due to one of these reasons, the next time the
>>>> ibmvfc driver performs an NPIV login, it will only be able to use
>>>> the smaller value. In the case of a VIOS reboot, when the VIOS goes
>>>> down, all paths through that VIOS will go to devloss state. When
>>>> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
>>>> be able to get the smaller value and it will update shost->max_sectors.
>>>
>>> Are you saying that the driver will manually update shost->max_sectors after adding the scsi host? I didn't think that was permitted.
>>
>> That is what happens. The characteristics of the underlying hardware can change across
>> a virtual adapter reset.
>
> That's unfortunate.
>
> I don't think that it's a good idea to change shost->max_sectors after adding the scsi host or to add core code to condone doing it. Indeed, there is code there to limit shost->max_sectors from DMA mapping constraints in scsi_add_host() path, which should not be ignored.

Good point. However, this patch only lowers max_hw_sectors if shost->max_sectors has since been decreased.

>
> Would it be possible to initially set shost->max_sectors for this adapter at the lowest anticipated value for that adapter and don't change thereafter?

Different physical backing devices support different ranges of values and the physical backing device can change dynamically. There is currently no defined way for the client to determine what the lowest possible value is. The downside to adding such an attribute would be that we'd then always be limited to an arbitrarily small value, which would limit the performance.

Thanks,

Brian
On 1/17/24 3:36 PM, Brian King wrote:
> This addresses an issue discovered on ibmvfc LUNs. For this driver,
> max_sectors is negotiated with the VIOS. This gets done at initialization
> time, then LUNs get scanned and things generally work fine. However,
> this attribute can be changed on the VIOS, either due to a sysadmin
> change or potentially a VIOS code level change. If this decreases
> to a smaller value, due to one of these reasons, the next time the
> ibmvfc driver performs an NPIV login, it will only be able to use
> the smaller value. In the case of a VIOS reboot, when the VIOS goes
> down, all paths through that VIOS will go to devloss state. When
> the VIOS comes back up, ibmvfc negotiates max_sectors and will only
> be able to get the smaller value and it will update shost->max_sectors.
> However, when LUNs are scanned, the devloss paths will be found
> and brought back online, still using the old max_hw_sectors. This
> change ensures that max_hw_sectors gets updated.
>
> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
> ---
>  drivers/scsi/scsi_scan.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index 44680f65ea14..01f2b38daab3 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1162,6 +1162,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>  	blist_flags_t bflags;
>  	int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
>  	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
> +	struct request_queue *q;
>
>  	/*
>  	 * The rescan flag is used as an optimization, the first scan of a
> @@ -1182,6 +1183,10 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>  				*bflagsp = scsi_get_device_flags(sdev,
>  								 sdev->vendor,
>  								 sdev->model);
> +			q = sdev->request_queue;
> +			if (queue_max_hw_sectors(q) > shost->max_sectors)
> +				blk_queue_max_hw_sectors(q, shost->max_sectors);
> +

What happens if commands that are larger than the new shost->max_sectors get sent to the driver/device?

For example, if we called fc_remote_port_add and scsi_target_unblock puts the existing devices into SDEV_RUNNING, then we do the scsi_scan_target call and hit the code above, could we have commands in the request_queue already (we relogin before fast_io_fail even fires so the commands never get failed)? It looks like commands have already passed checks like bio_may_exceed_limits and will be sent to the driver.

Will the driver/device spit out an error? Is this ok, or do you need some sort of flush and limit re-check/re-split?
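The window described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_queue`, `model_request`, and both helper functions are hypothetical names invented for illustration, not kernel structures or block-layer APIs. It shows nothing more than the arithmetic of the concern: a request that was sized against the old limit is not re-checked when the limit is lowered afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a request queue and a queued request. */
struct model_queue {
	unsigned int max_hw_sectors;
};

struct model_request {
	unsigned int nr_sectors;
};

/* Size check, assumed here to run once when the request is built. */
static bool model_request_fits(const struct model_queue *q,
			       const struct model_request *rq)
{
	return rq->nr_sectors <= q->max_hw_sectors;
}

/* Count already-queued requests that exceed the queue's current limit. */
static size_t model_count_oversized(const struct model_queue *q,
				    const struct model_request *rqs,
				    size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (!model_request_fits(q, &rqs[i]))
			count++;
	return count;
}
```

In this model, requests of 2048, 512, and 1024 sectors all fit a limit of 2048; once the limit drops to 512, two of the three are oversized even though each passed its check when it was queued.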
We can't change the host-wide limit here (it wouldn't apply to all LUs anyway). If your limit is per-LU, you can call blk_queue_max_hw_sectors from ->slave_configure.
On 1/24/24 3:24 AM, Christoph Hellwig wrote:
> We can't change the host-wide limit here (it wouldn't apply to all
> LUs anyway). If your limit is per-LU, you can call
> blk_queue_max_hw_sectors from ->slave_configure.

Unfortunately, it doesn't look like slave_configure gets called in the scenario in question. In this case we already have a scsi_device created but it's in devloss state and the FC transport layer is bringing it back online. There is also the point that Mike brought up in that if fast fail tmo has not yet fired, there could be I/O still in the queue that is now too large.

To answer your earlier question, Mike, if the VIOS receives a request that is too large it closes the CRQ, forcing an entire reinit / discovery, so it's definitely not something we want to encounter. I'm trying to get this behavior improved so that only the one command fails, but that's not what happens today.

I suppose I could iterate through all the LUNs and call blk_queue_max_hw_sectors on them, but I'm not sure if that solves the problem. It would close the window that Mike highlighted, but if there are commands outstanding when this occurs that are larger than the new max_hw_sectors and they get requeued, will they get split in the block layer when they get resent to the LLD or will they just get resent as-is? If it's the latter, I'd get a request larger than what I can support.

-Brian
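The two halves of the question above can be made concrete with a userspace sketch. Everything here is an assumption for illustration: `model_lun`, `model_clamp_all_luns`, and `model_pieces_needed` are hypothetical names, not kernel APIs. The sketch does not say whether the block layer re-splits a requeued request; it only shows what lowering the limit on every LUN would do, and how many pieces an oversized request would have to become if it were re-split.

```c
#include <stddef.h>

/* Hypothetical per-LUN state: only the limit matters for this sketch. */
struct model_lun {
	unsigned int max_hw_sectors;
};

/*
 * Lower (never raise) each LUN's limit to the host-wide value.
 * Returns how many LUNs had their limit changed.
 */
static size_t model_clamp_all_luns(struct model_lun *luns, size_t n,
				   unsigned int host_max)
{
	size_t i, changed = 0;

	for (i = 0; i < n; i++) {
		if (luns[i].max_hw_sectors > host_max) {
			luns[i].max_hw_sectors = host_max;
			changed++;
		}
	}
	return changed;
}

/*
 * Number of pieces needed so that no piece exceeds 'limit' sectors.
 * Returns 0 for an empty request or a zero limit.
 */
static unsigned int model_pieces_needed(unsigned int nr_sectors,
					unsigned int limit)
{
	if (limit == 0)
		return 0;
	return nr_sectors / limit + (nr_sectors % limit != 0);
}
```

For example, a 2048-sector request against a new limit of 512 would need four pieces. If a requeued request were instead resent as-is, it would reach the LLD as a single 2048-sector command, which is the case Brian wants to avoid.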
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 44680f65ea14..01f2b38daab3 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1162,6 +1162,7 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
 	blist_flags_t bflags;
 	int res = SCSI_SCAN_NO_RESPONSE, result_len = 256;
 	struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
+	struct request_queue *q;
 
 	/*
 	 * The rescan flag is used as an optimization, the first scan of a
@@ -1182,6 +1183,10 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
 				*bflagsp = scsi_get_device_flags(sdev,
 								 sdev->vendor,
 								 sdev->model);
+			q = sdev->request_queue;
+			if (queue_max_hw_sectors(q) > shost->max_sectors)
+				blk_queue_max_hw_sectors(q, shost->max_sectors);
+
 			return SCSI_SCAN_LUN_PRESENT;
 		}
 		scsi_device_put(sdev);
@@ -2006,4 +2011,3 @@ void scsi_forget_host(struct Scsi_Host *shost)
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
-
This addresses an issue discovered on ibmvfc LUNs. For this driver,
max_sectors is negotiated with the VIOS. This gets done at initialization
time, then LUNs get scanned and things generally work fine. However,
this attribute can be changed on the VIOS, either due to a sysadmin
change or potentially a VIOS code level change. If this decreases
to a smaller value, due to one of these reasons, the next time the
ibmvfc driver performs an NPIV login, it will only be able to use
the smaller value. In the case of a VIOS reboot, when the VIOS goes
down, all paths through that VIOS will go to devloss state. When
the VIOS comes back up, ibmvfc negotiates max_sectors and will only
be able to get the smaller value and it will update shost->max_sectors.
However, when LUNs are scanned, the devloss paths will be found
and brought back online, still using the old max_hw_sectors. This
change ensures that max_hw_sectors gets updated.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_scan.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
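The condition the patch adds on the rescan path is a one-way clamp, which can be modeled in userspace as follows. This is a sketch, not the kernel code: `model_host`, `model_queue`, and `model_update_on_rescan` are hypothetical names, and the real patch operates on `struct request_queue` through `queue_max_hw_sectors()` and `blk_queue_max_hw_sectors()`.

```c
/* Hypothetical stand-ins for struct Scsi_Host and struct request_queue. */
struct model_host {
	unsigned int max_sectors;
};

struct model_queue {
	unsigned int max_hw_sectors;
};

/*
 * Mirror of the patch's condition: lower the queue limit when the host
 * limit has dropped below it, and otherwise leave the queue untouched.
 */
static void model_update_on_rescan(struct model_queue *q,
				   const struct model_host *shost)
{
	if (q->max_hw_sectors > shost->max_sectors)
		q->max_hw_sectors = shost->max_sectors;
}
```

In this model, a queue limit of 2048 becomes 512 after the host limit drops to 512, and it stays at 512 if the host limit later rises to 4096. That is the property Brian relies on in the thread when he says the patch only lowers the limit.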