Message ID | 1602732462-10443-4-git-send-email-muneendra.kumar@broadcom.com |
---|---|
State | New |
Headers | show |
Series | [v3,01/17] scsi: Added a new definition in scsi_cmnd.h | expand |
On 10/14/20 10:27 PM, Muneendra wrote: > Made an additional check in scsi_noretry_cmd to verify whether user has > decided not to do retries on abort success by setting the > SCMD_NORETRIES_ABORT bit > > If SCMD_NORETRIES_ABORT bit is set we are making sure there won't be any > retries done on the same path and also setting the host byte as > DID_TRANSPORT_MARGINAL so that the error can be propogated as recoverable > transport error to the blk layers. > > Added a code in scsi_result_to_blk_status to translate > a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t > i.e BLK_STS_TRANSPORT > > Signed-off-by: Muneendra <muneendra.kumar@broadcom.com> > > --- > v3: > Merged first part of the previous patch(v2 patch3) with > this patch. > > v2: > set the hostbyte as DID_TRANSPORT_MARGINAL instead of > DID_TRANSPORT_FAILFAST. > --- > drivers/scsi/scsi_error.c | 10 ++++++++++ > drivers/scsi/scsi_lib.c | 1 + > 2 files changed, 11 insertions(+) > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index ae80daa5d831..aa30c1c9e9db 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -1763,6 +1763,16 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd) > return 0; > > check_type: > + /* > + * Check whether caller has decided not to do retries on > + * abort success by setting the SCMD_NORETRIES_ABORT bit The comment seems wrong here because we are not setting that bit. > + */ > + if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) && > + (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) { > + set_host_byte(scmd, DID_TRANSPORT_MARGINAL); > + return 1; > + } > + > /* > * assume caller has checked sense and determined > * the check condition was retryable. > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index 2b5dea07498e..9606bad1542f 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -629,6 +629,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result) > return BLK_STS_OK; > return BLK_STS_IOERR; > case DID_TRANSPORT_FAILFAST: > + case DID_TRANSPORT_MARGINAL: > return BLK_STS_TRANSPORT; > case DID_TARGET_FAILURE: > set_host_byte(cmd, DID_OK); >
On 10/14/20 10:27 PM, Muneendra wrote: > check_type: > + /* > + * Check whether caller has decided not to do retries on > + * abort success by setting the SCMD_NORETRIES_ABORT bit > + */ > + if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) && > + (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) { > + set_host_byte(scmd, DID_TRANSPORT_MARGINAL); Hey, one other thing that might be confusing me is this check and the naming. The 0/17 email description and the SCMD_NORETRIES_ABORT name makes me think we want to only run this if the cmd timedout and we went through the SCSI EH TMF operations. However, I think this will end up failing other errors with DID_TRANSPORT_MARGINAL right? Did you want just the the SCSI EH timeout/abort case to hit this or any errors that hit this code path? > + return 1; > + } > +
Hi Michael, > check_type: > + /* > + * Check whether caller has decided not to do retries on > + * abort success by setting the SCMD_NORETRIES_ABORT bit > + */ > + if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) && > + (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) { > + set_host_byte(scmd, DID_TRANSPORT_MARGINAL); >Hey, one other thing that might be confusing me is this check and the >naming. The 0/17 email description and the SCMD_NORETRIES_ABORT name makes >me think we want to only run this if the cmd timedout and we went >through >the SCSI EH TMF operations. However, I think this will end up failing other >errors with DID_TRANSPORT_MARGINAL right? >Did you want just the the SCSI EH timeout/abort case to hit this or any >errors that hit this code path? [Muneendra]At present we want SCSI EH timeout/abort case to hit this. Regards, Muneendra.
On 10/19/20 2:26 PM, Muneendra Kumar M wrote: > Hi Michael, > >> check_type: >> + /* >> + * Check whether caller has decided not to do retries on >> + * abort success by setting the SCMD_NORETRIES_ABORT bit >> + */ >> + if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) && >> + (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) { >> + set_host_byte(scmd, DID_TRANSPORT_MARGINAL); > >> Hey, one other thing that might be confusing me is this check and the >> naming. The 0/17 email description and the SCMD_NORETRIES_ABORT name makes >> me think we want to only run this if the cmd timedout and we went >through >> the SCSI EH TMF operations. However, I think this will end up failing other >> errors with DID_TRANSPORT_MARGINAL right? > >> Did you want just the the SCSI EH timeout/abort case to hit this or any >> errors that hit this code path? > [Muneendra]At present we want SCSI EH timeout/abort case to hit this. What about adding a new eh callout that the transport classes can implement. They can check the port state at this time and decide if the cmd should be retried or failed. It would be similar to fc_eh_timed_out for example.
On 10/20/20 2:53 PM, Mike Christie wrote: > On 10/19/20 2:26 PM, Muneendra Kumar M wrote: >> Hi Michael, >> >>> check_type: >>> + /* >>> + * Check whether caller has decided not to do retries on >>> + * abort success by setting the SCMD_NORETRIES_ABORT bit >>> + */ >>> + if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) && >>> + (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) { >>> + set_host_byte(scmd, DID_TRANSPORT_MARGINAL); >> >>> Hey, one other thing that might be confusing me is this check and the >>> naming. The 0/17 email description and the SCMD_NORETRIES_ABORT name makes >>> me think we want to only run this if the cmd timedout and we went >through >>> the SCSI EH TMF operations. However, I think this will end up failing other >>> errors with DID_TRANSPORT_MARGINAL right? >> >>> Did you want just the the SCSI EH timeout/abort case to hit this or any >>> errors that hit this code path? >> [Muneendra]At present we want SCSI EH timeout/abort case to hit this. > > What about adding a new eh callout that the transport classes can implement. > They can check the port state at this time and decide if the cmd should be > retried or failed. It would be similar to fc_eh_timed_out for example. > Or, in the fc_eh_timed_out callout set the SCMD_NORETRIES_ABORT if port state is marginal and it indicates it wants to abort the cmd. In scsi_noretry_cmd above then you know we got there, because the cmd timedout and we tried/were going to abort it. We don't need the code to loop over running commands, the chkready changes, etc.
Hi Michael, Thanks for the input. >Or, in the fc_eh_timed_out callout set the SCMD_NORETRIES_ABORT if port >state is marginal and it indicates it wants to abort the cmd. In >scsi_noretry_cmd above then you know we got there, because the cmd timedout >and we >tried/were going to abort it. We don't need the code to loop over >running commands, the chkready changes, etc. Agreed. I will incorporate all your review comments. Regards, Muneendra.
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index ae80daa5d831..aa30c1c9e9db 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -1763,6 +1763,16 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd) return 0; check_type: + /* + * Check whether caller has decided not to do retries on + * abort success by setting the SCMD_NORETRIES_ABORT bit + */ + if ((test_bit(SCMD_NORETRIES_ABORT, &scmd->state)) && + (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT)) { + set_host_byte(scmd, DID_TRANSPORT_MARGINAL); + return 1; + } + /* * assume caller has checked sense and determined * the check condition was retryable. diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 2b5dea07498e..9606bad1542f 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -629,6 +629,7 @@ static blk_status_t scsi_result_to_blk_status(struct scsi_cmnd *cmd, int result) return BLK_STS_OK; return BLK_STS_IOERR; case DID_TRANSPORT_FAILFAST: + case DID_TRANSPORT_MARGINAL: return BLK_STS_TRANSPORT; case DID_TARGET_FAILURE: set_host_byte(cmd, DID_OK);
Made an additional check in scsi_noretry_cmd to verify whether user has decided not to do retries on abort success by setting the SCMD_NORETRIES_ABORT bit If SCMD_NORETRIES_ABORT bit is set we are making sure there won't be any retries done on the same path and also setting the host byte as DID_TRANSPORT_MARGINAL so that the error can be propogated as recoverable transport error to the blk layers. Added a code in scsi_result_to_blk_status to translate a new error DID_TRANSPORT_MARGINAL to the corresponding blk_status_t i.e BLK_STS_TRANSPORT Signed-off-by: Muneendra <muneendra.kumar@broadcom.com> --- v3: Merged first part of the previous patch(v2 patch3) with this patch. v2: set the hostbyte as DID_TRANSPORT_MARGINAL instead of DID_TRANSPORT_FAILFAST. --- drivers/scsi/scsi_error.c | 10 ++++++++++ drivers/scsi/scsi_lib.c | 1 + 2 files changed, 11 insertions(+)