[RESEND] scsi: megaraid_sas: make module parameter scmd_timeout writable

Message ID 20240408100505.1732370-1-lei.chen@smartx.com
State New
Series [RESEND] scsi: megaraid_sas: make module parameter scmd_timeout writable

Commit Message

Lei Chen April 8, 2024, 10:05 a.m. UTC
When an scmd times out, the block layer calls megasas_reset_timer to
make further decisions. scmd_timeout indicates when an scmd is really
timed-out. To make this process faster, we can decrease this value.
This patch allows users to change this value at runtime.

Signed-off-by: Lei Chen <lei.chen@smartx.com>
---
 drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

John Garry April 8, 2024, 12:30 p.m. UTC | #1
On 08/04/2024 11:05, Lei Chen wrote:
> When an scmd times out, block layer calls megasas_reset_timer to
> make further decisions. scmd_timeout indicates when an scmd is really
> timed-out.

What does really timed-out mean?

> If we want to make this process more fast, we can decrease
> this value. This patch allows users to change this value in run-time.
> 
> Signed-off-by: Lei Chen <lei.chen@smartx.com>
> ---
>   drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> index 3d4f13da1ae8..2a165e5dc7a3 100644
> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> @@ -91,7 +91,7 @@ module_param(dual_qdepth_disable, int, 0444);
>   MODULE_PARM_DESC(dual_qdepth_disable, "Disable dual queue depth feature. Default: 0");
>   
>   static unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
> -module_param(scmd_timeout, int, 0444);
> +module_param(scmd_timeout, int, 0644);
>   MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See megasas_reset_timer.");
>   
>   int perf_mode = -1;

I don't know why megaraid_sas has special handling here (and bypasses 
SCSI midlayer).

If the host is overloaded and you get a time-out as a command simply 
could not be handled in time, can you alternatively try reducing the 
scsi device queue depth?
Lei Chen April 9, 2024, 7:12 a.m. UTC | #2
Sorry for the non plain text format.

On Mon, Apr 8, 2024 at 8:30 PM John Garry <john.g.garry@oracle.com> wrote:
>
> On 08/04/2024 11:05, Lei Chen wrote:
> > When an scmd times out, block layer calls megasas_reset_timer to
> > make further decisions. scmd_timeout indicates when an scmd is really
> > timed-out.
>
> What does really timed-out mean?

scsi_times_out calls eh_timed_out, which in the megaraid driver is
megasas_reset_timer. megasas_reset_timer decides whether the scmd has
really timed out. If it has not, it returns BLK_EH_RESET_TIMER, which
tells the block layer to reset the timer and take no further action.
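
The decision described above can be modelled in plain userspace C. This is
a minimal sketch, not the driver code: the enum, the function names, and
the tick arguments are hypothetical stand-ins, and the factor of two on
scmd_timeout is an assumption based on a reading of megasas_reset_timer
that should be checked against the tree being patched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two outcomes of the timeout handler. */
enum eh_action {
	EH_RESET_TIMER,	/* not really timed out yet: re-arm the timer */
	EH_NOT_HANDLED	/* really timed out: let error handling proceed */
};

/* Wraparound-safe "a is later than b", in the spirit of time_after(). */
static bool tick_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/*
 * now, alloc: tick counters (jiffies-like); hz: ticks per second.
 * ASSUMPTION: the command counts as really timed out once it is older
 * than 2 * timeout_s seconds.
 */
static enum eh_action reset_timer_model(uint32_t now, uint32_t alloc,
					unsigned int timeout_s,
					unsigned int hz)
{
	uint32_t deadline;

	assert(hz > 0);
	deadline = alloc + timeout_s * 2u * hz;

	return tick_after(now, deadline) ? EH_NOT_HANDLED : EH_RESET_TIMER;
}
```

Under these assumptions, a command allocated at tick 0 with hz = 1 is still
re-armed at tick 100 when timeout_s is 90, but is handed to error handling
at the same tick when timeout_s is 10.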
>
>
> > If we want to make this process more fast, we can decrease
> > this value. This patch allows users to change this value in run-time.
> >
> > Signed-off-by: Lei Chen <lei.chen@smartx.com>
> > ---
> >   drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
> > index 3d4f13da1ae8..2a165e5dc7a3 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> > @@ -91,7 +91,7 @@ module_param(dual_qdepth_disable, int, 0444);
> >   MODULE_PARM_DESC(dual_qdepth_disable, "Disable dual queue depth feature. Default: 0");
> >
> >   static unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
> > -module_param(scmd_timeout, int, 0444);
> > +module_param(scmd_timeout, int, 0644);
> >   MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See megasas_reset_timer.");
> >
> >   int perf_mode = -1;
>
> I don't know why megaraid_sas has special handling here (and bypasses
> SCSI midlayer).
>
> If the host is overloaded and you get a time-out as a command simply
> could not be handled in time, can you alternatively try reducing the
> scsi device queue depth?


Yes, the SCSI layer and drivers already have ways to control the queue
depth. The megaraid driver throttles the queue depth in
megasas_reset_timer. But since SCSI disks on the same megaraid card
share the queue depth, that affects the other disks too.
In most cases a SCSI disk is more likely to be faulty than the RAID
card, and a faulty disk makes scmds fail and be retried.
We want to adjust scmd_timeout without reloading the driver so that
scmds sent to abnormal SCSI disks complete faster.
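
For illustration, adjusting the value without a driver reload could look
like the following userspace helper. It is a sketch under stated
assumptions: the helper name and the range macros are hypothetical, and the
10-90 second check mirrors the MODULE_PARM_DESC text because a plain int
module parameter is not expected to range-check sysfs writes by itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Range taken from the MODULE_PARM_DESC text: "(10-90s)". */
#define SCMD_TIMEOUT_MIN 10u
#define SCMD_TIMEOUT_MAX 90u

/*
 * Write a new timeout (in seconds) to a module parameter file.
 * Returns 0 on success, -EINVAL if the value is out of range,
 * or a negative errno value on I/O failure.
 */
static int write_scmd_timeout(const char *path, unsigned int seconds)
{
	FILE *f;
	int rc = 0;

	assert(path != NULL);

	if (seconds < SCMD_TIMEOUT_MIN || seconds > SCMD_TIMEOUT_MAX)
		return -EINVAL;

	f = fopen(path, "w");
	if (!f)
		return -errno;

	if (fprintf(f, "%u\n", seconds) < 0)
		rc = -EIO;
	/* Buffered data is flushed on close, so check its result too. */
	if (fclose(f) != 0 && rc == 0)
		rc = -errno;

	return rc;
}
```

On a kernel carrying this patch, root would call it with the conventional
sysfs location of the parameter,
/sys/module/megaraid_sas/parameters/scmd_timeout. With the parameter still
at 0444 the open for writing is expected to fail.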
Patch

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 3d4f13da1ae8..2a165e5dc7a3 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -91,7 +91,7 @@  module_param(dual_qdepth_disable, int, 0444);
 MODULE_PARM_DESC(dual_qdepth_disable, "Disable dual queue depth feature. Default: 0");
 
 static unsigned int scmd_timeout = MEGASAS_DEFAULT_CMD_TIMEOUT;
-module_param(scmd_timeout, int, 0444);
+module_param(scmd_timeout, int, 0644);
 MODULE_PARM_DESC(scmd_timeout, "scsi command timeout (10-90s), default 90s. See megasas_reset_timer.");
 
 int perf_mode = -1;