Message ID | 20230922092906.2645265-1-haowenchao2@huawei.com |
---|---|
Headers | show |
Series | scsi:scsi_debug: Add error injection for single device | expand |
Doug, > The original error injection mechanism was based on scsi_host which > could not inject fault for a single SCSI device. > > This patchset provides the ability to inject errors for a single SCSI > device. Now we supports inject timeout errors, queuecommand errors, > and hostbyte, driverbyte, statusbyte, and sense data for specific SCSI > Command. Two new error injection is defined to make abort command or > reset LUN failed. Please review patches 7 through 10. Thank you!
On 2023-09-22 05:28, Wenchao Hao wrote: > Create directory scsi_debug in the root of the debugfs filesystem. > Prepare to add interface for manage error injection. > > Acked-by: Douglas Gilbert <dgilbert@interlog.com> > Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> > --- > drivers/scsi/scsi_debug.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c > index 9c0af50501f9..35c336271b13 100644 > --- a/drivers/scsi/scsi_debug.c > +++ b/drivers/scsi/scsi_debug.c > @@ -41,6 +41,7 @@ > #include <linux/random.h> > #include <linux/xarray.h> > #include <linux/prefetch.h> > +#include <linux/debugfs.h> > > #include <net/checksum.h> > > @@ -862,6 +863,8 @@ static const int device_qfull_result = > > static const int condition_met_result = SAM_STAT_CONDITION_MET; > > +static struct dentry *sdebug_debugfs_root; > + > > /* Only do the extra work involved in logical block provisioning if one or > * more of the lbpu, lbpws or lbpws10 parameters are given and we are doing > @@ -7011,6 +7014,8 @@ static int __init scsi_debug_init(void) > goto driver_unreg; > } > > + sdebug_debugfs_root = debugfs_create_dir("scsi_debug", NULL); debugfs_create_dir() can fail and return NULL. Looking at other drivers, most seem to assume it will work. Since the scsi_debug driver is often used to test abnormal situations, perhaps adding something like: if (!sdebug_debugfs_root) pr_info("%s: failed to create initial debugfs directory\n", __func__); might save someone a bit of time if a NULL dereference on sdebug_debugfs_root follows later. That is what the mpt3sas driver does. Doug Gilbert > + > for (k = 0; k < hosts_to_add; k++) { > if (want_store && k == 0) { > ret = sdebug_add_host_helper(idx); > @@ -7057,6 +7062,7 @@ static void __exit scsi_debug_exit(void) > > sdebug_erase_all_stores(false); > xa_destroy(per_store_ap); > + debugfs_remove(sdebug_debugfs_root); > } > > device_initcall(scsi_debug_init);
On 2023/9/28 9:13, Douglas Gilbert wrote: > On 2023-09-22 05:28, Wenchao Hao wrote: >> Create directory scsi_debug in the root of the debugfs filesystem. >> Prepare to add interface for manage error injection. >> >> Acked-by: Douglas Gilbert <dgilbert@interlog.com> >> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> >> --- >> drivers/scsi/scsi_debug.c | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c >> index 9c0af50501f9..35c336271b13 100644 >> --- a/drivers/scsi/scsi_debug.c >> +++ b/drivers/scsi/scsi_debug.c >> @@ -41,6 +41,7 @@ >> #include <linux/random.h> >> #include <linux/xarray.h> >> #include <linux/prefetch.h> >> +#include <linux/debugfs.h> >> #include <net/checksum.h> >> @@ -862,6 +863,8 @@ static const int device_qfull_result = >> static const int condition_met_result = SAM_STAT_CONDITION_MET; >> +static struct dentry *sdebug_debugfs_root; >> + >> /* Only do the extra work involved in logical block provisioning if one or >> * more of the lbpu, lbpws or lbpws10 parameters are given and we are doing >> @@ -7011,6 +7014,8 @@ static int __init scsi_debug_init(void) >> goto driver_unreg; >> } >> + sdebug_debugfs_root = debugfs_create_dir("scsi_debug", NULL); > > debugfs_create_dir() can fail and return NULL. Looking at other drivers, most > seem to assume it will work. Since the scsi_debug driver is often used to test > abnormal situations, perhaps adding something like: > if (!sdebug_debugfs_root) > pr_info("%s: failed to create initial debugfs directory\n", __func__); > > might save someone a bit of time if a NULL dereference on sdebug_debugfs_root > follows later. That is what the mpt3sas driver does. > Yes, I would fix it by checking return value of debugfs related call after your review suggestions for other patches. > Doug Gilbert > >> + >> for (k = 0; k < hosts_to_add; k++) { >> if (want_store && k == 0) { >> ret = sdebug_add_host_helper(idx); >> @@ -7057,6 +7062,7 @@ static void __exit scsi_debug_exit(void) >> sdebug_erase_all_stores(false); >> xa_destroy(per_store_ap); >> + debugfs_remove(sdebug_debugfs_root); >> } >> device_initcall(scsi_debug_init); > >
On 2023-09-22 05:29, Wenchao Hao wrote: > Add error injection type 3 to make scsi_debug_abort() return FAILED. > Fail abort command foramt: s/foramt/format/ > > +--------+------+-------------------------------------------------------+ > | Column | Type | Description | > +--------+------+-------------------------------------------------------+ > | 1 | u8 | Error type, fixed to 0x3 | > +--------+------+-------------------------------------------------------+ > | 2 | s32 | Error count | > | | | 0: this rule will be ignored | > | | | positive: the rule will always take effect | > | | | negative: the rule takes effect n times where -n is | > | | | the value given. Ignored after n times | > +--------+------+-------------------------------------------------------+ > | 3 | x8 | SCSI command opcode, 0xff for all commands | > +--------+------+-------------------------------------------------------+ > > Examples: > error=/sys/kernel/debug/scsi_debug/0:0:0:1/error > echo "0 -10 0x12" > ${error} > will make the device return FAILED when abort inquiry command 10 times. Tested with: # sg_raw -t 10 -r 1k /dev/sg1 12 00 00 00 60 00 After 10 seconds (the timeout specified to sg_raw) I saw this: >>> transport error: Host_status=0x03 [DID_TIME_OUT] And the # cat /sys/kernel/debug/scsi_debug/1\:0\:0\:0/error Count value changed from -10 up to 0 for each invocation of the INQUIRY command. Thereafter the INQUIRY command worked. Looks good. Tested-by: Douglas Gilbert <dgilbert@interlog.com> > > Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> > --- > drivers/scsi/scsi_debug.c | 40 +++++++++++++++++++++++++++++++++++++++ > 1 file changed, 40 insertions(+) > > diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c > index fe1f7421f617..8a16cb9642a6 100644 > --- a/drivers/scsi/scsi_debug.c > +++ b/drivers/scsi/scsi_debug.c > @@ -293,6 +293,8 @@ enum sdebug_err_type { > ERR_FAIL_CMD = 2, /* make specific scsi command's */ > /* queuecmd return succeed but */ > /* with errors set in scsi_cmnd */ > + ERR_ABORT_CMD_FAILED = 3, /* control return FAILED from */ > + /* scsi_debug_abort() */ > }; > > struct sdebug_err_inject { > @@ -970,6 +972,7 @@ static int sdebug_error_show(struct seq_file *m, void *p) > list_for_each_entry_rcu(err, &devip->inject_err_list, list) { > switch (err->type) { > case ERR_TMOUT_CMD: > + case ERR_ABORT_CMD_FAILED: > seq_printf(m, "%d\t%d\t0x%x\n", err->type, err->cnt, > err->cmd); > break; > @@ -1031,6 +1034,7 @@ static ssize_t sdebug_error_write(struct file *file, const char __user *ubuf, > > switch (inject_type) { > case ERR_TMOUT_CMD: > + case ERR_ABORT_CMD_FAILED: > if (sscanf(buf, "%d %d %hhx", &inject->type, &inject->cnt, > &inject->cmd) != 3) > goto out_error; > @@ -5504,9 +5508,39 @@ static void stop_all_queued(void) > mutex_unlock(&sdebug_host_list_mutex); > } > > +static int sdebug_fail_abort(struct scsi_cmnd *cmnd) > +{ > + struct scsi_device *sdp = cmnd->device; > + struct sdebug_dev_info *devip = (struct sdebug_dev_info *)sdp->hostdata; > + struct sdebug_err_inject *err; > + unsigned char *cmd = cmnd->cmnd; > + int ret = 0; > + > + if (devip == NULL) > + return 0; > + > + rcu_read_lock(); > + list_for_each_entry_rcu(err, &devip->inject_err_list, list) { > + if (err->type == ERR_ABORT_CMD_FAILED && > + (err->cmd == cmd[0] || err->cmd == 0xff)) { > + ret = !!err->cnt; > + if (err->cnt < 0) > + err->cnt++; > + > + rcu_read_unlock(); > + return ret; > + } > + } > + rcu_read_unlock(); > + > + return 0; > +} > + > static int scsi_debug_abort(struct scsi_cmnd *SCpnt) > { > bool ok = scsi_debug_abort_cmnd(SCpnt); > + u8 *cmd = SCpnt->cmnd; > + u8 opcode = cmd[0]; > > ++num_aborts; > > @@ -5515,6 +5549,12 @@ static int scsi_debug_abort(struct scsi_cmnd *SCpnt) > "%s: command%s found\n", __func__, > ok ? "" : " not"); > > + if (sdebug_fail_abort(SCpnt)) { > + scmd_printk(KERN_INFO, SCpnt, "fail abort command 0x%x\n", > + opcode); > + return FAILED; > + } > + > return SUCCESS; > } >
On 2023-09-22 05:29, Wenchao Hao wrote: > The interface is found at > /sys/kernel/debug/scsi_debug/target<h:c:t>/fail_reset where <h:c:t> > identifies the target to inject errors on. It's a simple bool type > interface which would make this target's reset fail if set to 'Y'. > > Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> > Reported-by: kernel test robot <oliver.sang@intel.com> Tested by setting 'echo 1 > /sys/bus/pseudo/drivers/scsi_debug/opts' and observing 'tail -f /var/log/syslog'. Looks good including that fail_reset is readable so its current state can be checked. Tested-by: Douglas Gilbert <dgilbert@interlog.com> <snip>