Message ID | 0351ea8bb5d7b223778367867ca3791cc6ee608d.1680083571.git.quic_nguyenb@quicinc.com |
---|---|
State | Superseded |
Headers | show |
Series | ufs: core: mcq: Add ufshcd_abort() and error handler support in MCQ mode | expand |
Hi Bao D., We have done some test based on your RFC v3 patches and an issue is reported. kworker/u16:4: BUG: scheduling while atomic: kworker/u16:4/5736/0x00000002 kworker/u16:4: [name:core&]Preemption disabled at: kworker/u16:4: [<ffffffef97e33024>] ufshcd_mcq_sq_cleanup+0x9c/0x27c kworker/u16:4: CPU: 2 PID: 5736 Comm: kworker/u16:4 Tainted: G S W OE kworker/u16:4: Workqueue: ufs_eh_wq_0 ufshcd_err_handler kworker/u16:4: Call trace: kworker/u16:4: dump_backtrace+0x108/0x15c kworker/u16:4: show_stack+0x20/0x30 kworker/u16:4: dump_stack_lvl+0x6c/0x8c kworker/u16:4: dump_stack+0x20/0x44 kworker/u16:4: __schedule_bug+0xd4/0x100 kworker/u16:4: __schedule+0x660/0xa5c kworker/u16:4: schedule+0x80/0xec kworker/u16:4: schedule_hrtimeout_range_clock+0xa0/0x140 kworker/u16:4: schedule_hrtimeout_range+0x1c/0x30 kworker/u16:4: usleep_range_state+0x88/0xd8 kworker/u16:4: ufshcd_mcq_sq_cleanup+0x170/0x27c kworker/u16:4: ufshcd_clear_cmds+0x78/0x184 kworker/u16:4: ufshcd_wait_for_dev_cmd+0x234/0x348 kworker/u16:4: ufshcd_exec_dev_cmd+0x220/0x298 kworker/u16:4: ufshcd_verify_dev_init+0x68/0x124 kworker/u16:4: ufshcd_probe_hba+0x390/0x9bc kworker/u16:4: ufshcd_host_reset_and_restore+0x74/0x158 kworker/u16:4: ufshcd_reset_and_restore+0x70/0x31c kworker/u16:4: ufshcd_err_handler+0xad4/0xe58 kworker/u16:4: process_one_work+0x214/0x5b8 kworker/u16:4: worker_thread+0x2d4/0x448 kworker/u16:4: kthread+0x110/0x1e0 kworker/u16:4: ret_from_fork+0x10/0x20 kworker/u16:4: ------------[ cut here ]------------ On Wed, 2023-03-29 at 03:01 -0700, Bao D. Nguyen wrote: > +/** > + * ufshcd_mcq_sq_cleanup - Clean up Submission Queue resources > + * associated with the pending command. > + * @hba - per adapter instance. > + * @task_tag - The command's task tag. > + * @result - Result of the Clean up operation. > + * > + * Returns 0 and result on completion. Returns error code if > + * the operation fails. > + */ > +int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag, int > *result) > +{ > + struct ufshcd_lrb *lrbp = &hba->lrb[task_tag]; > + struct scsi_cmnd *cmd = lrbp->cmd; > + struct ufs_hw_queue *hwq; > + void __iomem *reg, *opr_sqd_base; > + u32 nexus, i, val; > + int err; > + > + if (task_tag != hba->nutrs - UFSHCD_NUM_RESERVED) { > + if (!cmd) > + return FAILED; > + hwq = ufshcd_mcq_req_to_hwq(hba, > scsi_cmd_to_rq(cmd)); > + } else { > + hwq = hba->dev_cmd_queue; > + } > + > + i = hwq->id; > + > + spin_lock(&hwq->sq_lock); As spin_lock() disable preemption > + > + /* stop the SQ fetching before working on it */ > + err = ufshcd_mcq_sq_stop(hba, hwq); > + if (err) > + goto unlock; > + > + /* SQCTI = EXT_IID, IID, LUN, Task Tag */ > + nexus = lrbp->lun << 8 | task_tag; > + opr_sqd_base = mcq_opr_base(hba, OPR_SQD, i); > + writel(nexus, opr_sqd_base + REG_SQCTI); > + > + /* SQRTCy.ICU = 1 */ > + writel(SQ_ICU, opr_sqd_base + REG_SQRTC); > + > + /* Poll SQRTSy.CUS = 1. Return result from SQRTSy.RTC */ > + reg = opr_sqd_base + REG_SQRTS; > + err = read_poll_timeout(readl, val, val & SQ_CUS, 20, > + MCQ_POLL_US, false, reg); read_poll_timeout() was ufshcd_mcq_poll_register() in last patch, right? ufshcd_mcq_poll_register() calls usleep_range() causing KE as reported above. Same issue seems to still exist as read_poll_timeout() sleeps. Skipping ufshcd_mcq_sq_cleanup() by returning FAILED directly to trigger reset in ufshcd error handler successfully recover host. BTW, is there maybe a change list between RFC v3 and this v1 patch? :) Thanks Po-Wen
On 4/11/2023 6:14 AM, Powen Kao (高伯文) wrote: > Hi Bao D., > > We have done some test based on your RFC v3 patches and an issue is > reported. Hi Powen, Thank you very much for catching this issue. It seems that I cannot use read_poll_timeout() because it sleeps while holding the spin_lock(). I will make the change to poll the registers in a tight loop with udelay(20) polling interval in the next revision. Thanks, Bao > > kworker/u16:4: BUG: scheduling while atomic: > kworker/u16:4/5736/0x00000002 > kworker/u16:4: [name:core&]Preemption disabled at: > kworker/u16:4: [<ffffffef97e33024>] ufshcd_mcq_sq_cleanup+0x9c/0x27c > kworker/u16:4: CPU: 2 PID: 5736 Comm: kworker/u16:4 Tainted: G > S W OE > kworker/u16:4: Workqueue: ufs_eh_wq_0 ufshcd_err_handler > kworker/u16:4: Call trace: > kworker/u16:4: dump_backtrace+0x108/0x15c > kworker/u16:4: show_stack+0x20/0x30 > kworker/u16:4: dump_stack_lvl+0x6c/0x8c > kworker/u16:4: dump_stack+0x20/0x44 > kworker/u16:4: __schedule_bug+0xd4/0x100 > kworker/u16:4: __schedule+0x660/0xa5c > kworker/u16:4: schedule+0x80/0xec > kworker/u16:4: schedule_hrtimeout_range_clock+0xa0/0x140 > kworker/u16:4: schedule_hrtimeout_range+0x1c/0x30 > kworker/u16:4: usleep_range_state+0x88/0xd8 > kworker/u16:4: ufshcd_mcq_sq_cleanup+0x170/0x27c > kworker/u16:4: ufshcd_clear_cmds+0x78/0x184 > kworker/u16:4: ufshcd_wait_for_dev_cmd+0x234/0x348 > kworker/u16:4: ufshcd_exec_dev_cmd+0x220/0x298 > kworker/u16:4: ufshcd_verify_dev_init+0x68/0x124 > kworker/u16:4: ufshcd_probe_hba+0x390/0x9bc > kworker/u16:4: ufshcd_host_reset_and_restore+0x74/0x158 > kworker/u16:4: ufshcd_reset_and_restore+0x70/0x31c > kworker/u16:4: ufshcd_err_handler+0xad4/0xe58 > kworker/u16:4: process_one_work+0x214/0x5b8 > kworker/u16:4: worker_thread+0x2d4/0x448 > kworker/u16:4: kthread+0x110/0x1e0 > kworker/u16:4: ret_from_fork+0x10/0x20 > kworker/u16:4: ------------[ cut here ]------------ > > > On Wed, 2023-03-29 at 03:01 -0700, Bao D. Nguyen wrote: > >> +/** >> + * ufshcd_mcq_sq_cleanup - Clean up Submission Queue resources >> + * associated with the pending command. >> + * @hba - per adapter instance. >> + * @task_tag - The command's task tag. >> + * @result - Result of the Clean up operation. >> + * >> + * Returns 0 and result on completion. Returns error code if >> + * the operation fails. >> + */ >> +int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag, int >> *result) >> +{ >> + struct ufshcd_lrb *lrbp = &hba->lrb[task_tag]; >> + struct scsi_cmnd *cmd = lrbp->cmd; >> + struct ufs_hw_queue *hwq; >> + void __iomem *reg, *opr_sqd_base; >> + u32 nexus, i, val; >> + int err; >> + >> + if (task_tag != hba->nutrs - UFSHCD_NUM_RESERVED) { >> + if (!cmd) >> + return FAILED; >> + hwq = ufshcd_mcq_req_to_hwq(hba, >> scsi_cmd_to_rq(cmd)); >> + } else { >> + hwq = hba->dev_cmd_queue; >> + } >> + >> + i = hwq->id; >> + >> + spin_lock(&hwq->sq_lock); > As spin_lock() disable preemption > >> + >> + /* stop the SQ fetching before working on it */ >> + err = ufshcd_mcq_sq_stop(hba, hwq); >> + if (err) >> + goto unlock; >> + >> + /* SQCTI = EXT_IID, IID, LUN, Task Tag */ >> + nexus = lrbp->lun << 8 | task_tag; >> + opr_sqd_base = mcq_opr_base(hba, OPR_SQD, i); >> + writel(nexus, opr_sqd_base + REG_SQCTI); >> + >> + /* SQRTCy.ICU = 1 */ >> + writel(SQ_ICU, opr_sqd_base + REG_SQRTC); >> + >> + /* Poll SQRTSy.CUS = 1. Return result from SQRTSy.RTC */ >> + reg = opr_sqd_base + REG_SQRTS; >> + err = read_poll_timeout(readl, val, val & SQ_CUS, 20, >> + MCQ_POLL_US, false, reg); > read_poll_timeout() was ufshcd_mcq_poll_register() in last patch, > right? ufshcd_mcq_poll_register() calls usleep_range() causing KE as > reported above. Same issue seems to still exist as read_poll_timeout() > sleeps. > > Skipping ufshcd_mcq_sq_cleanup() by returning FAILED directly to > trigger reset in ufshcd error handler successfully recover host. > > BTW, is there maybe a change list between RFC v3 and this v1 patch? :) > Thanks > > Po-Wen > >
diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c index 31df052..6f6b22c 100644 --- a/drivers/ufs/core/ufs-mcq.c +++ b/drivers/ufs/core/ufs-mcq.c @@ -12,6 +12,10 @@ #include <linux/module.h> #include <linux/platform_device.h> #include "ufshcd-priv.h" +#include <linux/delay.h> +#include <scsi/scsi_cmnd.h> +#include <linux/bitfield.h> +#include <linux/iopoll.h> #define MAX_QUEUE_SUP GENMASK(7, 0) #define UFS_MCQ_MIN_RW_QUEUES 2 @@ -27,6 +31,9 @@ #define MCQ_ENTRY_SIZE_IN_DWORD 8 #define CQE_UCD_BA GENMASK_ULL(63, 7) +/* Max mcq register polling time in micro second unit */ +#define MCQ_POLL_US 500000 + static int rw_queue_count_set(const char *val, const struct kernel_param *kp) { return param_set_uint_minmax(val, kp, UFS_MCQ_MIN_RW_QUEUES, @@ -429,3 +436,170 @@ int ufshcd_mcq_init(struct ufs_hba *hba) host->host_tagset = 1; return 0; } + +static int ufshcd_mcq_sq_stop(struct ufs_hba *hba, struct ufs_hw_queue *hwq) +{ + void __iomem *reg; + u32 i = hwq->id, val; + int err; + + writel(SQ_STOP, mcq_opr_base(hba, OPR_SQD, i) + REG_SQRTC); + reg = mcq_opr_base(hba, OPR_SQD, i) + REG_SQRTS; + err = read_poll_timeout(readl, val, val & SQ_STS, 20, + MCQ_POLL_US, false, reg); + if (err) + dev_err(hba->dev, "%s: failed. hwq-id=%d, err=%d\n", + __func__, i, err); + return err; +} + +static int ufshcd_mcq_sq_start(struct ufs_hba *hba, struct ufs_hw_queue *hwq) +{ + void __iomem *reg; + u32 i = hwq->id, val; + int err; + + writel(SQ_START, mcq_opr_base(hba, OPR_SQD, i) + REG_SQRTC); + reg = mcq_opr_base(hba, OPR_SQD, i) + REG_SQRTS; + err = read_poll_timeout(readl, val, !(val & SQ_STS), 20, + MCQ_POLL_US, false, reg); + if (err) + dev_err(hba->dev, "%s: failed. hwq-id=%d, err=%d\n", + __func__, i, err); + return err; +} + +/** + * ufshcd_mcq_sq_cleanup - Clean up Submission Queue resources + * associated with the pending command. + * @hba - per adapter instance. + * @task_tag - The command's task tag. + * @result - Result of the Clean up operation. + * + * Returns 0 and result on completion. Returns error code if + * the operation fails. + */ +int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag, int *result) +{ + struct ufshcd_lrb *lrbp = &hba->lrb[task_tag]; + struct scsi_cmnd *cmd = lrbp->cmd; + struct ufs_hw_queue *hwq; + void __iomem *reg, *opr_sqd_base; + u32 nexus, i, val; + int err; + + if (task_tag != hba->nutrs - UFSHCD_NUM_RESERVED) { + if (!cmd) + return FAILED; + hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(cmd)); + } else { + hwq = hba->dev_cmd_queue; + } + + i = hwq->id; + + spin_lock(&hwq->sq_lock); + + /* stop the SQ fetching before working on it */ + err = ufshcd_mcq_sq_stop(hba, hwq); + if (err) + goto unlock; + + /* SQCTI = EXT_IID, IID, LUN, Task Tag */ + nexus = lrbp->lun << 8 | task_tag; + opr_sqd_base = mcq_opr_base(hba, OPR_SQD, i); + writel(nexus, opr_sqd_base + REG_SQCTI); + + /* SQRTCy.ICU = 1 */ + writel(SQ_ICU, opr_sqd_base + REG_SQRTC); + + /* Poll SQRTSy.CUS = 1. Return result from SQRTSy.RTC */ + reg = opr_sqd_base + REG_SQRTS; + err = read_poll_timeout(readl, val, val & SQ_CUS, 20, + MCQ_POLL_US, false, reg); + if (err) + dev_err(hba->dev, "%s: failed. hwq=%d, lun=0x%x, tag=%d\n", + __func__, i, lrbp->lun, task_tag); + + *result = FIELD_GET(SQ_ICU_ERR_CODE_MASK, readl(reg)); + + if (ufshcd_mcq_sq_start(hba, hwq)) + err = FAILED; + +unlock: + spin_unlock(&hwq->sq_lock); + return err; +} + +static u64 ufshcd_mcq_get_cmd_desc_addr(struct ufs_hba *hba, + int task_tag) +{ + struct ufshcd_lrb *lrbp = &hba->lrb[task_tag]; + __le32 hi = lrbp->utr_descriptor_ptr->command_desc_base_addr_hi; + __le32 lo = lrbp->utr_descriptor_ptr->command_desc_base_addr_lo; + + return le64_to_cpu((__le64)hi << 32 | lo); +} + +/** + * ufshcd_mcq_nullify_cmd - Nullify utrd. Host controller does not fetch + * transfer with Command Type = 0xF. post the Completion Queue with OCS=ABORTED. + * @hba - per adapter instance. + * @hwq - Hardware Queue of the nullified utrd. + */ +static void ufshcd_mcq_nullify_cmd(struct ufs_hba *hba, struct ufs_hw_queue *hwq) +{ + struct utp_transfer_req_desc *utrd; + u32 dword_0; + + utrd = (struct utp_transfer_req_desc *)(hwq->sqe_base_addr + + hwq->id * sizeof(struct utp_transfer_req_desc)); + dword_0 = le32_to_cpu(utrd->header.dword_0); + dword_0 &= ~UPIU_COMMAND_TYPE_MASK; + dword_0 |= FIELD_PREP(UPIU_COMMAND_TYPE_MASK, 0xF); + utrd->header.dword_0 = cpu_to_le32(dword_0); +} + +/** + * ufshcd_mcq_sqe_search - Search for the cmd in the Submission Queue (SQ) + * @hba - per adapter instance. + * @hwq - Hardware Queue to be searched. + * @task_tag - The command's task tag. + * + * Returns true if the SQE containing the command is present in the SQ + * (not fetched by the controller); returns false if the SQE is not in the SQ. + */ +static bool ufshcd_mcq_sqe_search(struct ufs_hba *hba, + struct ufs_hw_queue *hwq, int task_tag) +{ + struct utp_transfer_req_desc *utrd; + u32 mask = hwq->max_entries - 1; + bool ret = false; + u64 addr, match; + u32 sq_head_slot; + + spin_lock(&hwq->sq_lock); + + ufshcd_mcq_sq_stop(hba, hwq); + sq_head_slot = ufshcd_mcq_get_sq_head_slot(hwq); + if (sq_head_slot == hwq->sq_tail_slot) + goto out; + + addr = ufshcd_mcq_get_cmd_desc_addr(hba, task_tag); + while (sq_head_slot != hwq->sq_tail_slot) { + utrd = (struct utp_transfer_req_desc *)(hwq->sqe_base_addr + + sq_head_slot * sizeof(struct utp_transfer_req_desc)); + match = le64_to_cpu((__le64)utrd->command_desc_base_addr_hi << 32 | + utrd->command_desc_base_addr_lo) & CQE_UCD_BA; + if (addr == match) { + ret = true; + goto out; + } + sq_head_slot = (sq_head_slot + 1) & mask; + } + +out: + ufshcd_mcq_sq_start(hba, hwq); + spin_unlock(&hwq->sq_lock); + return ret; +} diff --git a/drivers/ufs/core/ufshcd-priv.h b/drivers/ufs/core/ufshcd-priv.h index 529f850..1a40cf7 100644 --- a/drivers/ufs/core/ufshcd-priv.h +++ b/drivers/ufs/core/ufshcd-priv.h @@ -78,6 +78,8 @@ struct ufs_hw_queue *ufshcd_mcq_req_to_hwq(struct ufs_hba *hba, unsigned long ufshcd_mcq_poll_cqe_lock(struct ufs_hba *hba, struct ufs_hw_queue *hwq); +int ufshcd_mcq_sq_cleanup(struct ufs_hba *hba, int task_tag, int *result); + #define UFSHCD_MCQ_IO_QUEUE_OFFSET 1 #define SD_ASCII_STD true #define SD_RAW false @@ -403,4 +405,12 @@ static inline struct cq_entry *ufshcd_mcq_cur_cqe(struct ufs_hw_queue *q) return cqe + q->cq_head_slot; } + +static inline u32 ufshcd_mcq_get_sq_head_slot(struct ufs_hw_queue *q) +{ + u32 val = readl(q->mcq_sq_head); + + return val / sizeof(struct utp_transfer_req_desc); +} + #endif /* _UFSHCD_PRIV_H_ */ diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index 35a3bd9..808387c 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -56,7 +56,6 @@ #define NOP_OUT_RETRIES 10 /* Timeout after 50 msecs if NOP OUT hangs without response */ #define NOP_OUT_TIMEOUT 50 /* msecs */ - /* Query request retries */ #define QUERY_REQ_RETRIES 3 /* Query request timeout */ @@ -173,7 +172,6 @@ EXPORT_SYMBOL_GPL(ufshcd_dump_regs); enum { UFSHCD_MAX_CHANNEL = 0, UFSHCD_MAX_ID = 1, - UFSHCD_NUM_RESERVED = 1, UFSHCD_CMD_PER_LUN = 32 - UFSHCD_NUM_RESERVED, UFSHCD_CAN_QUEUE = 32 - UFSHCD_NUM_RESERVED, }; diff --git a/include/ufs/ufshci.h b/include/ufs/ufshci.h index 11424bb..ceb8664 100644 --- a/include/ufs/ufshci.h +++ b/include/ufs/ufshci.h @@ -99,6 +99,9 @@ enum { enum { REG_SQHP = 0x0, REG_SQTP = 0x4, + REG_SQRTC = 0x8, + REG_SQCTI = 0xC, + REG_SQRTS = 0x10, }; enum { @@ -111,12 +114,26 @@ enum { REG_CQIE = 0x4, }; +enum { + SQ_START = 0x0, + SQ_STOP = 0x1, + SQ_ICU = 0x2, +}; + +enum { + SQ_STS = 0x1, + SQ_CUS = 0x2, +}; + +#define SQ_ICU_ERR_CODE_MASK GENMASK(7, 4) +#define UPIU_COMMAND_TYPE_MASK GENMASK(31, 28) #define UFS_MASK(mask, offset) ((mask) << (offset)) /* UFS Version 08h */ #define MINOR_VERSION_NUM_MASK UFS_MASK(0xFFFF, 0) #define MAJOR_VERSION_NUM_MASK UFS_MASK(0xFFFF, 16) +#define UFSHCD_NUM_RESERVED 1 /* * Controller UFSHCI version * - 2.x and newer use the following scheme:
Add supporting functions to handle ufs abort in mcq mode. Signed-off-by: Bao D. Nguyen <quic_nguyenb@quicinc.com> --- drivers/ufs/core/ufs-mcq.c | 174 +++++++++++++++++++++++++++++++++++++++++ drivers/ufs/core/ufshcd-priv.h | 10 +++ drivers/ufs/core/ufshcd.c | 2 - include/ufs/ufshci.h | 17 ++++ 4 files changed, 201 insertions(+), 2 deletions(-)