Message ID | 20240925095546.19492-1-peter.wang@mediatek.com |
---|---|
Headers | show |
Series | fix abort defect | expand |
On Wed, 2024-09-25 at 09:49 -0700, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 9/25/24 2:55 AM, peter.wang@mediatek.com wrote: > > case OCS_INVALID_COMMAND_STATUS: > > result |= DID_REQUEUE << 16; > > +dev_warn(hba->dev, > > +"OCS invaild from controller for tag %d\n", > > +lrbp->task_tag); > > Please change "invaild" into "invalid". Once that change has been > made, > feel free to add: > > Reviewed-by: Bart Van Assche <bvanassche@acm.org> Hi Bart, Sorry, this typo wasn't corrected. However, I still feel that this patch is not quite reasonable. The reason is that in the first patch, "ufs: core: fix the issue of ICU failure" we corrected the ICU problem, allowing the SQ to clean up correctly, while the CQ will have a corresponding response. But the second patch directly ignores the CQ's response and continues to use a workaround to release the command right after ufshcd_try_to_abort_task. The Legacy SDB mode's approach would not release the command after ufshcd_try_to_abort_task. Instead, it releases after the DBR is cleared. Therefore, as I previously suggested, it would probably be more reasonable to directly requeue the ABORTED commands as shown in the patch below. --------------------------------------------------------------------- --- diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index 24a32e2fd75e..06aa4ed1a9e6 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -5417,10 +5417,12 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp, } break; case OCS_ABORTED: - result |= DID_ABORT << 16; - break; case OCS_INVALID_COMMAND_STATUS: result |= DID_REQUEUE << 16; + dev_warn(hba->dev, + "OCS %s from controller for tag %d\n", + (ocs == OCS_ABORTED? "aborted" : "invalid"), + lrbp->task_tag); break; case OCS_INVALID_CMD_TABLE_ATTR: case OCS_INVALID_PRDT_ATTR: @@ -6466,26 +6468,12 @@ static bool ufshcd_abort_one(struct request *rq, void *priv) struct scsi_device *sdev = cmd->device; struct Scsi_Host *shost = sdev->host; struct ufs_hba *hba = shost_priv(shost); - struct ufshcd_lrb *lrbp = &hba->lrb[tag]; - struct ufs_hw_queue *hwq; - unsigned long flags; *ret = ufshcd_try_to_abort_task(hba, tag); dev_err(hba->dev, "Aborting tag %d / CDB %#02x %s\n", tag, hba->lrb[tag].cmd ? hba->lrb[tag].cmd->cmnd[0] : -1, *ret ? "failed" : "succeeded"); - /* Release cmd in MCQ mode if abort succeeds */ - if (hba->mcq_enabled && (*ret == 0)) { - hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp- >cmd)); - if (!hwq) - return 0; - spin_lock_irqsave(&hwq->cq_lock, flags); - if (ufshcd_cmd_inflight(lrbp->cmd)) - ufshcd_release_scsi_cmd(hba, lrbp); - spin_unlock_irqrestore(&hwq->cq_lock, flags); - } - return *ret == 0; } --------------------------------------------------------------------- --- This patch has several advantages: 1. It makes the patch 'ufs: core: fix the issue of ICU failure' seem valuable. 2. The patch is more concise. 3. There is no need to fetch OCS to determine OCS: ABORTED on every CQ completion, which increases ISR time. 4. The err_handler flow for SDB and MCQ would be consistent. 5. There is no need for the MediaTek SDB quirk. What do you think?"
On 9/25/24 8:45 PM, Peter Wang (王信友) wrote: > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index 24a32e2fd75e..06aa4ed1a9e6 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -5417,10 +5417,12 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, > struct ufshcd_lrb *lrbp, > } > break; > case OCS_ABORTED: > - result |= DID_ABORT << 16; > - break; > case OCS_INVALID_COMMAND_STATUS: > result |= DID_REQUEUE << 16; > + dev_warn(hba->dev, > + "OCS %s from controller for tag %d\n", > + (ocs == OCS_ABORTED? "aborted" : > "invalid"), > + lrbp->task_tag); > break; > case OCS_INVALID_CMD_TABLE_ATTR: > case OCS_INVALID_PRDT_ATTR: > @@ -6466,26 +6468,12 @@ static bool ufshcd_abort_one(struct request > *rq, void *priv) > struct scsi_device *sdev = cmd->device; > struct Scsi_Host *shost = sdev->host; > struct ufs_hba *hba = shost_priv(shost); > - struct ufshcd_lrb *lrbp = &hba->lrb[tag]; > - struct ufs_hw_queue *hwq; > - unsigned long flags; > > *ret = ufshcd_try_to_abort_task(hba, tag); > dev_err(hba->dev, "Aborting tag %d / CDB %#02x %s\n", tag, > hba->lrb[tag].cmd ? hba->lrb[tag].cmd->cmnd[0] : -1, > *ret ? "failed" : "succeeded"); > > - /* Release cmd in MCQ mode if abort succeeds */ > - if (hba->mcq_enabled && (*ret == 0)) { > - hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(lrbp- >> cmd)); > - if (!hwq) > - return 0; > - spin_lock_irqsave(&hwq->cq_lock, flags); > - if (ufshcd_cmd_inflight(lrbp->cmd)) > - ufshcd_release_scsi_cmd(hba, lrbp); > - spin_unlock_irqrestore(&hwq->cq_lock, flags); > - } > - > return *ret == 0; > } > > --------------------------------------------------------------------- > > > This patch has several advantages: > > 1. It makes the patch 'ufs: core: fix the issue of ICU failure' > seem valuable. > 2. The patch is more concise. > 3. There is no need to fetch OCS to determine OCS: ABORTED > on every CQ completion, which increases ISR time. > 4. The err_handler flow for SDB and MCQ would be consistent. > 5. There is no need for the MediaTek SDB quirk. > > > What do you think?" Hi Peter, Is the above patch sufficient? In MCQ mode, aborting a command happens as follows (simplified): (1) Send the ABORT TASK TMF. If this TMF succeeds, no SQE will be generated. If this TMF succeeds it means that the SCSI command has reached the UFS device and hence is no longer present in any submission queue (SQ). (2) If the command is still in a submission queue, nullify the SQE. In this case a CQE will be generated with status ABORTED. It seems to me that the above patch handles (2) but not (1)? Thanks, Bart.
From: Peter Wang <peter.wang@mediatek.com> V9: - Revise the OCS content printed. V8: - Remove the abort variable to simplify the abort process. - Correct error handler successfully aborts release flow. - Ingore MCQ OCS: ABORTED. V7: - Use a variable instead of a flag. - Add a check for MCQ mode when setting this variable to UFS_ERR_HANDLER. - Print OCS information for OCS_ABORTED and OCS_INVALID_COMMAND_STATUS. - Add a MediaTek quirk for handling OCS_ABORTED in SDB mode. - Skip notifying SCSI from ISR during SCSI abort (ufshcd_abort()). V6: - Add err handler check before set flag true. V5: - Change flag name. - Amend comment and patch description. V4: - Remove nullify SQ entry abort requeue. - Add more comment for flag usage and set description. - Fix build warning. V3: - Change comment and use variable(rtc) for error print - Change flag name and move flag set before ufshcd_clear_cmd - Add SDB mode clear UTRLCLR tag receive OCS_ABORTED requeue V2: - Fix mcq_enabled build error. Peter Wang (3): ufs: core: fix the issue of ICU failure ufs: core: fix error handler process for MCQ abort ufs: core: add a quirk for MediaTek SDB mode aborted drivers/ufs/core/ufs-mcq.c | 15 ++++++++------- drivers/ufs/core/ufshcd.c | 28 ++++++++++++++++++++++++++-- drivers/ufs/host/ufs-mediatek.c | 1 + include/ufs/ufshcd.h | 6 ++++++ 4 files changed, 41 insertions(+), 9 deletions(-)