From patchwork Mon Aug 3 09:04:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Can Guo X-Patchwork-Id: 258028 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F02FC433DF for ; Mon, 3 Aug 2020 09:05:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EE7B8206D7 for ; Mon, 3 Aug 2020 09:05:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726160AbgHCJEv (ORCPT ); Mon, 3 Aug 2020 05:04:51 -0400 Received: from labrats.qualcomm.com ([199.106.110.90]:8658 "EHLO labrats.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725806AbgHCJEv (ORCPT ); Mon, 3 Aug 2020 05:04:51 -0400 IronPort-SDR: dajFYW/VqHUFnplPAQ6+aKrMR0HZvwvm685X8F5DC/fWctPM/9JsX8kq3BWkU8uY90lAO0mPQx nX/XlFF4TI9zd5l8fnJ6LlyY4Z6e1Lbf253DcqQFplP3OHX3TeNC0f5OTOyIbNUEFboLr+YbV0 bE8ODrWu7SgUzjBLKyJ6qqAGMZ3j/GGTFvNUwSeGzecsW1Vj4Y1zO/4ECpD46yva7VG+NpXc2D Q1DjEEg4Zyo20+TFkx2lXuBekaH7NHFL0VHBETRpfK0CY004TdcvRwC3JGY+BDvpM77Sp4ohp4 9w4= X-IronPort-AV: E=Sophos;i="5.75,429,1589266800"; d="scan'208";a="47240952" Received: from unknown (HELO ironmsg05-sd.qualcomm.com) ([10.53.140.145]) by labrats.qualcomm.com with ESMTP; 03 Aug 2020 02:04:50 -0700 Received: from stor-presley.qualcomm.com ([192.168.140.85]) by ironmsg05-sd.qualcomm.com with ESMTP; 03 Aug 2020 02:04:49 -0700 Received: by stor-presley.qualcomm.com (Postfix, from userid 359480) id 71C3D214E4; Mon, 3 Aug 2020 
02:04:49 -0700 (PDT)
From: Can Guo
To: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org, rnayak@codeaurora.org, linux-scsi@vger.kernel.org, kernel-team@android.com, saravanak@google.com, salyzyn@google.com, cang@codeaurora.org
Cc: Alim Akhtar, Avri Altman, "James E.J. Bottomley", "Martin K. Petersen", Matthias Brugger, Stanley Chu, Bean Huo, Bart Van Assche, linux-kernel@vger.kernel.org (open list), linux-arm-kernel@lists.infradead.org (moderated list:ARM/Mediatek SoC support), linux-mediatek@lists.infradead.org (moderated list:ARM/Mediatek SoC support)
Subject: [PATCH v9 1/9] scsi: ufs: Add checks before setting clk-gating states
Date: Mon, 3 Aug 2020 02:04:36 -0700
Message-Id: <1596445485-19834-2-git-send-email-cang@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
References: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org

The clock gating feature can be turned on or off selectively, so its
state information is only meaningful when the feature is enabled. Make
sure that the clk-gating state is only checked or updated when
clk-gating is enabled.

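A minimal userspace sketch of the guard described above, assuming a
simplified model of the driver: struct hba_model and the model_* helpers
are hypothetical stand-ins for illustration, not kernel code. The state
field is only written or checked when gating is allowed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's types, not the kernel definitions. */
enum model_clk_state { MODEL_CLKS_OFF, MODEL_CLKS_ON };

struct hba_model {
	bool clkgating_allowed;       /* models ufshcd_is_clkgating_allowed() */
	enum model_clk_state state;   /* models hba->clk_gating.state */
};

/* Initialise the state only when the feature is enabled. */
static void model_init_clk_gating(struct hba_model *hba)
{
	if (!hba->clkgating_allowed)
		return;
	hba->state = MODEL_CLKS_ON;
}

/* True when the guarded WARN_ON() in the queuecommand path would fire. */
static bool model_should_warn(const struct hba_model *hba)
{
	return hba->clkgating_allowed && hba->state != MODEL_CLKS_ON;
}

/* Suspend path: record CLKS_OFF only when the feature is enabled. */
static void model_suspend(struct hba_model *hba)
{
	if (hba->clkgating_allowed)
		hba->state = MODEL_CLKS_OFF;
}
```

With gating disabled, model_should_warn() never reports a warning no
matter what value the state field happens to hold.
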
Signed-off-by: Can Guo
Reviewed-by: Avri Altman
Reviewed-by: Hongwu Su
Reviewed-by: Stanley Chu
Reviewed-by: Bean Huo
---
 drivers/scsi/ufs/ufshcd.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 3076222..5acb38c 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1839,6 +1839,8 @@ static void ufshcd_init_clk_gating(struct ufs_hba *hba)
 	if (!ufshcd_is_clkgating_allowed(hba))
 		return;
 
+	hba->clk_gating.state = CLKS_ON;
+
 	hba->clk_gating.delay_ms = 150;
 	INIT_DELAYED_WORK(&hba->clk_gating.gate_work, ufshcd_gate_work);
 	INIT_WORK(&hba->clk_gating.ungate_work, ufshcd_ungate_work);
@@ -2541,7 +2543,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 		err = SCSI_MLQUEUE_HOST_BUSY;
 		goto out;
 	}
-	WARN_ON(hba->clk_gating.state != CLKS_ON);
+	WARN_ON(ufshcd_is_clkgating_allowed(hba) &&
+		(hba->clk_gating.state != CLKS_ON));
 
 	lrbp = &hba->lrb[tag];
 
@@ -8326,8 +8329,11 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 		/* If link is active, device ref_clk can't be switched off */
 		__ufshcd_setup_clocks(hba, false, true);
 
-	hba->clk_gating.state = CLKS_OFF;
-	trace_ufshcd_clk_gating(dev_name(hba->dev), hba->clk_gating.state);
+	if (ufshcd_is_clkgating_allowed(hba)) {
+		hba->clk_gating.state = CLKS_OFF;
+		trace_ufshcd_clk_gating(dev_name(hba->dev),
+					hba->clk_gating.state);
+	}
 
 	/* Put the host controller in low power mode if possible */
 	ufshcd_hba_vreg_set_lpm(hba);
@@ -8467,6 +8473,11 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
 	if (hba->clk_scaling.is_allowed)
 		ufshcd_suspend_clkscaling(hba);
 	ufshcd_setup_clocks(hba, false);
+	if (ufshcd_is_clkgating_allowed(hba)) {
+		hba->clk_gating.state = CLKS_OFF;
+		trace_ufshcd_clk_gating(dev_name(hba->dev),
+					hba->clk_gating.state);
+	}
 out:
 	hba->pm_op_in_progress = 0;
 	if (ret)

From patchwork Mon Aug 3 09:04:38 2020 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Can Guo X-Patchwork-Id: 258032 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1B48C433DF for ; Mon, 3 Aug 2020 09:04:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B8EFE20738 for ; Mon, 3 Aug 2020 09:04:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726489AbgHCJEx (ORCPT ); Mon, 3 Aug 2020 05:04:53 -0400 Received: from labrats.qualcomm.com ([199.106.110.90]:8658 "EHLO labrats.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725971AbgHCJEw (ORCPT ); Mon, 3 Aug 2020 05:04:52 -0400 IronPort-SDR: /Ez29SrVwXZjHEoGl3kpa6/DOIdI0wmUOnW6dHdFZD7L+CfvVKUSLxU0ASJMm90ztGWYnDQdJn 3rn6tuiAESWQkGtv5gTDTjQoHZeJXk33tX5elokpDq3ElgBmE8JtgqGe4Xu7ELAuwbwAlpaSsh gTWnEAZXOTBHQY5GfORm8BA65sBgBVHKyoon3dTolVwUsdzXsOb9KfD5fIS32KvgTntucQyu4o 6/n+4n4DyEWb9DKU4SCN/5wz8uBKOYsPhPJTRIikHR5R6aB0svawy6E1y3UG8Lzn/nhY/eK9q3 ed8= X-IronPort-AV: E=Sophos;i="5.75,429,1589266800"; d="scan'208";a="47240954" Received: from unknown (HELO ironmsg-SD-alpha.qualcomm.com) ([10.53.140.30]) by labrats.qualcomm.com with ESMTP; 03 Aug 2020 02:04:51 -0700 Received: from wsp769891wss.qualcomm.com (HELO stor-presley.qualcomm.com) ([192.168.140.85]) by ironmsg-SD-alpha.qualcomm.com with ESMTP; 03 Aug 2020 02:04:50 -0700 Received: by stor-presley.qualcomm.com (Postfix, from userid 359480) id A4965214E4; Mon, 3 Aug 2020 02:04:50 -0700 (PDT) From: Can 
Guo
To: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org, rnayak@codeaurora.org, linux-scsi@vger.kernel.org, kernel-team@android.com, saravanak@google.com, salyzyn@google.com, cang@codeaurora.org
Cc: Andy Gross, Bjorn Andersson, Alim Akhtar, Avri Altman, "James E.J. Bottomley", "Martin K. Petersen", linux-arm-msm@vger.kernel.org (open list:ARM/QUALCOMM SUPPORT), linux-kernel@vger.kernel.org (open list)
Subject: [PATCH v9 3/9] scsi: ufs-qcom: Remove testbus dump in ufs_qcom_dump_dbg_regs
Date: Mon, 3 Aug 2020 02:04:38 -0700
Message-Id: <1596445485-19834-4-git-send-email-cang@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
References: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org

Dumping the testbus registers is heavy enough to cause stability issues
at times, so remove the testbus dump for now.

Signed-off-by: Can Guo
Reviewed-by: Hongwu Su
Reviewed-by: Avri Altman
Reviewed-by: Bean Huo
---
 drivers/scsi/ufs/ufs-qcom.c | 32 --------------------------------
 1 file changed, 32 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index 823eccf..6b75338 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -1630,44 +1630,12 @@ int ufs_qcom_testbus_config(struct ufs_qcom_host *host)
 	return 0;
 }
 
-static void ufs_qcom_testbus_read(struct ufs_hba *hba)
-{
-	ufshcd_dump_regs(hba, UFS_TEST_BUS, 4, "UFS_TEST_BUS ");
-}
-
-static void ufs_qcom_print_unipro_testbus(struct ufs_hba *hba)
-{
-	struct ufs_qcom_host *host = ufshcd_get_variant(hba);
-	u32 *testbus = NULL;
-	int i, nminor = 256, testbus_len = nminor * sizeof(u32);
-
-	testbus = kmalloc(testbus_len, GFP_KERNEL);
-	if (!testbus)
-		return;
-
-	host->testbus.select_major = TSTBUS_UNIPRO;
-	for (i = 0; i < nminor; i++) {
-		host->testbus.select_minor = i;
-		ufs_qcom_testbus_config(host);
-		testbus[i] = ufshcd_readl(hba, UFS_TEST_BUS);
-	}
-	print_hex_dump(KERN_ERR, "UNIPRO_TEST_BUS ", DUMP_PREFIX_OFFSET,
-			16, 4, testbus, testbus_len, false);
-	kfree(testbus);
-}
-
 static void ufs_qcom_dump_dbg_regs(struct ufs_hba *hba)
 {
 	ufshcd_dump_regs(hba, REG_UFS_SYS1CLK_1US, 16 * 4,
 			 "HCI Vendor Specific Registers ");
 
-	/* sleep a bit intermittently as we are dumping too much data */
 	ufs_qcom_print_hw_debug_reg_all(hba, NULL, ufs_qcom_dump_regs_wrapper);
-	udelay(1000);
-	ufs_qcom_testbus_read(hba);
-	udelay(1000);
-	ufs_qcom_print_unipro_testbus(hba);
-	udelay(1000);
 }
 
 /**

From patchwork Mon Aug 3 09:04:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Can Guo X-Patchwork-Id: 258031 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, 
INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C739CC433DF for ; Mon, 3 Aug 2020 09:05:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A628820738 for ; Mon, 3 Aug 2020 09:05:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726767AbgHCJFI (ORCPT ); Mon, 3 Aug 2020 05:05:08 -0400 Received: from labrats.qualcomm.com ([199.106.110.90]:3956 "EHLO labrats.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726337AbgHCJFI (ORCPT ); Mon, 3 Aug 2020 05:05:08 -0400 IronPort-SDR: t1JhFbN83uYpwYSfhW57WkE9t415KSwyenRzHfqXwYdJlTuaCOX7XnnIlJfZbYyCf+zSSvgXAw gs2crTwwHMirsFywV7TykhPgD6Ez6rcUvS9zulDtXjxYXWXyZfOQWjTt6bPoyx1cbu3ZpB4Hd+ B9f7s5Grruh7Gh1zji78Ksm7NrSaR3It7s/4d1Er1wuVUC/amIhVdbpBpFT6YgRwNDkGyeJ6iG Lmb2b1+2v/twScdGWjXzvFhkwsh7qFWHt8aeatqWU4kzUs/MVHN/3jGBBcXuwGtJ+r5tR3V4zP 22w= X-IronPort-AV: E=Sophos;i="5.75,429,1589266800"; d="scan'208";a="47240957" Received: from unknown (HELO ironmsg01-sd.qualcomm.com) ([10.53.140.141]) by labrats.qualcomm.com with ESMTP; 03 Aug 2020 02:05:08 -0700 Received: from wsp769891wss.qualcomm.com (HELO stor-presley.qualcomm.com) ([192.168.140.85]) by ironmsg01-sd.qualcomm.com with ESMTP; 03 Aug 2020 02:05:07 -0700 Received: by stor-presley.qualcomm.com (Postfix, from userid 359480) id 3EE89214E4; Mon, 3 Aug 2020 02:05:07 -0700 (PDT) From: Can Guo To: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org, rnayak@codeaurora.org, linux-scsi@vger.kernel.org, kernel-team@android.com, saravanak@google.com, salyzyn@google.com, cang@codeaurora.org Cc: Alim Akhtar , Avri Altman , "James E.J. Bottomley" , "Martin K. 
Petersen", Stanley Chu, Bean Huo, Bart Van Assche, linux-kernel@vger.kernel.org (open list)
Subject: [PATCH v9 6/9] scsi: ufs: Recover hba runtime PM error in error handler
Date: Mon, 3 Aug 2020 02:04:41 -0700
Message-Id: <1596445485-19834-7-git-send-email-cang@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
References: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org

The current error handler cannot work well or recover the hba runtime
PM error if ufshcd_suspend/resume has failed due to UFS errors, e.g. a
hibern8 enter/exit error or an SSU cmd error. When this happens, the
error handler may fail to do a full reset and restore, because it always
assumes that powers, IRQs and clocks are ready after
pm_runtime_get_sync() returns, but they are not if ufshcd_resume
fails [1]. Besides, if ufshcd_suspend/resume fails due to a UFS error,
the runtime PM framework saves the error value to
dev.power.runtime_error. After that, hba dev runtime suspend/resume is
not invoked anymore unless runtime_error is cleared [2].

If ufshcd_suspend/resume fails due to UFS errors, for scenario [1], the
error handler cannot assume anything about pm_runtime_get_sync(),
meaning the error handler should explicitly turn ON powers, IRQs and
clocks again. To get the hba runtime PM working again for scenario [2],
the error handler can clear the runtime_error by calling
pm_runtime_set_active() if the full reset and restore succeeds. More
importantly, if pm_runtime_set_active() returns no error, which means
runtime_error has been cleared, we also need to resume the scsi devices
under hba in case any of them failed to resume due to the hba runtime
resume failure. This is to unblock blk_queue_enter in case there are
bios waiting inside it.

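A minimal userspace sketch of scenario [2], assuming a heavily
simplified model of runtime PM: struct dev_model and the model_* helpers
are hypothetical and only mimic the behaviour described above (a latched
runtime error blocks further resume attempts until it is cleared). They
are not the kernel API.

```c
#include <stddef.h>

enum model_rpm_status { MODEL_RPM_ACTIVE, MODEL_RPM_SUSPENDED };

struct dev_model {
	enum model_rpm_status status;
	int runtime_error;            /* models dev.power.runtime_error */
};

/* A resume attempt is refused while a runtime error is latched. */
static int model_runtime_resume(struct dev_model *dev)
{
	if (dev->runtime_error)
		return -1;
	dev->status = MODEL_RPM_ACTIVE;
	return 0;
}

/* Models the described effect of pm_runtime_set_active(). */
static int model_set_active(struct dev_model *dev)
{
	dev->status = MODEL_RPM_ACTIVE;
	dev->runtime_error = 0;
	return 0;
}

/*
 * Models the recovery step: clear the hba error first, then resume every
 * child device that was left suspended. Returns the number of children
 * that were resumed.
 */
static size_t model_recover_pm_error(struct dev_model *hba,
				     struct dev_model *sdevs, size_t n)
{
	size_t i, resumed = 0;

	if (model_set_active(hba))
		return 0;
	for (i = 0; i < n; i++) {
		if (sdevs[i].status == MODEL_RPM_SUSPENDED) {
			sdevs[i].status = MODEL_RPM_ACTIVE;
			resumed++;
		}
	}
	return resumed;
}
```

The ordering is the point of the sketch: children are only touched once
the parent's error has been cleared, since resuming them earlier would
fail for the same reason the parent's resume did.
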
Signed-off-by: Can Guo
Reviewed-by: Bean Huo
---
 drivers/scsi/ufs/ufshcd.c | 108 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 99 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 2604016..6a10003 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include "ufshcd.h"
 #include "ufs_quirks.h"
 #include "unipro.h"
@@ -229,6 +230,10 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba);
 static int ufshcd_change_power_mode(struct ufs_hba *hba,
 			     struct ufs_pa_layer_attr *pwr_mode);
 static void ufshcd_schedule_eh_work(struct ufs_hba *hba);
+static int ufshcd_setup_hba_vreg(struct ufs_hba *hba, bool on);
+static int ufshcd_setup_vreg(struct ufs_hba *hba, bool on);
+static inline int ufshcd_config_vreg_hpm(struct ufs_hba *hba,
+					 struct ufs_vreg *vreg);
 static int ufshcd_wb_buf_flush_enable(struct ufs_hba *hba);
 static int ufshcd_wb_buf_flush_disable(struct ufs_hba *hba);
 static int ufshcd_wb_ctrl(struct ufs_hba *hba, bool enable);
@@ -5553,6 +5558,84 @@ static inline void ufshcd_schedule_eh_work(struct ufs_hba *hba)
 	}
 }
 
+static void ufshcd_err_handling_prepare(struct ufs_hba *hba)
+{
+	pm_runtime_get_sync(hba->dev);
+	if (pm_runtime_suspended(hba->dev)) {
+		/*
+		 * Don't assume anything of pm_runtime_get_sync(), if
+		 * resume fails, irq and clocks can be OFF, and powers
+		 * can be OFF or in LPM.
+		 */
+		ufshcd_setup_hba_vreg(hba, true);
+		ufshcd_enable_irq(hba);
+		ufshcd_setup_vreg(hba, true);
+		ufshcd_config_vreg_hpm(hba, hba->vreg_info.vccq);
+		ufshcd_config_vreg_hpm(hba, hba->vreg_info.vccq2);
+		ufshcd_hold(hba, false);
+		if (!ufshcd_is_clkgating_allowed(hba))
+			ufshcd_setup_clocks(hba, true);
+		ufshcd_release(hba);
+		ufshcd_vops_resume(hba, UFS_RUNTIME_PM);
+	} else {
+		ufshcd_hold(hba, false);
+		if (hba->clk_scaling.is_allowed) {
+			cancel_work_sync(&hba->clk_scaling.suspend_work);
+			cancel_work_sync(&hba->clk_scaling.resume_work);
+			ufshcd_suspend_clkscaling(hba);
+		}
+	}
+}
+
+static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
+{
+	ufshcd_release(hba);
+	if (hba->clk_scaling.is_allowed)
+		ufshcd_resume_clkscaling(hba);
+	pm_runtime_put(hba->dev);
+}
+
+static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba)
+{
+	return (hba->ufshcd_state == UFSHCD_STATE_ERROR ||
+		(!(hba->saved_err || hba->saved_uic_err || hba->force_reset ||
+			ufshcd_is_link_broken(hba))));
+}
+
+#ifdef CONFIG_PM
+static void ufshcd_recover_pm_error(struct ufs_hba *hba)
+{
+	struct Scsi_Host *shost = hba->host;
+	struct scsi_device *sdev;
+	struct request_queue *q;
+	int ret;
+
+	/*
+	 * Set RPM status of hba device to RPM_ACTIVE,
+	 * this also clears its runtime error.
+	 */
+	ret = pm_runtime_set_active(hba->dev);
+	/*
+	 * If hba device had runtime error, we also need to resume those
+	 * scsi devices under hba in case any of them has failed to be
+	 * resumed due to hba runtime resume failure. This is to unblock
+	 * blk_queue_enter in case there are bios waiting inside it.
+	 */
+	if (!ret) {
+		list_for_each_entry(sdev, &shost->__devices, siblings) {
+			q = sdev->request_queue;
+			if (q->dev && (q->rpm_status == RPM_SUSPENDED ||
+				       q->rpm_status == RPM_SUSPENDING))
+				pm_request_resume(q->dev);
+		}
+	}
+}
+#else
+static inline void ufshcd_recover_pm_error(struct ufs_hba *hba)
+{
+}
+#endif
+
 /**
  * ufshcd_err_handler - handle UFS errors that require s/w attention
  * @work: pointer to work structure
@@ -5570,9 +5653,7 @@ static void ufshcd_err_handler(struct work_struct *work)
 	hba = container_of(work, struct ufs_hba, eh_work);
 
 	spin_lock_irqsave(hba->host->host_lock, flags);
-	if (hba->ufshcd_state == UFSHCD_STATE_ERROR ||
-	    (!(hba->saved_err || hba->saved_uic_err || hba->force_reset ||
-		ufshcd_is_link_broken(hba)))) {
+	if (ufshcd_err_handling_should_stop(hba)) {
 		if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
 			hba->ufshcd_state = UFSHCD_STATE_OPERATIONAL;
 		spin_unlock_irqrestore(hba->host->host_lock, flags);
@@ -5581,10 +5662,17 @@ static void ufshcd_err_handler(struct work_struct *work)
 	}
 	ufshcd_set_eh_in_progress(hba);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
-	pm_runtime_get_sync(hba->dev);
-	ufshcd_hold(hba, false);
-
+	ufshcd_err_handling_prepare(hba);
 	spin_lock_irqsave(hba->host->host_lock, flags);
+	/*
+	 * A full reset and restore might have happened after preparation
+	 * is finished, double check whether we should stop.
+	 */
+	if (ufshcd_err_handling_should_stop(hba)) {
+		if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
+			hba->ufshcd_state = UFSHCD_STATE_OPERATIONAL;
+		goto out;
+	}
 	hba->ufshcd_state = UFSHCD_STATE_RESET;
 
 	/* Complete requests that have door-bell cleared by h/w */
@@ -5662,10 +5750,12 @@ static void ufshcd_err_handler(struct work_struct *work)
 		hba->force_reset = false;
 		spin_unlock_irqrestore(hba->host->host_lock, flags);
 		err = ufshcd_reset_and_restore(hba);
-		spin_lock_irqsave(hba->host->host_lock, flags);
 		if (err)
 			dev_err(hba->dev, "%s: reset and restore failed with err %d\n",
 					__func__, err);
+		else
+			ufshcd_recover_pm_error(hba);
+		spin_lock_irqsave(hba->host->host_lock, flags);
 	}
 
 skip_err_handling:
@@ -5677,11 +5767,11 @@ static void ufshcd_err_handler(struct work_struct *work)
 			    __func__, hba->saved_err, hba->saved_uic_err);
 	}
 
+out:
 	ufshcd_clear_eh_in_progress(hba);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 	ufshcd_scsi_unblock_requests(hba);
-	ufshcd_release(hba);
-	pm_runtime_put_sync(hba->dev);
+	ufshcd_err_handling_unprepare(hba);
 }
 
 /**

From patchwork Mon Aug 3 09:04:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Can Guo X-Patchwork-Id: 258030 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B33D5C433E3 for ; Mon, 3 Aug 2020 09:05:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 96C53206E2 for ; Mon, 3 Aug 2020 09:05:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) 
by vger.kernel.org via listexpand id S1726808AbgHCJFR (ORCPT ); Mon, 3 Aug 2020 05:05:17 -0400 Received: from labrats.qualcomm.com ([199.106.110.90]:41081 "EHLO labrats.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726785AbgHCJFQ (ORCPT ); Mon, 3 Aug 2020 05:05:16 -0400 IronPort-SDR: X+X4/D/CLdgaUp8sV+uPxmTskQbmi0w3ePUuYgZjkGvQevnzfOhLr4Y6xy/CiJ2r9+2Q6IlhVV 2xVOZm7hzLfpuWoYdlkSpTJw2vwiCKSb5Tdl8WpVxmUO2QCe5xnnfFMQt2Df2waFBBY0ifO6sH Oh31H4e6HQ70LFThWOs0rklt+L6+hKFiumZjsubCpHMk/wKhqbDk1T2//YOxaBym102BdPlblF lk4vFzmCGPR1PWigE/hT3/qMyWBOYa8wT+wQQu6wM75YfxHQHtHiA1xEnpDiV+kS696MmQjX8j dgE= X-IronPort-AV: E=Sophos;i="5.75,429,1589266800"; d="scan'208";a="29063954" Received: from unknown (HELO ironmsg04-sd.qualcomm.com) ([10.53.140.144]) by labrats.qualcomm.com with ESMTP; 03 Aug 2020 02:05:15 -0700 Received: from wsp769891wss.qualcomm.com (HELO stor-presley.qualcomm.com) ([192.168.140.85]) by ironmsg04-sd.qualcomm.com with ESMTP; 03 Aug 2020 02:05:15 -0700 Received: by stor-presley.qualcomm.com (Postfix, from userid 359480) id 99C5F214E4; Mon, 3 Aug 2020 02:05:15 -0700 (PDT) From: Can Guo To: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org, rnayak@codeaurora.org, linux-scsi@vger.kernel.org, kernel-team@android.com, saravanak@google.com, salyzyn@google.com, cang@codeaurora.org Cc: Alim Akhtar , Avri Altman , "James E.J. Bottomley" , "Martin K. 
Petersen", Stanley Chu, Bean Huo, Bart Van Assche, linux-kernel@vger.kernel.org (open list)
Subject: [PATCH v9 8/9] scsi: ufs: Fix a racing problem btw error handler and runtime PM ops
Date: Mon, 3 Aug 2020 02:04:43 -0700
Message-Id: <1596445485-19834-9-git-send-email-cang@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
References: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org

The current IRQ handler blocks scsi requests before scheduling eh_work.
When the error handler then calls pm_runtime_get_sync(), if
ufshcd_suspend/resume sends a scsi cmd, most likely the SSU cmd,
pm_runtime_get_sync() never returns, because scsi requests are blocked
and ufshcd_suspend/resume is therefore stuck waiting on that scsi cmd.
Some changes and code re-arrangement can be made to resolve it.

o In the queuecommand path, the hba->ufshcd_state check and
  ufshcd_send_command should stay under the same spin lock. This makes
  sure that no more commands leak into the doorbell after
  hba->ufshcd_state is changed.
o Don't block scsi requests before scheduling eh_work; let the error
  handler block scsi requests when it is ready to start error recovery.
o Don't let the scsi layer keep requeuing the scsi cmds sent from hba
  runtime PM ops; let them pass or fail them. Let them pass if eh_work
  is scheduled due to non-fatal errors. Fail them if eh_work is
  scheduled due to fatal errors, otherwise the cmds may eventually time
  out since UFS is in a bad state, which gets the error handler blocked
  for too long. If we fail the scsi cmds sent from hba runtime PM ops,
  the hba runtime PM ops fail too, but that does not hurt since the
  error handler can recover the hba runtime PM error.

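The per-state policy in the list above can be sketched as a small
decision function. This is a userspace illustration under stated
assumptions: the enum values and model_queue_decision() are hypothetical
names, and the sketch only captures which commands are sent, requeued or
failed, not the locking or the completion path.

```c
#include <stdbool.h>

enum model_state {
	MODEL_STATE_RESET,
	MODEL_STATE_ERROR,
	MODEL_STATE_OPERATIONAL,
	MODEL_STATE_EH_SCHEDULED_FATAL,
	MODEL_STATE_EH_SCHEDULED_NON_FATAL,
};

enum model_action {
	MODEL_SEND,      /* ring the doorbell */
	MODEL_REQUEUE,   /* report host busy, SCSI layer retries later */
	MODEL_FAIL,      /* complete the command with an error */
};

/*
 * Decide what happens to a command for a given host state.
 * pm_op_in_progress marks a command issued from the hba's own
 * suspend/resume path (e.g. the SSU cmd).
 */
static enum model_action model_queue_decision(enum model_state state,
					      bool pm_op_in_progress)
{
	switch (state) {
	case MODEL_STATE_OPERATIONAL:
	case MODEL_STATE_EH_SCHEDULED_NON_FATAL:
		return MODEL_SEND;
	case MODEL_STATE_EH_SCHEDULED_FATAL:
		/* Requeuing a PM command would stall the error handler. */
		if (pm_op_in_progress)
			return MODEL_FAIL;
		return MODEL_REQUEUE;
	case MODEL_STATE_RESET:
		return MODEL_REQUEUE;
	case MODEL_STATE_ERROR:
	default:
		return MODEL_FAIL;
	}
}
```

Only the fatal case depends on who issued the command: a regular command
can wait for recovery, while a PM command has to fail quickly so that
pm_runtime_get_sync() in the error handler can return.
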
Signed-off-by: Can Guo
Reviewed-by: Bean Huo
---
 drivers/scsi/ufs/ufshcd.c | 84 ++++++++++++++++++++++++++++-------------------
 1 file changed, 50 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index a79fbbd..d7d2758 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -126,7 +126,8 @@ enum {
 	UFSHCD_STATE_RESET,
 	UFSHCD_STATE_ERROR,
 	UFSHCD_STATE_OPERATIONAL,
-	UFSHCD_STATE_EH_SCHEDULED,
+	UFSHCD_STATE_EH_SCHEDULED_FATAL,
+	UFSHCD_STATE_EH_SCHEDULED_NON_FATAL,
 };
 
 /* UFSHCD error handling flags */
@@ -2515,34 +2516,6 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 	if (!down_read_trylock(&hba->clk_scaling_lock))
 		return SCSI_MLQUEUE_HOST_BUSY;
 
-	spin_lock_irqsave(hba->host->host_lock, flags);
-	switch (hba->ufshcd_state) {
-	case UFSHCD_STATE_OPERATIONAL:
-		break;
-	case UFSHCD_STATE_EH_SCHEDULED:
-	case UFSHCD_STATE_RESET:
-		err = SCSI_MLQUEUE_HOST_BUSY;
-		goto out_unlock;
-	case UFSHCD_STATE_ERROR:
-		set_host_byte(cmd, DID_ERROR);
-		cmd->scsi_done(cmd);
-		goto out_unlock;
-	default:
-		dev_WARN_ONCE(hba->dev, 1, "%s: invalid state %d\n",
-				__func__, hba->ufshcd_state);
-		set_host_byte(cmd, DID_BAD_TARGET);
-		cmd->scsi_done(cmd);
-		goto out_unlock;
-	}
-
-	/* if error handling is in progress, don't issue commands */
-	if (ufshcd_eh_in_progress(hba)) {
-		set_host_byte(cmd, DID_ERROR);
-		cmd->scsi_done(cmd);
-		goto out_unlock;
-	}
-	spin_unlock_irqrestore(hba->host->host_lock, flags);
-
 	hba->req_abort_count = 0;
 
 	err = ufshcd_hold(hba, true);
@@ -2578,11 +2551,50 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 	/* Make sure descriptors are ready before ringing the doorbell */
 	wmb();
 
-	/* issue command to the controller */
 	spin_lock_irqsave(hba->host->host_lock, flags);
+	switch (hba->ufshcd_state) {
+	case UFSHCD_STATE_OPERATIONAL:
+	case UFSHCD_STATE_EH_SCHEDULED_NON_FATAL:
+		break;
+	case UFSHCD_STATE_EH_SCHEDULED_FATAL:
+		/*
+		 * pm_runtime_get_sync() is used at error handling preparation
+		 * stage. If a scsi cmd, e.g. the SSU cmd, is sent from hba's
+		 * PM ops, it can never be finished if we let SCSI layer keep
+		 * retrying it, which gets err handler stuck forever. Neither
+		 * can we let the scsi cmd pass through, because UFS is in bad
+		 * state, the scsi cmd may eventually time out, which will get
+		 * err handler blocked for too long. So, just fail the scsi cmd
+		 * sent from PM ops, err handler can recover PM error anyways.
+		 */
+		if (hba->pm_op_in_progress) {
+			hba->force_reset = true;
+			set_host_byte(cmd, DID_BAD_TARGET);
+			goto out_compl_cmd;
+		}
+	case UFSHCD_STATE_RESET:
+		err = SCSI_MLQUEUE_HOST_BUSY;
+		goto out_compl_cmd;
+	case UFSHCD_STATE_ERROR:
+		set_host_byte(cmd, DID_ERROR);
+		goto out_compl_cmd;
+	default:
+		dev_WARN_ONCE(hba->dev, 1, "%s: invalid state %d\n",
+				__func__, hba->ufshcd_state);
+		set_host_byte(cmd, DID_BAD_TARGET);
+		goto out_compl_cmd;
+	}
 	ufshcd_send_command(hba, tag);
-out_unlock:
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
+	goto out;
+
+out_compl_cmd:
+	scsi_dma_unmap(lrbp->cmd);
+	lrbp->cmd = NULL;
+	spin_unlock_irqrestore(hba->host->host_lock, flags);
+	ufshcd_release(hba);
+	if (!err)
+		cmd->scsi_done(cmd);
 out:
 	up_read(&hba->clk_scaling_lock);
 	return err;
@@ -5552,9 +5564,12 @@ static inline void ufshcd_schedule_eh_work(struct ufs_hba *hba)
 {
 	/* handle fatal errors only when link is not in error state */
 	if (hba->ufshcd_state != UFSHCD_STATE_ERROR) {
-		hba->ufshcd_state = UFSHCD_STATE_EH_SCHEDULED;
-		if (queue_work(hba->eh_wq, &hba->eh_work))
-			ufshcd_scsi_block_requests(hba);
+		if (hba->force_reset || ufshcd_is_link_broken(hba) ||
+		    ufshcd_is_saved_err_fatal(hba))
+			hba->ufshcd_state = UFSHCD_STATE_EH_SCHEDULED_FATAL;
+		else
+			hba->ufshcd_state = UFSHCD_STATE_EH_SCHEDULED_NON_FATAL;
+		queue_work(hba->eh_wq, &hba->eh_work);
 	}
 }
 
@@ -5664,6 +5679,7 @@ static void ufshcd_err_handler(struct work_struct *work)
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 	ufshcd_err_handling_prepare(hba);
 	spin_lock_irqsave(hba->host->host_lock, flags);
+	ufshcd_scsi_block_requests(hba);
 	/*
 	 * A full reset and restore might have happened after preparation
 	 * is finished, double check whether we should stop.

From patchwork Mon Aug 3 09:04:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Can Guo X-Patchwork-Id: 258029 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4989CC433E4 for ; Mon, 3 Aug 2020 09:05:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 30762206D7 for ; Mon, 3 Aug 2020 09:05:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726820AbgHCJFY (ORCPT ); Mon, 3 Aug 2020 05:05:24 -0400 Received: from labrats.qualcomm.com ([199.106.110.90]:21494 "EHLO labrats.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726062AbgHCJFX (ORCPT ); Mon, 3 Aug 2020 05:05:23 -0400 IronPort-SDR: tiMR4BmNoE4dGZXEe91/ql5tjyhGGdcKzVBe6GFDh5EPp4+AcxMCmZLDPHJnWA+sOQv/ltA46z mU9ijqkmdMJlQJfcxSytc4TKkbX1cpn6x0No59DuvABy0CJ+EIcDljLzOIrkcvMfhgjml9fMSl ajgrm5HGIE0QMkK9vXIrj6WOxkgCAfyiY9pBjBzq7RDZj34UlBm2tXzcfjinD4Y5+jB4Vzkf5p XO9a9aETGR7DJVr6EFFR7BJ2Lnit4TMbXXnrQhbFIgsLKUTw9FOTtdAq3ZtR4U9mniBa1uo57M W9A= X-IronPort-AV: E=Sophos;i="5.75,429,1589266800"; d="scan'208";a="47240958" Received: from unknown (HELO ironmsg02-sd.qualcomm.com) ([10.53.140.142]) by 
labrats.qualcomm.com with ESMTP; 03 Aug 2020 02:05:21 -0700
Received: from stor-presley.qualcomm.com ([192.168.140.85]) by ironmsg02-sd.qualcomm.com with ESMTP; 03 Aug 2020 02:05:20 -0700
Received: by stor-presley.qualcomm.com (Postfix, from userid 359480) id 0AB2A214E4; Mon, 3 Aug 2020 02:05:21 -0700 (PDT)
From: Can Guo
To: asutoshd@codeaurora.org, nguyenb@codeaurora.org, hongwus@codeaurora.org, rnayak@codeaurora.org, linux-scsi@vger.kernel.org, kernel-team@android.com, saravanak@google.com, salyzyn@google.com, cang@codeaurora.org
Cc: Stanley Chu, Alim Akhtar, Avri Altman, "James E.J. Bottomley", "Martin K. Petersen", Matthias Brugger, Bean Huo, Bart Van Assche, linux-kernel@vger.kernel.org (open list), linux-arm-kernel@lists.infradead.org (moderated list:ARM/Mediatek SoC support), linux-mediatek@lists.infradead.org (moderated list:ARM/Mediatek SoC support)
Subject: [PATCH v9 9/9] scsi: ufs: Properly release resources if a task is aborted successfully
Date: Mon, 3 Aug 2020 02:04:44 -0700
Message-Id: <1596445485-19834-10-git-send-email-cang@codeaurora.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
References: <1596445485-19834-1-git-send-email-cang@codeaurora.org>
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org

In the current UFS task abort hook, namely ufshcd_abort(), if a task is
aborted successfully, the clock scaling busy time statistics are not
updated and, most importantly, clk_gating.active_reqs is not decreased.
This makes clk_gating.active_reqs stay above zero forever, so clock
gating never happens. To fix it, instead of releasing resources
"manually", use the existing function __ufshcd_transfer_req_compl().
This also eliminates the race on scsi_dma_unmap() with the real
completion in the IRQ handler path.

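A minimal userspace sketch of the reference leak described above,
assuming a simplified model in which every issued request takes one
clock-gating reference: struct abort_model and the model_* helpers are
hypothetical and only stand in for the driver's bookkeeping.

```c
#include <stdbool.h>

#define MODEL_NUTRS 32                /* hypothetical number of tags */

struct abort_model {
	unsigned long outstanding_reqs;   /* bitmap of in-flight tags */
	int active_reqs;                  /* models clk_gating.active_reqs */
};

/* Issuing a request takes a clock reference and marks the tag in flight. */
static void model_issue(struct abort_model *hba, int tag)
{
	hba->active_reqs++;
	hba->outstanding_reqs |= 1UL << tag;
}

/* Old abort path: clears the tag by hand but keeps the clock reference. */
static void model_abort_manual(struct abort_model *hba, int tag)
{
	hba->outstanding_reqs &= ~(1UL << tag);
}

/* Shared completion path: clears each tag in mask and drops its reference. */
static void model_transfer_req_compl(struct abort_model *hba,
				     unsigned long mask)
{
	int tag;

	for (tag = 0; tag < MODEL_NUTRS; tag++) {
		if (!(mask & (1UL << tag)))
			continue;
		hba->outstanding_reqs &= ~(1UL << tag);
		hba->active_reqs--;
	}
}

/* Clocks may only be gated once no request holds a reference. */
static bool model_can_gate(const struct abort_model *hba)
{
	return hba->active_reqs == 0;
}
```

In this model, aborting through model_abort_manual() leaves
model_can_gate() false forever, while routing the abort through the
shared completion helper restores it.
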
Signed-off-by: Can Guo
CC: Stanley Chu
---
 drivers/scsi/ufs/ufshcd.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index d7d2758..9a48389 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6635,11 +6635,8 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
 		goto out;
 	}
 
-	scsi_dma_unmap(cmd);
-
 	spin_lock_irqsave(host->host_lock, flags);
-	ufshcd_outstanding_req_clear(hba, tag);
-	hba->lrb[tag].cmd = NULL;
+	__ufshcd_transfer_req_compl(hba, (1UL << tag));
 	spin_unlock_irqrestore(host->host_lock, flags);
 
 out: