Message ID | cover.1618426513.git.asutoshd@codeaurora.org |
---|---|
Headers | show |
Series | Enable power management for ufs wlun | expand |
On 14/04/21 9:58 pm, Asutosh Das wrote: > During runtime-suspend of ufs host, the scsi devices are > already suspended and so are the queues associated with them. > But the ufs host sends SSU (START_STOP_UNIT) to wlun > during its runtime-suspend. > During the process blk_queue_enter checks if the queue is not in > suspended state. If so, it waits for the queue to resume, and never > comes out of it. > The commit > (d55d15a33: scsi: block: Do not accept any requests while suspended) > adds the check if the queue is in suspended state in blk_queue_enter(). > > Call trace: > __switch_to+0x174/0x2c4 > __schedule+0x478/0x764 > schedule+0x9c/0xe0 > blk_queue_enter+0x158/0x228 > blk_mq_alloc_request+0x40/0xa4 > blk_get_request+0x2c/0x70 > __scsi_execute+0x60/0x1c4 > ufshcd_set_dev_pwr_mode+0x124/0x1e4 > ufshcd_suspend+0x208/0x83c > ufshcd_runtime_suspend+0x40/0x154 > ufshcd_pltfrm_runtime_suspend+0x14/0x20 > pm_generic_runtime_suspend+0x28/0x3c > __rpm_callback+0x80/0x2a4 > rpm_suspend+0x308/0x614 > rpm_idle+0x158/0x228 > pm_runtime_work+0x84/0xac > process_one_work+0x1f0/0x470 > worker_thread+0x26c/0x4c8 > kthread+0x13c/0x320 > ret_from_fork+0x10/0x18 > > Fix this by registering ufs device wlun as a scsi driver and > registering it for block runtime-pm. Also make this as a > supplier for all other luns. That way, this device wlun > suspends after all the consumers and resumes after > hba resumes. Can you also explain the new driver for RPMB WLUN and UAC changes here? And also mention that the driver will now always be runtime resumed before system suspend. > > Co-developed-by: Can Guo <cang@codeaurora.org> > Signed-off-by: Can Guo <cang@codeaurora.org> > Signed-off-by: Asutosh Das <asutoshd@codeaurora.org> > --- <SNIP> > +static void ufshcd_setup_links(struct ufs_hba *hba, struct scsi_device *sdev) > +{ > + struct device_link *link; > + > + /* > + * device wlun is the supplier & rest of the luns are consumers > + * This ensures that device wlun suspends after all other luns. > + */ > + if (hba->sdev_ufs_device) { > + link = device_link_add(&sdev->sdev_gendev, > + &hba->sdev_ufs_device->sdev_gendev, > + DL_FLAG_PM_RUNTIME|DL_FLAG_RPM_ACTIVE); "|" could be surrounded by spaces i.e. DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE); > + if (!link) { > + dev_err(&sdev->sdev_gendev, "Failed establishing link - %s\n", > + dev_name(&hba->sdev_ufs_device->sdev_gendev)); > + return; > + } > + hba->luns_avail--; > + /* Ignore REPORT_LUN wlun probing */ > + if (hba->luns_avail == 1) { > + ufshcd_rpm_put(hba); > + return; > + } > + } else { > + /* device wlun is probed */ > + hba->luns_avail--; > + } > +} <SNIP> > @@ -8916,42 +8906,214 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op) > if (hba->ee_usr_mask) > ufshcd_write_ee_control(hba); > > - hba->clk_gating.is_suspended = false; > - > - if (ufshcd_is_clkscaling_supported(hba)) > - ufshcd_clk_scaling_suspend(hba, false); > - > - /* Enable Auto-Hibernate if configured */ > - ufshcd_auto_hibern8_enable(hba); > + if (hba->clk_scaling.is_allowed) > + ufshcd_resume_clkscaling(hba); We don't use clks, regulators, clk gating, clk scaling, but the original code looks more correct because hba->clk_scaling.is_allowed will have been set to false by ufshcd_clk_scaling_suspend(hba, true) in __ufshcd_wl_suspend(). <SNIP> > +#ifdef CONFIG_PM_SLEEP > +static int ufshcd_wl_suspend(struct device *dev) > +{ > + struct scsi_device *sdev = to_scsi_device(dev); > + struct ufs_hba *hba; > + int ret; For below: int ret = 0; > + ktime_t start = ktime_get(); > + > + hba = shost_priv(sdev->host); > + down(&hba->host_sem); If we get here with dev runtime suspended, then skip __ufshcd_wl_suspend() i.e. if (pm_runtime_suspended(dev)) goto out; > + ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); > + if (ret) { > + dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, ret); > + up(&hba->host_sem); > + } else { > + hba->is_sys_suspended = true; > + } out: if (!ret) hba->is_sys_suspended = true; > + > + trace_ufshcd_wl_suspend(dev_name(dev), ret, > + ktime_to_us(ktime_sub(ktime_get(), start)), > + hba->curr_dev_pwr_mode, hba->uic_link_state); > + > + return ret; > +} <SNIP>
On 4/14/21 11:58 AM, Asutosh Das wrote: > [ ... ] Patches sent to the SCSI mailing list should not have a "scsi: " prefix in the subject. That prefix is inserted before any SCSI patches go into Martin's tree. > diff --git a/drivers/scsi/ufs/cdns-pltfrm.c b/drivers/scsi/ufs/cdns-pltfrm.c > index 13d9204..b9105e4 100644 > --- a/drivers/scsi/ufs/cdns-pltfrm.c > +++ b/drivers/scsi/ufs/cdns-pltfrm.c > @@ -323,6 +323,8 @@ static const struct dev_pm_ops cdns_ufs_dev_pm_ops = { > .runtime_suspend = ufshcd_pltfrm_runtime_suspend, > .runtime_resume = ufshcd_pltfrm_runtime_resume, > .runtime_idle = ufshcd_pltfrm_runtime_idle, > + .prepare = ufshcd_suspend_prepare, > + .complete = ufshcd_resume_complete, > }; > > static struct platform_driver cdns_ufs_pltfrm_driver = { > diff --git a/drivers/scsi/ufs/tc-dwc-g210-pci.c b/drivers/scsi/ufs/tc-dwc-g210-pci.c > index 67a6a61..b01db12 100644 > --- a/drivers/scsi/ufs/tc-dwc-g210-pci.c > +++ b/drivers/scsi/ufs/tc-dwc-g210-pci.c > @@ -148,6 +148,8 @@ static const struct dev_pm_ops tc_dwc_g210_pci_pm_ops = { > .runtime_suspend = tc_dwc_g210_pci_runtime_suspend, > .runtime_resume = tc_dwc_g210_pci_runtime_resume, > .runtime_idle = tc_dwc_g210_pci_runtime_idle, > + .prepare = ufshcd_suspend_prepare, > + .complete = ufshcd_resume_complete, > }; [ ... ] > --- a/drivers/scsi/ufs/ufs-exynos.c > +++ b/drivers/scsi/ufs/ufs-exynos.c > @@ -1267,6 +1267,8 @@ static const struct dev_pm_ops exynos_ufs_pm_ops = { > .runtime_suspend = ufshcd_pltfrm_runtime_suspend, > .runtime_resume = ufshcd_pltfrm_runtime_resume, > .runtime_idle = ufshcd_pltfrm_runtime_idle, > + .prepare = ufshcd_suspend_prepare, > + .complete = ufshcd_resume_complete, > }; > > static struct platform_driver exynos_ufs_pltform = { > diff --git a/drivers/scsi/ufs/ufs-hisi.c b/drivers/scsi/ufs/ufs-hisi.c > index 0aa5813..d463b44 100644 > --- a/drivers/scsi/ufs/ufs-hisi.c > +++ b/drivers/scsi/ufs/ufs-hisi.c > @@ -574,6 +574,8 @@ static const struct dev_pm_ops ufs_hisi_pm_ops = { > .runtime_suspend = ufshcd_pltfrm_runtime_suspend, > .runtime_resume = ufshcd_pltfrm_runtime_resume, > .runtime_idle = ufshcd_pltfrm_runtime_idle, > + .prepare = ufshcd_suspend_prepare, > + .complete = ufshcd_resume_complete, > }; A minor comment about source code formatting: please make sure that the equality signs are aligned in struct dev_pm_ops definitions. > +static inline bool is_rpmb_wlun(struct scsi_device *sdev) > +{ > + return (sdev->lun == ufshcd_upiu_wlun_to_scsi_wlun(UFS_UPIU_RPMB_WLUN)); > +} > + > +static inline bool is_device_wlun(struct scsi_device *sdev) > +{ > + return (sdev->lun == > + ufshcd_upiu_wlun_to_scsi_wlun(UFS_UPIU_UFS_DEVICE_WLUN)); > +} The Linux kernel coding style requires not to surround expressions with parentheses in return statements. > /** > + * ufshcd_setup_links - associate link b/w device wlun and other luns > + * @sdev: pointer to SCSI device > + * @hba: pointer to ufs hba > + */ > +static void ufshcd_setup_links(struct ufs_hba *hba, struct scsi_device *sdev) > +{ > + struct device_link *link; > + > + /* > + * device wlun is the supplier & rest of the luns are consumers > + * This ensures that device wlun suspends after all other luns. > + */ > + if (hba->sdev_ufs_device) { > + link = device_link_add(&sdev->sdev_gendev, > + &hba->sdev_ufs_device->sdev_gendev, > + DL_FLAG_PM_RUNTIME|DL_FLAG_RPM_ACTIVE); > + if (!link) { > + dev_err(&sdev->sdev_gendev, "Failed establishing link - %s\n", > + dev_name(&hba->sdev_ufs_device->sdev_gendev)); > + return; > + } > + hba->luns_avail--; > + /* Ignore REPORT_LUN wlun probing */ > + if (hba->luns_avail == 1) { > + ufshcd_rpm_put(hba); > + return; > + } > + } else { > + /* device wlun is probed */ > + hba->luns_avail--; > + } > +} Please add a comment that explains that it is assumed that the WLUNs are scanned before the other LUNs. > @@ -4862,8 +4913,13 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) > blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1); > if (hba->quirks & UFSHCD_QUIRK_ALIGN_SG_WITH_PAGE_SIZE) > blk_queue_update_dma_alignment(q, PAGE_SIZE - 1); > - > - if (ufshcd_is_rpm_autosuspend_allowed(hba)) > + /* > + * Block runtime-pm until all consumers are added. > + * Refer ufshcd_setup_links(). > + */ > + if (is_device_wlun(sdev)) > + pm_runtime_get_noresume(&sdev->sdev_gendev); > + else if (ufshcd_is_rpm_autosuspend_allowed(hba)) > sdev->rpm_autosuspend = 1; > > ufshcd_crypto_setup_rq_keyslot_manager(hba, q); The following code is executed before ufshcd_async_scan() is called: dev = hba->dev; [ ... ] /* Hold auto suspend until async scan completes */ pm_runtime_get_sync(dev); and the following code occurs in ufshcd_add_lus(): pm_runtime_put_sync(hba->dev); Isn't that sufficient to postpone enabling of runtime PM until LUN scanning has finished? Or in other words, is adding a pm_runtime_get_noresume() call in ufshcd_slave_configure() really necessary? > @@ -4979,15 +5035,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) > */ > if (!hba->pm_op_in_progress && > !ufshcd_eh_in_progress(hba) && > - ufshcd_is_exception_event(lrbp->ucd_rsp_ptr) && > - schedule_work(&hba->eeh_work)) { > - /* > - * Prevent suspend once eeh_work is scheduled > - * to avoid deadlock between ufshcd_suspend > - * and exception event handler. > - */ > - pm_runtime_get_noresume(hba->dev); > - } > + ufshcd_is_exception_event(lrbp->ucd_rsp_ptr)) > + /* Flushed in suspend */ > + schedule_work(&hba->eeh_work); What makes it safe to leave out the above pm_runtime_get_noresume() call? Thanks, Bart.
Hi Bart! > Patches sent to the SCSI mailing list should not have a "scsi: " prefix > in the subject. That prefix is inserted before any SCSI patches go into > Martin's tree. This doesn't actually matter. My script will add the prefix if it's not present.
On 4/15/2021 4:11 PM, Bart Van Assche wrote: > On 4/14/21 11:58 AM, Asutosh Das wrote: >> [ ... ] > Hi Bart, Thanks for the comments. I will fix the comments in the next version. > The following code is executed before ufshcd_async_scan() is called: > > dev = hba->dev; > [ ... ] > /* Hold auto suspend until async scan completes */ > pm_runtime_get_sync(dev); > That would only keep the hba runtime resumed. At this point of time the luns are not detected yet. > and the following code occurs in ufshcd_add_lus(): > > pm_runtime_put_sync(hba->dev); > > Isn't that sufficient to postpone enabling of runtime PM until LUN > scanning has finished? Or in other words, is adding a > pm_runtime_get_noresume() call in ufshcd_slave_configure() really necessary? > Yes, because the supplier (device wlun) may be suspended otherwise in scsi_sysfs_add_sdev(). >> @@ -4979,15 +5035,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) >> */ >> if (!hba->pm_op_in_progress && >> !ufshcd_eh_in_progress(hba) && >> - ufshcd_is_exception_event(lrbp->ucd_rsp_ptr) && >> - schedule_work(&hba->eeh_work)) { >> - /* >> - * Prevent suspend once eeh_work is scheduled >> - * to avoid deadlock between ufshcd_suspend >> - * and exception event handler. >> - */ >> - pm_runtime_get_noresume(hba->dev); >> - } >> + ufshcd_is_exception_event(lrbp->ucd_rsp_ptr)) >> + /* Flushed in suspend */ >> + schedule_work(&hba->eeh_work); > > What makes it safe to leave out the above pm_runtime_get_noresume() call? > The __ufshcd_wl_suspend() would flush this work so that it doesn't run after suspend. > Thanks, > > Bart. > -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, Linux Foundation Collaborative Project