mbox series

[v18,0/2] Enable power management for ufs wlun

Message ID cover.1618426513.git.asutoshd@codeaurora.org
Headers show
Series Enable power management for ufs wlun | expand

Message

Asutosh Das (asd) April 14, 2021, 6:58 p.m. UTC
This patch attempts to fix a deadlock in ufs while sending SSU.
Recently, blk_queue_enter() added a check to not process requests if the
queue is suspended. That leads to a resume of the associated device which
is suspended. In ufs, that device is ufs device wlun and it's parent is
ufs_hba. This resume tries to resume ufs device wlun which in turn tries
to resume ufs_hba, which is already in the process of suspending, thus
causing a deadlock.

This patch takes care of:
* Suspending the ufs device lun only after all other luns are suspended
* Sending SSU during ufs device wlun suspend
* Clearing uac for rpmb and ufs device wlun
* Not sending commands to the device during host suspend

v17 -> v18:
- Addressed Adrian's comments

v16 -> v17:
- Addressed Adrian's & Daejun's comments

v15 -> v16:
- Brought back the missing changes
  * Added scsi_autopm_[get/put] to ufs_debugfs[get/put]_user_access()
  * Fix ufshcd_wl_poweroff()

v14 -> v15:
- Rebased on 5.13/scsi-staging

v13 -> v14:
- Addressed Adrian's comments
  * Rebased it on top of scsi-next
  * Added scsi_autopm_[get/put] to ufs_debugfs[get/put]_user_access()
  * Resume the device in ufshcd_remove()
  * Unregister ufs_rpmb_wlun before ufs_dev_wlun
  * hba->shutting_down moved to ufshcd_wl_shutdown()

v12 -> v13:
- Addressed Adrian's comments
  * Paired pm_runtime_get_noresume() with pm_runtime_put()
  * no rpm_autosuspend for ufs device wlun
  * Moved runtime-pm init functionality to ufshcd_wl_probe()
- Addressed Bart's comments
  * Expanded abbrevs in commit message

v11 -> v12:
- Addressed Adrian's comments
  * Fixed ahit for Mediatek driver
  * Fixed error handling in ufshcd_core_init()
  * Tested this patch and the issue is still seen.

v10 -> v11:
- Fixed supplier suspending before consumer race
- Addressed Adrian's comments
  * Added proper resume/suspend cb to ufshcd_auto_hibern8_update()
  * Cosmetic changes to ufshcd-pci.c
  * Cleaned up ufshcd_system_suspend()
  * Added ufshcd_debugfs_eh_exit to ufshcd_core_init()

v9 -> v10:
- Addressed Adrian's comments
  * Moved suspend/resume vops to __ufshcd_wl_[suspend/resume]()
  * Added correct resume in ufs_bsg

v8 -> v9:
- Addressed Adrian's comments
  * Moved link transition to __ufshcd_wl_[suspend/resume]()
  * Fixed the other minor comments

v7 -> v8:
- Addressed Adrian's comments
  * Removed separate autosuspend delay for ufs-device lun
  * Fixed the ee handler getting scheduled during pm
  * Always runtime resume in suspend_prepare()
  * Added CONFIG_PM_SLEEP where needed
  
v6 -> v7:
  * Resume the ufs device before shutting it down

v5 -> v6:
- Addressed Adrian's comments
  * Added complete() cb
  * Added suspend_prepare() and complete() to all drivers
  * Moved suspend_prepare() and complete() to ufshcd
  * .poweroff() uses ufhcd_wl_poweroff()
  * Removed several forward declarations
  * Moved scsi_register_driver() to ufshcd_core_init()

v4 -> v5:
- Addressed Adrian's comments
  * Used the rpmb driver contributed by Adrian
  * Runtime-resume the ufs device during suspend to honor spm-lvl
  * Unregister the scsi_driver in ufshcd_remove()
  * Currently shutdown() puts the ufs device to power-down mode
    so, just removed ufshcd_pci_poweroff()
  * Quiesce the scsi device during shutdown instead of remove

v3 RFC -> v4:
- Addressed Bart's comments
  * Except that I didn't get any checkpatch failures
- Addressed Avri's comments
- Addressed Adrian's comments
  * Added a check for deepsleep power mode
  * Removed a couple of forward declarations
  * Didn't separate the scsi drivers because in rpmb case it just sends uac
    in resume and it seemed pretty neat to me.
- Added sysfs changes to resume the devices before accessing


Asutosh Das (2):
  scsi: ufs: Enable power management for wlun
  ufs: sysfs: Resume the proper scsi device

 drivers/scsi/ufs/cdns-pltfrm.c     |   2 +
 drivers/scsi/ufs/tc-dwc-g210-pci.c |   2 +
 drivers/scsi/ufs/ufs-debugfs.c     |   6 +-
 drivers/scsi/ufs/ufs-debugfs.h     |   2 +-
 drivers/scsi/ufs/ufs-exynos.c      |   2 +
 drivers/scsi/ufs/ufs-hisi.c        |   2 +
 drivers/scsi/ufs/ufs-mediatek.c    |  12 +-
 drivers/scsi/ufs/ufs-qcom.c        |   2 +
 drivers/scsi/ufs/ufs-sysfs.c       |  24 +-
 drivers/scsi/ufs/ufs_bsg.c         |   6 +-
 drivers/scsi/ufs/ufshcd-pci.c      |  36 +-
 drivers/scsi/ufs/ufshcd.c          | 692 +++++++++++++++++++++++++------------
 drivers/scsi/ufs/ufshcd.h          |  21 ++
 include/trace/events/ufs.h         |  20 ++
 14 files changed, 561 insertions(+), 268 deletions(-)

Comments

Adrian Hunter April 15, 2021, 11:06 a.m. UTC | #1
On 14/04/21 9:58 pm, Asutosh Das wrote:
> During runtime-suspend of ufs host, the scsi devices are
> already suspended and so are the queues associated with them.
> But the ufs host sends SSU (START_STOP_UNIT) to wlun
> during its runtime-suspend.
> During the process blk_queue_enter checks if the queue is not in
> suspended state. If so, it waits for the queue to resume, and never
> comes out of it.
> The commit
> (d55d15a33: scsi: block: Do not accept any requests while suspended)
> adds the check if the queue is in suspended state in blk_queue_enter().
> 
> Call trace:
>  __switch_to+0x174/0x2c4
>  __schedule+0x478/0x764
>  schedule+0x9c/0xe0
>  blk_queue_enter+0x158/0x228
>  blk_mq_alloc_request+0x40/0xa4
>  blk_get_request+0x2c/0x70
>  __scsi_execute+0x60/0x1c4
>  ufshcd_set_dev_pwr_mode+0x124/0x1e4
>  ufshcd_suspend+0x208/0x83c
>  ufshcd_runtime_suspend+0x40/0x154
>  ufshcd_pltfrm_runtime_suspend+0x14/0x20
>  pm_generic_runtime_suspend+0x28/0x3c
>  __rpm_callback+0x80/0x2a4
>  rpm_suspend+0x308/0x614
>  rpm_idle+0x158/0x228
>  pm_runtime_work+0x84/0xac
>  process_one_work+0x1f0/0x470
>  worker_thread+0x26c/0x4c8
>  kthread+0x13c/0x320
>  ret_from_fork+0x10/0x18
> 
> Fix this by registering ufs device wlun as a scsi driver and
> registering it for block runtime-pm. Also make this as a
> supplier for all other luns. That way, this device wlun
> suspends after all the consumers and resumes after
> hba resumes.

Can you also explain the new driver for RPMB WLUN and UAC changes here?

And also mention that the driver will now always be runtime
resumed before system suspend.

> 
> Co-developed-by: Can Guo <cang@codeaurora.org>
> Signed-off-by: Can Guo <cang@codeaurora.org>
> Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
> ---

<SNIP>

> +static void ufshcd_setup_links(struct ufs_hba *hba, struct scsi_device *sdev)
> +{
> +	struct device_link *link;
> +
> +	/*
> +	 * device wlun is the supplier & rest of the luns are consumers
> +	 * This ensures that device wlun suspends after all other luns.
> +	 */
> +	if (hba->sdev_ufs_device) {
> +		link = device_link_add(&sdev->sdev_gendev,
> +				       &hba->sdev_ufs_device->sdev_gendev,
> +				       DL_FLAG_PM_RUNTIME|DL_FLAG_RPM_ACTIVE);

"|" could be surrounded by spaces i.e.
				       DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE);

> +		if (!link) {
> +			dev_err(&sdev->sdev_gendev, "Failed establishing link - %s\n",
> +				dev_name(&hba->sdev_ufs_device->sdev_gendev));
> +			return;
> +		}
> +		hba->luns_avail--;
> +		/* Ignore REPORT_LUN wlun probing */
> +		if (hba->luns_avail == 1) {
> +			ufshcd_rpm_put(hba);
> +			return;
> +		}
> +	} else {
> +		/* device wlun is probed */
> +		hba->luns_avail--;
> +	}
> +}

<SNIP>

> @@ -8916,42 +8906,214 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
>  	if (hba->ee_usr_mask)
>  		ufshcd_write_ee_control(hba);
>  
> -	hba->clk_gating.is_suspended = false;
> -
> -	if (ufshcd_is_clkscaling_supported(hba))
> -		ufshcd_clk_scaling_suspend(hba, false);
> -
> -	/* Enable Auto-Hibernate if configured */
> -	ufshcd_auto_hibern8_enable(hba);
> +	if (hba->clk_scaling.is_allowed)
> +		ufshcd_resume_clkscaling(hba);

We don't use clks, regulators, clk gating, clk scaling, but the
original code looks more correct because hba->clk_scaling.is_allowed
will have been set to false by ufshcd_clk_scaling_suspend(hba, true)
in __ufshcd_wl_suspend().

<SNIP>

> +#ifdef CONFIG_PM_SLEEP
> +static int ufshcd_wl_suspend(struct device *dev)
> +{
> +	struct scsi_device *sdev = to_scsi_device(dev);
> +	struct ufs_hba *hba;
> +	int ret;

For below:
	int ret = 0;

> +	ktime_t start = ktime_get();
> +
> +	hba = shost_priv(sdev->host);
> +	down(&hba->host_sem);

If we get here with dev runtime suspended, then skip
__ufshcd_wl_suspend() i.e.

	if (pm_runtime_suspended(dev))
		goto out;

> +	ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM);
> +	if (ret) {
> +		dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__,  ret);
> +		up(&hba->host_sem);
> +	} else {
> +		hba->is_sys_suspended = true;
> +	}

out:
	if (!ret)
		hba->is_sys_suspended = true;

> +
> +	trace_ufshcd_wl_suspend(dev_name(dev), ret,
> +		ktime_to_us(ktime_sub(ktime_get(), start)),
> +		hba->curr_dev_pwr_mode, hba->uic_link_state);
> +
> +	return ret;
> +}

<SNIP>
Bart Van Assche April 15, 2021, 11:11 p.m. UTC | #2
On 4/14/21 11:58 AM, Asutosh Das wrote:
> [ ... ]

Patches sent to the SCSI mailing list should not have a "scsi: " prefix
in the subject. That prefix is inserted before any SCSI patches go into
Martin's tree.

> diff --git a/drivers/scsi/ufs/cdns-pltfrm.c b/drivers/scsi/ufs/cdns-pltfrm.c
> index 13d9204..b9105e4 100644
> --- a/drivers/scsi/ufs/cdns-pltfrm.c
> +++ b/drivers/scsi/ufs/cdns-pltfrm.c
> @@ -323,6 +323,8 @@ static const struct dev_pm_ops cdns_ufs_dev_pm_ops = {
>  	.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
>  	.runtime_resume  = ufshcd_pltfrm_runtime_resume,
>  	.runtime_idle    = ufshcd_pltfrm_runtime_idle,
> +	.prepare	 = ufshcd_suspend_prepare,
> +	.complete	= ufshcd_resume_complete,
>  };
>  
>  static struct platform_driver cdns_ufs_pltfrm_driver = {
> diff --git a/drivers/scsi/ufs/tc-dwc-g210-pci.c b/drivers/scsi/ufs/tc-dwc-g210-pci.c
> index 67a6a61..b01db12 100644
> --- a/drivers/scsi/ufs/tc-dwc-g210-pci.c
> +++ b/drivers/scsi/ufs/tc-dwc-g210-pci.c
> @@ -148,6 +148,8 @@ static const struct dev_pm_ops tc_dwc_g210_pci_pm_ops = {
>  	.runtime_suspend = tc_dwc_g210_pci_runtime_suspend,
>  	.runtime_resume  = tc_dwc_g210_pci_runtime_resume,
>  	.runtime_idle    = tc_dwc_g210_pci_runtime_idle,
> +	.prepare	 = ufshcd_suspend_prepare,
> +	.complete	= ufshcd_resume_complete,
>  };

[ ... ]

> --- a/drivers/scsi/ufs/ufs-exynos.c
> +++ b/drivers/scsi/ufs/ufs-exynos.c
> @@ -1267,6 +1267,8 @@ static const struct dev_pm_ops exynos_ufs_pm_ops = {
>  	.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
>  	.runtime_resume  = ufshcd_pltfrm_runtime_resume,
>  	.runtime_idle    = ufshcd_pltfrm_runtime_idle,
> +	.prepare	 = ufshcd_suspend_prepare,
> +	.complete	= ufshcd_resume_complete,
>  };
>  
>  static struct platform_driver exynos_ufs_pltform = {
> diff --git a/drivers/scsi/ufs/ufs-hisi.c b/drivers/scsi/ufs/ufs-hisi.c
> index 0aa5813..d463b44 100644
> --- a/drivers/scsi/ufs/ufs-hisi.c
> +++ b/drivers/scsi/ufs/ufs-hisi.c
> @@ -574,6 +574,8 @@ static const struct dev_pm_ops ufs_hisi_pm_ops = {
>  	.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
>  	.runtime_resume  = ufshcd_pltfrm_runtime_resume,
>  	.runtime_idle    = ufshcd_pltfrm_runtime_idle,
> +	.prepare	 = ufshcd_suspend_prepare,
> +	.complete	= ufshcd_resume_complete,
>  };

A minor comment about source code formatting: please make sure that the
equality signs are aligned in struct dev_pm_ops definitions.

> +static inline bool is_rpmb_wlun(struct scsi_device *sdev)
> +{
> +	return (sdev->lun == ufshcd_upiu_wlun_to_scsi_wlun(UFS_UPIU_RPMB_WLUN));
> +}
> +
> +static inline bool is_device_wlun(struct scsi_device *sdev)
> +{
> +	return (sdev->lun ==
> +		ufshcd_upiu_wlun_to_scsi_wlun(UFS_UPIU_UFS_DEVICE_WLUN));
> +}

The Linux kernel coding style requires not to surround expressions with
parentheses in return statements.

>  /**
> + * ufshcd_setup_links - associate link b/w device wlun and other luns
> + * @sdev: pointer to SCSI device
> + * @hba: pointer to ufs hba
> + */
> +static void ufshcd_setup_links(struct ufs_hba *hba, struct scsi_device *sdev)
> +{
> +	struct device_link *link;
> +
> +	/*
> +	 * device wlun is the supplier & rest of the luns are consumers
> +	 * This ensures that device wlun suspends after all other luns.
> +	 */
> +	if (hba->sdev_ufs_device) {
> +		link = device_link_add(&sdev->sdev_gendev,
> +				       &hba->sdev_ufs_device->sdev_gendev,
> +				       DL_FLAG_PM_RUNTIME|DL_FLAG_RPM_ACTIVE);
> +		if (!link) {
> +			dev_err(&sdev->sdev_gendev, "Failed establishing link - %s\n",
> +				dev_name(&hba->sdev_ufs_device->sdev_gendev));
> +			return;
> +		}
> +		hba->luns_avail--;
> +		/* Ignore REPORT_LUN wlun probing */
> +		if (hba->luns_avail == 1) {
> +			ufshcd_rpm_put(hba);
> +			return;
> +		}
> +	} else {
> +		/* device wlun is probed */
> +		hba->luns_avail--;
> +	}
> +}

Please add a comment that explains that it is assumed that the WLUNs are
scanned before the other LUNs.

> @@ -4862,8 +4913,13 @@ static int ufshcd_slave_configure(struct scsi_device *sdev)
>  	blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1);
>  	if (hba->quirks & UFSHCD_QUIRK_ALIGN_SG_WITH_PAGE_SIZE)
>  		blk_queue_update_dma_alignment(q, PAGE_SIZE - 1);
> -
> -	if (ufshcd_is_rpm_autosuspend_allowed(hba))
> +	/*
> +	 * Block runtime-pm until all consumers are added.
> +	 * Refer ufshcd_setup_links().
> +	 */
> +	if (is_device_wlun(sdev))
> +		pm_runtime_get_noresume(&sdev->sdev_gendev);
> +	else if (ufshcd_is_rpm_autosuspend_allowed(hba))
>  		sdev->rpm_autosuspend = 1;
>  
>  	ufshcd_crypto_setup_rq_keyslot_manager(hba, q);

The following code is executed before ufshcd_async_scan() is called:

	dev = hba->dev;
	[ ... ]
	/* Hold auto suspend until async scan completes */
	pm_runtime_get_sync(dev);

and the following code occurs in ufshcd_add_lus():

	pm_runtime_put_sync(hba->dev);

Isn't that sufficient to postpone enabling of runtime PM until LUN
scanning has finished? Or in other words, is adding a
pm_runtime_get_noresume() call in ufshcd_slave_configure() really necessary?

> @@ -4979,15 +5035,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
>  			 */
>  			if (!hba->pm_op_in_progress &&
>  			    !ufshcd_eh_in_progress(hba) &&
> -			    ufshcd_is_exception_event(lrbp->ucd_rsp_ptr) &&
> -			    schedule_work(&hba->eeh_work)) {
> -				/*
> -				 * Prevent suspend once eeh_work is scheduled
> -				 * to avoid deadlock between ufshcd_suspend
> -				 * and exception event handler.
> -				 */
> -				pm_runtime_get_noresume(hba->dev);
> -			}
> +			    ufshcd_is_exception_event(lrbp->ucd_rsp_ptr))
> +				/* Flushed in suspend */
> +				schedule_work(&hba->eeh_work);

What makes it safe to leave out the above pm_runtime_get_noresume() call?

Thanks,

Bart.
Martin K. Petersen April 16, 2021, 2:10 a.m. UTC | #3
Hi Bart!

> Patches sent to the SCSI mailing list should not have a "scsi: " prefix
> in the subject. That prefix is inserted before any SCSI patches go into
> Martin's tree.

This doesn't actually matter. My script will add the prefix if it's not
present.
Asutosh Das (asd) April 16, 2021, 6:21 p.m. UTC | #4
On 4/15/2021 4:11 PM, Bart Van Assche wrote:
> On 4/14/21 11:58 AM, Asutosh Das wrote:

>> [ ... ]

> 

Hi Bart,
Thanks for the comments. I will fix the comments in the next version.

> The following code is executed before ufshcd_async_scan() is called:

> 

> 	dev = hba->dev;

> 	[ ... ]

> 	/* Hold auto suspend until async scan completes */

> 	pm_runtime_get_sync(dev);

> 

That would only keep the hba runtime resumed. At this point of time the 
luns are not detected yet.
> and the following code occurs in ufshcd_add_lus():

> 

> 	pm_runtime_put_sync(hba->dev);

> 

> Isn't that sufficient to postpone enabling of runtime PM until LUN

> scanning has finished? Or in other words, is adding a

> pm_runtime_get_noresume() call in ufshcd_slave_configure() really necessary?

> 

Yes, because the supplier (device wlun) may be suspended otherwise in 
scsi_sysfs_add_sdev().
>> @@ -4979,15 +5035,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)

>>   			 */

>>   			if (!hba->pm_op_in_progress &&

>>   			    !ufshcd_eh_in_progress(hba) &&

>> -			    ufshcd_is_exception_event(lrbp->ucd_rsp_ptr) &&

>> -			    schedule_work(&hba->eeh_work)) {

>> -				/*

>> -				 * Prevent suspend once eeh_work is scheduled

>> -				 * to avoid deadlock between ufshcd_suspend

>> -				 * and exception event handler.

>> -				 */

>> -				pm_runtime_get_noresume(hba->dev);

>> -			}

>> +			    ufshcd_is_exception_event(lrbp->ucd_rsp_ptr))

>> +				/* Flushed in suspend */

>> +				schedule_work(&hba->eeh_work);

> 

> What makes it safe to leave out the above pm_runtime_get_noresume() call?

> 

The __ufshcd_wl_suspend() would flush this work so that it doesn't run 
after suspend.
> Thanks,

> 

> Bart.

> 



-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project