[v1] mmc: core: check R1_STATUS for erase/trim/discard

Message ID	20240423200234.21480-1-kamal.dasu@broadcom.com
State	New
Headers	show Received: from mail-oi1-f182.google.com (mail-oi1-f182.google.com [209.85.167.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2FBDB14198A for <linux-mmc@vger.kernel.org>; Tue, 23 Apr 2024 20:02:43 +0000 (UTC) From: Kamal Dasu <kamal.dasu@broadcom.com> To: ulf.hansson@linaro.org, linux-kernel@vger.kernel.org, linux-mmc@vger.kernel.org, ludovic.barre@st.com Cc: f.fainelli@gmail.com, bcm-kernel-feedback-list@broadcom.com, Kamal Dasu <kamal.dasu@broadcom.com> Subject: [PATCH v1] mmc: core: check R1_STATUS for erase/trim/discard Date: Tue, 23 Apr 2024 16:02:34 -0400 Message-Id: <20240423200234.21480-1-kamal.dasu@broadcom.com> Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha-256; boundary="00000000000079113f0616c90992" Precedence: bulk
Series	[v1] mmc: core: check R1_STATUS for erase/trim/discard \| expand [v1] mmc: core: check R1_STATUS for erase/trim/discard

Kamal Dasu April 23, 2024, 8:02 p.m. UTC

When erase/trim/discard completion was converted to mmc_poll_for_busy(),
optional ->card_busy() host ops support was added. sdhci card->busy()
could return busy for long periods to cause mmc_do_erase() to block during
discard operation as shown below during mkfs.f2fs :

Info: [/dev/mmcblk1p9] Discarding device
[   39.597258] sysrq: Show Blocked State
[   39.601183] task:mkfs.f2fs       state:D stack:0     pid:1561  tgid:1561  ppid:1542   flags:0x0000000d
[   39.610609] Call trace:
[   39.613098]  __switch_to+0xd8/0xf4
[   39.616582]  __schedule+0x440/0x4f4
[   39.620137]  schedule+0x2c/0x48
[   39.623341]  schedule_hrtimeout_range_clock+0xe0/0x114
[   39.628562]  schedule_hrtimeout_range+0x10/0x18
[   39.633169]  usleep_range_state+0x5c/0x90
[   39.637253]  __mmc_poll_for_busy+0xec/0x128
[   39.641514]  mmc_poll_for_busy+0x48/0x70
[   39.645511]  mmc_do_erase+0x1ec/0x210
[   39.649237]  mmc_erase+0x1b4/0x1d4
[   39.652701]  mmc_blk_mq_issue_rq+0x35c/0x6ac
[   39.657037]  mmc_mq_queue_rq+0x18c/0x214
[   39.661022]  blk_mq_dispatch_rq_list+0x3a8/0x528
[   39.665722]  __blk_mq_sched_dispatch_requests+0x3a0/0x4ac
[   39.671198]  blk_mq_sched_dispatch_requests+0x28/0x5c
[   39.676322]  blk_mq_run_hw_queue+0x11c/0x12c
[   39.680668]  blk_mq_flush_plug_list+0x200/0x33c
[   39.685278]  blk_add_rq_to_plug+0x68/0xd8
[   39.689365]  blk_mq_submit_bio+0x3a4/0x458
[   39.693539]  __submit_bio+0x1c/0x80
[   39.697096]  submit_bio_noacct_nocheck+0x94/0x174
[   39.701875]  submit_bio_noacct+0x1b0/0x22c
[   39.706042]  submit_bio+0xac/0xe8
[   39.709424]  blk_next_bio+0x4c/0x5c
[   39.712973]  blkdev_issue_secure_erase+0x118/0x170
[   39.717835]  blkdev_common_ioctl+0x374/0x728
[   39.722175]  blkdev_ioctl+0x8c/0x2b0
[   39.725816]  vfs_ioctl+0x24/0x40
[   39.729117]  __arm64_sys_ioctl+0x5c/0x8c
[   39.733114]  invoke_syscall+0x68/0xec
[   39.736839]  el0_svc_common.constprop.0+0x70/0xd8
[   39.741609]  do_el0_svc+0x18/0x20
[   39.744981]  el0_svc+0x68/0x94
[   39.748107]  el0t_64_sync_handler+0x88/0x124
[   39.752455]  el0t_64_sync+0x168/0x16c

Fix skips the card->busy() and uses MMC_SEND_STATUS and R1_STATUS
check for MMC_ERASE_BUSY busy_cmd case in the mmc_busy_cb() function.

Fixes: 0d84c3e6a5b2 ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard")
Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
---
 drivers/mmc/core/mmc_ops.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Ulf Hansson April 25, 2024, 4:18 p.m. UTC | #1

+ Wolfram, Adrian (to see if they have some input)

On Tue, 23 Apr 2024 at 22:02, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
>
> When erase/trim/discard completion was converted to mmc_poll_for_busy(),
> optional ->card_busy() host ops support was added. sdhci card->busy()
> could return busy for long periods to cause mmc_do_erase() to block during
> discard operation as shown below during mkfs.f2fs :
>
> Info: [/dev/mmcblk1p9] Discarding device
> [   39.597258] sysrq: Show Blocked State
> [   39.601183] task:mkfs.f2fs       state:D stack:0     pid:1561  tgid:1561  ppid:1542   flags:0x0000000d
> [   39.610609] Call trace:
> [   39.613098]  __switch_to+0xd8/0xf4
> [   39.616582]  __schedule+0x440/0x4f4
> [   39.620137]  schedule+0x2c/0x48
> [   39.623341]  schedule_hrtimeout_range_clock+0xe0/0x114
> [   39.628562]  schedule_hrtimeout_range+0x10/0x18
> [   39.633169]  usleep_range_state+0x5c/0x90
> [   39.637253]  __mmc_poll_for_busy+0xec/0x128
> [   39.641514]  mmc_poll_for_busy+0x48/0x70
> [   39.645511]  mmc_do_erase+0x1ec/0x210
> [   39.649237]  mmc_erase+0x1b4/0x1d4
> [   39.652701]  mmc_blk_mq_issue_rq+0x35c/0x6ac
> [   39.657037]  mmc_mq_queue_rq+0x18c/0x214
> [   39.661022]  blk_mq_dispatch_rq_list+0x3a8/0x528
> [   39.665722]  __blk_mq_sched_dispatch_requests+0x3a0/0x4ac
> [   39.671198]  blk_mq_sched_dispatch_requests+0x28/0x5c
> [   39.676322]  blk_mq_run_hw_queue+0x11c/0x12c
> [   39.680668]  blk_mq_flush_plug_list+0x200/0x33c
> [   39.685278]  blk_add_rq_to_plug+0x68/0xd8
> [   39.689365]  blk_mq_submit_bio+0x3a4/0x458
> [   39.693539]  __submit_bio+0x1c/0x80
> [   39.697096]  submit_bio_noacct_nocheck+0x94/0x174
> [   39.701875]  submit_bio_noacct+0x1b0/0x22c
> [   39.706042]  submit_bio+0xac/0xe8
> [   39.709424]  blk_next_bio+0x4c/0x5c
> [   39.712973]  blkdev_issue_secure_erase+0x118/0x170
> [   39.717835]  blkdev_common_ioctl+0x374/0x728
> [   39.722175]  blkdev_ioctl+0x8c/0x2b0
> [   39.725816]  vfs_ioctl+0x24/0x40
> [   39.729117]  __arm64_sys_ioctl+0x5c/0x8c
> [   39.733114]  invoke_syscall+0x68/0xec
> [   39.736839]  el0_svc_common.constprop.0+0x70/0xd8
> [   39.741609]  do_el0_svc+0x18/0x20
> [   39.744981]  el0_svc+0x68/0x94
> [   39.748107]  el0t_64_sync_handler+0x88/0x124
> [   39.752455]  el0t_64_sync+0x168/0x16c

Thanks for the detailed log!

>
> Fix skips the card->busy() and uses MMC_SEND_STATUS and R1_STATUS
> check for MMC_ERASE_BUSY busy_cmd case in the mmc_busy_cb() function.
>
> Fixes: 0d84c3e6a5b2 ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard")
> Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
> ---
>  drivers/mmc/core/mmc_ops.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> index 3b3adbddf664..603fbd78c342 100644
> --- a/drivers/mmc/core/mmc_ops.c
> +++ b/drivers/mmc/core/mmc_ops.c
> @@ -464,7 +464,8 @@ static int mmc_busy_cb(void *cb_data, bool *busy)
>         u32 status = 0;
>         int err;
>
> -       if (data->busy_cmd != MMC_BUSY_IO && host->ops->card_busy) {
> +       if (data->busy_cmd != MMC_BUSY_IO &&
> +           data->busy_cmd != MMC_BUSY_ERASE && host->ops->card_busy) {
>                 *busy = host->ops->card_busy(host);
>                 return 0;
>         }

So it seems like the ->card_busy() callback is broken in for your mmc
host-driver and platform. Can you perhaps provide the information
about what HW/driver you are using?

The point with using the ->card_busy() callback, is to avoid sending
the CMD13. Ideally it should be cheaper/faster and in most cases it
translates to a read of a register. For larger erases, we would
probably end up sending the CMD13 periodically every 32-64 ms, which
shouldn't be a problem. However, for smaller erases and discards, we
may want the benefit the ->card_busy() callback provides us.

I would suggest that we first try to fix the implementation of the
->card_busy() callback for your HW. If that isn't possible or fails,
then let's consider the approach you have taken in the $subject patch.

Kind regards
Uffe

Kamal Dasu April 26, 2024, 3:10 p.m. UTC | #2

On Fri, Apr 26, 2024 at 2:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 25/04/24 19:18, Ulf Hansson wrote:
> > + Wolfram, Adrian (to see if they have some input)
> >
> > On Tue, 23 Apr 2024 at 22:02, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
> >>
> >> When erase/trim/discard completion was converted to mmc_poll_for_busy(),
> >> optional ->card_busy() host ops support was added. sdhci card->busy()
> >> could return busy for long periods to cause mmc_do_erase() to block during
> >> discard operation as shown below during mkfs.f2fs :
> >>
> >> Info: [/dev/mmcblk1p9] Discarding device
> >> [   39.597258] sysrq: Show Blocked State
> >> [   39.601183] task:mkfs.f2fs       state:D stack:0     pid:1561  tgid:1561  ppid:1542   flags:0x0000000d
> >> [   39.610609] Call trace:
> >> [   39.613098]  __switch_to+0xd8/0xf4
> >> [   39.616582]  __schedule+0x440/0x4f4
> >> [   39.620137]  schedule+0x2c/0x48
> >> [   39.623341]  schedule_hrtimeout_range_clock+0xe0/0x114
> >> [   39.628562]  schedule_hrtimeout_range+0x10/0x18
> >> [   39.633169]  usleep_range_state+0x5c/0x90
> >> [   39.637253]  __mmc_poll_for_busy+0xec/0x128
> >> [   39.641514]  mmc_poll_for_busy+0x48/0x70
> >> [   39.645511]  mmc_do_erase+0x1ec/0x210
> >> [   39.649237]  mmc_erase+0x1b4/0x1d4
> >> [   39.652701]  mmc_blk_mq_issue_rq+0x35c/0x6ac
> >> [   39.657037]  mmc_mq_queue_rq+0x18c/0x214
> >> [   39.661022]  blk_mq_dispatch_rq_list+0x3a8/0x528
> >> [   39.665722]  __blk_mq_sched_dispatch_requests+0x3a0/0x4ac
> >> [   39.671198]  blk_mq_sched_dispatch_requests+0x28/0x5c
> >> [   39.676322]  blk_mq_run_hw_queue+0x11c/0x12c
> >> [   39.680668]  blk_mq_flush_plug_list+0x200/0x33c
> >> [   39.685278]  blk_add_rq_to_plug+0x68/0xd8
> >> [   39.689365]  blk_mq_submit_bio+0x3a4/0x458
> >> [   39.693539]  __submit_bio+0x1c/0x80
> >> [   39.697096]  submit_bio_noacct_nocheck+0x94/0x174
> >> [   39.701875]  submit_bio_noacct+0x1b0/0x22c
> >> [   39.706042]  submit_bio+0xac/0xe8
> >> [   39.709424]  blk_next_bio+0x4c/0x5c
> >> [   39.712973]  blkdev_issue_secure_erase+0x118/0x170
> >> [   39.717835]  blkdev_common_ioctl+0x374/0x728
> >> [   39.722175]  blkdev_ioctl+0x8c/0x2b0
> >> [   39.725816]  vfs_ioctl+0x24/0x40
> >> [   39.729117]  __arm64_sys_ioctl+0x5c/0x8c
> >> [   39.733114]  invoke_syscall+0x68/0xec
> >> [   39.736839]  el0_svc_common.constprop.0+0x70/0xd8
> >> [   39.741609]  do_el0_svc+0x18/0x20
> >> [   39.744981]  el0_svc+0x68/0x94
> >> [   39.748107]  el0t_64_sync_handler+0x88/0x124
> >> [   39.752455]  el0t_64_sync+0x168/0x16c
> >
> > Thanks for the detailed log!
> >
> >>
> >> Fix skips the card->busy() and uses MMC_SEND_STATUS and R1_STATUS
> >> check for MMC_ERASE_BUSY busy_cmd case in the mmc_busy_cb() function.
> >>
> >> Fixes: 0d84c3e6a5b2 ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard")
> >> Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
> >> ---
> >>  drivers/mmc/core/mmc_ops.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> >> index 3b3adbddf664..603fbd78c342 100644
> >> --- a/drivers/mmc/core/mmc_ops.c
> >> +++ b/drivers/mmc/core/mmc_ops.c
> >> @@ -464,7 +464,8 @@ static int mmc_busy_cb(void *cb_data, bool *busy)
> >>         u32 status = 0;
> >>         int err;
> >>
> >> -       if (data->busy_cmd != MMC_BUSY_IO && host->ops->card_busy) {
> >> +       if (data->busy_cmd != MMC_BUSY_IO &&
> >> +           data->busy_cmd != MMC_BUSY_ERASE && host->ops->card_busy) {
> >>                 *busy = host->ops->card_busy(host);
> >>                 return 0;
> >>         }
> >
> > So it seems like the ->card_busy() callback is broken in for your mmc
> > host-driver and platform. Can you perhaps provide the information
> > about what HW/driver you are using?
> >

Using the sdhci-brcmstb driver on a Broadcom Settop based SoC.

> > The point with using the ->card_busy() callback, is to avoid sending
> > the CMD13. Ideally it should be cheaper/faster and in most cases it
> > translates to a read of a register. For larger erases, we would
> > probably end up sending the CMD13 periodically every 32-64 ms, which
> > shouldn't be a problem. However, for smaller erases and discards, we
> > may want the benefit the ->card_busy() callback provides us.
> >

I have tested two scenarios. One is like the mkfs.f2fs app that calls :
ioctl(fd, BLKSECDISCARD, &range);

This has the following CMD and completion sequence:
{CMD35->CMD36->CMD38} and poll for  DAT0  signal via card->busy .
CMD38 has a response of R1b. The DAT0 (Busy line) will be driven by the device.
Busy (DAT0  = 0) is asserted by a device for  erasing blocks. Stays
busy in brcmstb sdhci controller.

With the additional change followed by CMD13 (response of R1), which
returns the device status, the
DAT0 will be pulled-up and next time we read the BUSY status it will
indicate it is not busy.

Also I tried the mmc util and that does not show the same issue with
exactly the same ranges, however in that case there are some
differences in the way the CMD sequence is sent for the entire discard
operation.
# mmc erase discard 0x000087a4 0x002effff /dev/mmcblk1
/* send erase cmd with multi-cmd */
ret = ioctl(dev_fd, MMC_IOC_MULTI_CMD, multi_cmd);

CMD35->CMD13->CMD36->CMD13->CMD38->CMD13
I do not see any hang in all the erase options discard, legacy, trim, trim2,
secure-trim used here with mmc util .

Also looking at JEDEC Standard No. 84-B51 Page 276, 277
"Once the erase groups are selected the host will send an ERASE
(CMD38) command. It is recom-
mended that the host terminates the sequence with a SEND_STATUS
(CMD13) to poll any additional
status information the Device may have (e.g., WP_ERASE_SKIP, etc.)."

> > I would suggest that we first try to fix the implementation of the
> > ->card_busy() callback for your HW. If that isn't possible or fails,
> > then let's consider the approach you have taken in the $subject patch.
>

I have verified this with hardware behavior and default
sdhci->card->busy() seems to work fine except
in the above sequence of CMD35->CMD36->CMD38 without any MMC_STATUS interleaved.

Maybe we can do both, check the card busy and send CMD13.
__mmc_poll_for_busy() does have
throttling changes as well that would limit the rate at which CMD13
will be sent.

> Note, sdhci drivers can override host->ops.  For example,
> sdhci-omap.c has:
>
>         host->mmc_host_ops.card_busy = sdhci_omap_card_busy;
>
> Probably, if ->card_busy() cannot be supported, then setting
> it to NULL would work.
>
>         host->mmc_host_ops.card_busy = NULL; /* Cannot detect card busy */
>

Yes, that's an option I have tested to be working and will have to
take in case  mmc_busy_cb() change  is not acceptable.

Thanks
Kamal

Ulf Hansson May 3, 2024, 1:09 p.m. UTC | #3

On Fri, 26 Apr 2024 at 17:11, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
>
> On Fri, Apr 26, 2024 at 2:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >
> > On 25/04/24 19:18, Ulf Hansson wrote:
> > > + Wolfram, Adrian (to see if they have some input)
> > >
> > > On Tue, 23 Apr 2024 at 22:02, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
> > >>
> > >> When erase/trim/discard completion was converted to mmc_poll_for_busy(),
> > >> optional ->card_busy() host ops support was added. sdhci card->busy()
> > >> could return busy for long periods to cause mmc_do_erase() to block during
> > >> discard operation as shown below during mkfs.f2fs :
> > >>
> > >> Info: [/dev/mmcblk1p9] Discarding device
> > >> [   39.597258] sysrq: Show Blocked State
> > >> [   39.601183] task:mkfs.f2fs       state:D stack:0     pid:1561  tgid:1561  ppid:1542   flags:0x0000000d
> > >> [   39.610609] Call trace:
> > >> [   39.613098]  __switch_to+0xd8/0xf4
> > >> [   39.616582]  __schedule+0x440/0x4f4
> > >> [   39.620137]  schedule+0x2c/0x48
> > >> [   39.623341]  schedule_hrtimeout_range_clock+0xe0/0x114
> > >> [   39.628562]  schedule_hrtimeout_range+0x10/0x18
> > >> [   39.633169]  usleep_range_state+0x5c/0x90
> > >> [   39.637253]  __mmc_poll_for_busy+0xec/0x128
> > >> [   39.641514]  mmc_poll_for_busy+0x48/0x70
> > >> [   39.645511]  mmc_do_erase+0x1ec/0x210
> > >> [   39.649237]  mmc_erase+0x1b4/0x1d4
> > >> [   39.652701]  mmc_blk_mq_issue_rq+0x35c/0x6ac
> > >> [   39.657037]  mmc_mq_queue_rq+0x18c/0x214
> > >> [   39.661022]  blk_mq_dispatch_rq_list+0x3a8/0x528
> > >> [   39.665722]  __blk_mq_sched_dispatch_requests+0x3a0/0x4ac
> > >> [   39.671198]  blk_mq_sched_dispatch_requests+0x28/0x5c
> > >> [   39.676322]  blk_mq_run_hw_queue+0x11c/0x12c
> > >> [   39.680668]  blk_mq_flush_plug_list+0x200/0x33c
> > >> [   39.685278]  blk_add_rq_to_plug+0x68/0xd8
> > >> [   39.689365]  blk_mq_submit_bio+0x3a4/0x458
> > >> [   39.693539]  __submit_bio+0x1c/0x80
> > >> [   39.697096]  submit_bio_noacct_nocheck+0x94/0x174
> > >> [   39.701875]  submit_bio_noacct+0x1b0/0x22c
> > >> [   39.706042]  submit_bio+0xac/0xe8
> > >> [   39.709424]  blk_next_bio+0x4c/0x5c
> > >> [   39.712973]  blkdev_issue_secure_erase+0x118/0x170
> > >> [   39.717835]  blkdev_common_ioctl+0x374/0x728
> > >> [   39.722175]  blkdev_ioctl+0x8c/0x2b0
> > >> [   39.725816]  vfs_ioctl+0x24/0x40
> > >> [   39.729117]  __arm64_sys_ioctl+0x5c/0x8c
> > >> [   39.733114]  invoke_syscall+0x68/0xec
> > >> [   39.736839]  el0_svc_common.constprop.0+0x70/0xd8
> > >> [   39.741609]  do_el0_svc+0x18/0x20
> > >> [   39.744981]  el0_svc+0x68/0x94
> > >> [   39.748107]  el0t_64_sync_handler+0x88/0x124
> > >> [   39.752455]  el0t_64_sync+0x168/0x16c
> > >
> > > Thanks for the detailed log!
> > >
> > >>
> > >> Fix skips the card->busy() and uses MMC_SEND_STATUS and R1_STATUS
> > >> check for MMC_ERASE_BUSY busy_cmd case in the mmc_busy_cb() function.
> > >>
> > >> Fixes: 0d84c3e6a5b2 ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard")
> > >> Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
> > >> ---
> > >>  drivers/mmc/core/mmc_ops.c | 3 ++-
> > >>  1 file changed, 2 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> > >> index 3b3adbddf664..603fbd78c342 100644
> > >> --- a/drivers/mmc/core/mmc_ops.c
> > >> +++ b/drivers/mmc/core/mmc_ops.c
> > >> @@ -464,7 +464,8 @@ static int mmc_busy_cb(void *cb_data, bool *busy)
> > >>         u32 status = 0;
> > >>         int err;
> > >>
> > >> -       if (data->busy_cmd != MMC_BUSY_IO && host->ops->card_busy) {
> > >> +       if (data->busy_cmd != MMC_BUSY_IO &&
> > >> +           data->busy_cmd != MMC_BUSY_ERASE && host->ops->card_busy) {
> > >>                 *busy = host->ops->card_busy(host);
> > >>                 return 0;
> > >>         }
> > >
> > > So it seems like the ->card_busy() callback is broken in for your mmc
> > > host-driver and platform. Can you perhaps provide the information
> > > about what HW/driver you are using?
> > >
>
> Using the sdhci-brcmstb driver on a Broadcom Settop based SoC.
>
> > > The point with using the ->card_busy() callback, is to avoid sending
> > > the CMD13. Ideally it should be cheaper/faster and in most cases it
> > > translates to a read of a register. For larger erases, we would
> > > probably end up sending the CMD13 periodically every 32-64 ms, which
> > > shouldn't be a problem. However, for smaller erases and discards, we
> > > may want the benefit the ->card_busy() callback provides us.
> > >
>
> I have tested two scenarios. One is like the mkfs.f2fs app that calls :
> ioctl(fd, BLKSECDISCARD, &range);
>
> This has the following CMD and completion sequence:
> {CMD35->CMD36->CMD38} and poll for  DAT0  signal via card->busy .
> CMD38 has a response of R1b. The DAT0 (Busy line) will be driven by the device.
> Busy (DAT0  = 0) is asserted by a device for  erasing blocks. Stays
> busy in brcmstb sdhci controller.

How big is the "range"?

Just so I get this right, it stays busy and we are waiting for the
timeout to fire? And ideally you think we should not be busy for that
long, right?

>
> With the additional change followed by CMD13 (response of R1), which
> returns the device status, the
> DAT0 will be pulled-up and next time we read the BUSY status it will
> indicate it is not busy.

So you are referring to read the BUSY status with you ->card_busy()
callback? Or did you actually verify that this is true from an
electrical point of view, by monitoring the DAT0 signal?

If the latter, perhaps it's the card that is failing and simply
requires CMD13 to be used to poll for busy. What card is this?

Have you tried different cards with the same platform/driver?

>
> Also I tried the mmc util and that does not show the same issue with
> exactly the same ranges, however in that case there are some
> differences in the way the CMD sequence is sent for the entire discard
> operation.
> # mmc erase discard 0x000087a4 0x002effff /dev/mmcblk1
> /* send erase cmd with multi-cmd */
> ret = ioctl(dev_fd, MMC_IOC_MULTI_CMD, multi_cmd);
>
> CMD35->CMD13->CMD36->CMD13->CMD38->CMD13
> I do not see any hang in all the erase options discard, legacy, trim, trim2,
> secure-trim used here with mmc util .

So CMD13 seems to do the trick for you. Although, I think we need to
figure out if this a special "broken" card or if the problem is with
the ->card_busy() implementation for your platform.

>
> Also looking at JEDEC Standard No. 84-B51 Page 276, 277
> "Once the erase groups are selected the host will send an ERASE
> (CMD38) command. It is recom-
> mended that the host terminates the sequence with a SEND_STATUS
> (CMD13) to poll any additional
> status information the Device may have (e.g., WP_ERASE_SKIP, etc.)."

Isn't that exactly what is being done? After the card has stopped
signaling busy, we send a CMD13 in mmc_busy_cb() to read the
additional status information.

I don't get it, why should the card stop signal busy, just because we
send a CMD13. If so, the card should probably be considered broken.
For broken cards, we can try to use a card-quirk instead - to enforce
CMD13 polling.

>
> > > I would suggest that we first try to fix the implementation of the
> > > ->card_busy() callback for your HW. If that isn't possible or fails,
> > > then let's consider the approach you have taken in the $subject patch.
> >
>
> I have verified this with hardware behavior and default
> sdhci->card->busy() seems to work fine except
> in the above sequence of CMD35->CMD36->CMD38 without any MMC_STATUS interleaved.
>
> Maybe we can do both, check the card busy and send CMD13.
> __mmc_poll_for_busy() does have
> throttling changes as well that would limit the rate at which CMD13
> will be sent.

Polling with both wouldn't help, but make the polling-loop heaviers,
we certainly don't want that.

>
> > Note, sdhci drivers can override host->ops.  For example,
> > sdhci-omap.c has:
> >
> >         host->mmc_host_ops.card_busy = sdhci_omap_card_busy;
> >
> > Probably, if ->card_busy() cannot be supported, then setting
> > it to NULL would work.
> >
> >         host->mmc_host_ops.card_busy = NULL; /* Cannot detect card busy */
> >
>
> Yes, that's an option I have tested to be working and will have to
> take in case  mmc_busy_cb() change  is not acceptable.
>
> Thanks
> Kamal

Kind regards
Uffe

Christian Loehle May 3, 2024, 1:55 p.m. UTC | #4

On 03/05/2024 14:09, Ulf Hansson wrote:
> On Fri, 26 Apr 2024 at 17:11, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
>>
>> On Fri, Apr 26, 2024 at 2:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>
>>> On 25/04/24 19:18, Ulf Hansson wrote:
>>>> + Wolfram, Adrian (to see if they have some input)
>>>>
>>>> On Tue, 23 Apr 2024 at 22:02, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
>>>>>
>>>>> When erase/trim/discard completion was converted to mmc_poll_for_busy(),
>>>>> optional ->card_busy() host ops support was added. sdhci card->busy()
>>>>> could return busy for long periods to cause mmc_do_erase() to block during
>>>>> discard operation as shown below during mkfs.f2fs :
>>>>>
>>>>> Info: [/dev/mmcblk1p9] Discarding device
>>>>> [   39.597258] sysrq: Show Blocked State
>>>>> [   39.601183] task:mkfs.f2fs       state:D stack:0     pid:1561  tgid:1561  ppid:1542   flags:0x0000000d
>>>>> [   39.610609] Call trace:
>>>>> [   39.613098]  __switch_to+0xd8/0xf4
>>>>> [   39.616582]  __schedule+0x440/0x4f4
>>>>> [   39.620137]  schedule+0x2c/0x48
>>>>> [   39.623341]  schedule_hrtimeout_range_clock+0xe0/0x114
>>>>> [   39.628562]  schedule_hrtimeout_range+0x10/0x18
>>>>> [   39.633169]  usleep_range_state+0x5c/0x90
>>>>> [   39.637253]  __mmc_poll_for_busy+0xec/0x128
>>>>> [   39.641514]  mmc_poll_for_busy+0x48/0x70
>>>>> [   39.645511]  mmc_do_erase+0x1ec/0x210
>>>>> [   39.649237]  mmc_erase+0x1b4/0x1d4
>>>>> [   39.652701]  mmc_blk_mq_issue_rq+0x35c/0x6ac
>>>>> [   39.657037]  mmc_mq_queue_rq+0x18c/0x214
>>>>> [   39.661022]  blk_mq_dispatch_rq_list+0x3a8/0x528
>>>>> [   39.665722]  __blk_mq_sched_dispatch_requests+0x3a0/0x4ac
>>>>> [   39.671198]  blk_mq_sched_dispatch_requests+0x28/0x5c
>>>>> [   39.676322]  blk_mq_run_hw_queue+0x11c/0x12c
>>>>> [   39.680668]  blk_mq_flush_plug_list+0x200/0x33c
>>>>> [   39.685278]  blk_add_rq_to_plug+0x68/0xd8
>>>>> [   39.689365]  blk_mq_submit_bio+0x3a4/0x458
>>>>> [   39.693539]  __submit_bio+0x1c/0x80
>>>>> [   39.697096]  submit_bio_noacct_nocheck+0x94/0x174
>>>>> [   39.701875]  submit_bio_noacct+0x1b0/0x22c
>>>>> [   39.706042]  submit_bio+0xac/0xe8
>>>>> [   39.709424]  blk_next_bio+0x4c/0x5c
>>>>> [   39.712973]  blkdev_issue_secure_erase+0x118/0x170
>>>>> [   39.717835]  blkdev_common_ioctl+0x374/0x728
>>>>> [   39.722175]  blkdev_ioctl+0x8c/0x2b0
>>>>> [   39.725816]  vfs_ioctl+0x24/0x40
>>>>> [   39.729117]  __arm64_sys_ioctl+0x5c/0x8c
>>>>> [   39.733114]  invoke_syscall+0x68/0xec
>>>>> [   39.736839]  el0_svc_common.constprop.0+0x70/0xd8
>>>>> [   39.741609]  do_el0_svc+0x18/0x20
>>>>> [   39.744981]  el0_svc+0x68/0x94
>>>>> [   39.748107]  el0t_64_sync_handler+0x88/0x124
>>>>> [   39.752455]  el0t_64_sync+0x168/0x16c
>>>>
>>>> Thanks for the detailed log!
>>>>
>>>>>
>>>>> Fix skips the card->busy() and uses MMC_SEND_STATUS and R1_STATUS
>>>>> check for MMC_ERASE_BUSY busy_cmd case in the mmc_busy_cb() function.
>>>>>
>>>>> Fixes: 0d84c3e6a5b2 ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard")
>>>>> Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
>>>>> ---
>>>>>  drivers/mmc/core/mmc_ops.c | 3 ++-
>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
>>>>> index 3b3adbddf664..603fbd78c342 100644
>>>>> --- a/drivers/mmc/core/mmc_ops.c
>>>>> +++ b/drivers/mmc/core/mmc_ops.c
>>>>> @@ -464,7 +464,8 @@ static int mmc_busy_cb(void *cb_data, bool *busy)
>>>>>         u32 status = 0;
>>>>>         int err;
>>>>>
>>>>> -       if (data->busy_cmd != MMC_BUSY_IO && host->ops->card_busy) {
>>>>> +       if (data->busy_cmd != MMC_BUSY_IO &&
>>>>> +           data->busy_cmd != MMC_BUSY_ERASE && host->ops->card_busy) {
>>>>>                 *busy = host->ops->card_busy(host);
>>>>>                 return 0;
>>>>>         }
>>>>
>>>> So it seems like the ->card_busy() callback is broken in for your mmc
>>>> host-driver and platform. Can you perhaps provide the information
>>>> about what HW/driver you are using?
>>>>
>>
>> Using the sdhci-brcmstb driver on a Broadcom Settop based SoC.
>>
>>>> The point with using the ->card_busy() callback, is to avoid sending
>>>> the CMD13. Ideally it should be cheaper/faster and in most cases it
>>>> translates to a read of a register. For larger erases, we would
>>>> probably end up sending the CMD13 periodically every 32-64 ms, which
>>>> shouldn't be a problem. However, for smaller erases and discards, we
>>>> may want the benefit the ->card_busy() callback provides us.
>>>>
>>
>> I have tested two scenarios. One is like the mkfs.f2fs app that calls :
>> ioctl(fd, BLKSECDISCARD, &range);
>>
>> This has the following CMD and completion sequence:
>> {CMD35->CMD36->CMD38} and poll for  DAT0  signal via card->busy .
>> CMD38 has a response of R1b. The DAT0 (Busy line) will be driven by the device.
>> Busy (DAT0  = 0) is asserted by a device for  erasing blocks. Stays
>> busy in brcmstb sdhci controller.
> 
> How big is the "range"?
> 
> Just so I get this right, it stays busy and we are waiting for the
> timeout to fire? And ideally you think we should not be busy for that
> long, right?
> 
>>
>> With the additional change followed by CMD13 (response of R1), which
>> returns the device status, the
>> DAT0 will be pulled-up and next time we read the BUSY status it will
>> indicate it is not busy.
> 
> So you are referring to read the BUSY status with you ->card_busy()
> callback? Or did you actually verify that this is true from an
> electrical point of view, by monitoring the DAT0 signal?
> 
> If the latter, perhaps it's the card that is failing and simply
> requires CMD13 to be used to poll for busy. What card is this?
> 
> Have you tried different cards with the same platform/driver?
> 
>>
>> Also I tried the mmc util and that does not show the same issue with
>> exactly the same ranges, however in that case there are some
>> differences in the way the CMD sequence is sent for the entire discard
>> operation.
>> # mmc erase discard 0x000087a4 0x002effff /dev/mmcblk1
>> /* send erase cmd with multi-cmd */
>> ret = ioctl(dev_fd, MMC_IOC_MULTI_CMD, multi_cmd);
>>
>> CMD35->CMD13->CMD36->CMD13->CMD38->CMD13
>> I do not see any hang in all the erase options discard, legacy, trim, trim2,
>> secure-trim used here with mmc util .
> 
> So CMD13 seems to do the trick for you. Although, I think we need to
> figure out if this a special "broken" card or if the problem is with
> the ->card_busy() implementation for your platform.
> 
>>
>> Also looking at JEDEC Standard No. 84-B51 Page 276, 277
>> "Once the erase groups are selected the host will send an ERASE
>> (CMD38) command. It is recom-
>> mended that the host terminates the sequence with a SEND_STATUS
>> (CMD13) to poll any additional
>> status information the Device may have (e.g., WP_ERASE_SKIP, etc.)."
> 
> Isn't that exactly what is being done? After the card has stopped
> signaling busy, we send a CMD13 in mmc_busy_cb() to read the
> additional status information.

Agreed with your interpretation FWIW.

> 
> I don't get it, why should the card stop signal busy, just because we
> send a CMD13. If so, the card should probably be considered broken.
> For broken cards, we can try to use a card-quirk instead - to enforce
> CMD13 polling.

I'll mention it here, I've seen some broken IP out there where the card's
FSM (including busy-signalling) was dependent on the host providing the
CLK, can't remember which one it was, though.
Anyway for Kamal, it might be interesting to know if your host controller
autostops the CLK (which it is allowed to) during the busy-signalling and
if not stopping it also works around the problem.

> [snip]

Kamal Dasu May 6, 2024, 6:39 p.m. UTC | #5

I was testing with different.

On Fri, May 3, 2024 at 9:55 AM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> On 03/05/2024 14:09, Ulf Hansson wrote:
> > On Fri, 26 Apr 2024 at 17:11, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
> >>
> >> On Fri, Apr 26, 2024 at 2:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>>
> >>> On 25/04/24 19:18, Ulf Hansson wrote:
> >>>> + Wolfram, Adrian (to see if they have some input)
> >>>>
> >>>> On Tue, 23 Apr 2024 at 22:02, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
> >>>>>
> >>>>> When erase/trim/discard completion was converted to mmc_poll_for_busy(),
> >>>>> optional ->card_busy() host ops support was added. sdhci card->busy()
> >>>>> could return busy for long periods to cause mmc_do_erase() to block during
> >>>>> discard operation as shown below during mkfs.f2fs :
> >>>>>
> >>>>> Info: [/dev/mmcblk1p9] Discarding device
> >>>>> [   39.597258] sysrq: Show Blocked State
> >>>>> [   39.601183] task:mkfs.f2fs       state:D stack:0     pid:1561  tgid:1561  ppid:1542   flags:0x0000000d
> >>>>> [   39.610609] Call trace:
> >>>>> [   39.613098]  __switch_to+0xd8/0xf4
> >>>>> [   39.616582]  __schedule+0x440/0x4f4
> >>>>> [   39.620137]  schedule+0x2c/0x48
> >>>>> [   39.623341]  schedule_hrtimeout_range_clock+0xe0/0x114
> >>>>> [   39.628562]  schedule_hrtimeout_range+0x10/0x18
> >>>>> [   39.633169]  usleep_range_state+0x5c/0x90
> >>>>> [   39.637253]  __mmc_poll_for_busy+0xec/0x128
> >>>>> [   39.641514]  mmc_poll_for_busy+0x48/0x70
> >>>>> [   39.645511]  mmc_do_erase+0x1ec/0x210
> >>>>> [   39.649237]  mmc_erase+0x1b4/0x1d4
> >>>>> [   39.652701]  mmc_blk_mq_issue_rq+0x35c/0x6ac
> >>>>> [   39.657037]  mmc_mq_queue_rq+0x18c/0x214
> >>>>> [   39.661022]  blk_mq_dispatch_rq_list+0x3a8/0x528
> >>>>> [   39.665722]  __blk_mq_sched_dispatch_requests+0x3a0/0x4ac
> >>>>> [   39.671198]  blk_mq_sched_dispatch_requests+0x28/0x5c
> >>>>> [   39.676322]  blk_mq_run_hw_queue+0x11c/0x12c
> >>>>> [   39.680668]  blk_mq_flush_plug_list+0x200/0x33c
> >>>>> [   39.685278]  blk_add_rq_to_plug+0x68/0xd8
> >>>>> [   39.689365]  blk_mq_submit_bio+0x3a4/0x458
> >>>>> [   39.693539]  __submit_bio+0x1c/0x80
> >>>>> [   39.697096]  submit_bio_noacct_nocheck+0x94/0x174
> >>>>> [   39.701875]  submit_bio_noacct+0x1b0/0x22c
> >>>>> [   39.706042]  submit_bio+0xac/0xe8
> >>>>> [   39.709424]  blk_next_bio+0x4c/0x5c
> >>>>> [   39.712973]  blkdev_issue_secure_erase+0x118/0x170
> >>>>> [   39.717835]  blkdev_common_ioctl+0x374/0x728
> >>>>> [   39.722175]  blkdev_ioctl+0x8c/0x2b0
> >>>>> [   39.725816]  vfs_ioctl+0x24/0x40
> >>>>> [   39.729117]  __arm64_sys_ioctl+0x5c/0x8c
> >>>>> [   39.733114]  invoke_syscall+0x68/0xec
> >>>>> [   39.736839]  el0_svc_common.constprop.0+0x70/0xd8
> >>>>> [   39.741609]  do_el0_svc+0x18/0x20
> >>>>> [   39.744981]  el0_svc+0x68/0x94
> >>>>> [   39.748107]  el0t_64_sync_handler+0x88/0x124
> >>>>> [   39.752455]  el0t_64_sync+0x168/0x16c
> >>>>
> >>>> Thanks for the detailed log!
> >>>>
> >>>>>
> >>>>> Fix skips the card->busy() and uses MMC_SEND_STATUS and R1_STATUS
> >>>>> check for MMC_ERASE_BUSY busy_cmd case in the mmc_busy_cb() function.
> >>>>>
> >>>>> Fixes: 0d84c3e6a5b2 ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard")
> >>>>> Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
> >>>>> ---
> >>>>>  drivers/mmc/core/mmc_ops.c | 3 ++-
> >>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> >>>>> index 3b3adbddf664..603fbd78c342 100644
> >>>>> --- a/drivers/mmc/core/mmc_ops.c
> >>>>> +++ b/drivers/mmc/core/mmc_ops.c
> >>>>> @@ -464,7 +464,8 @@ static int mmc_busy_cb(void *cb_data, bool *busy)
> >>>>>         u32 status = 0;
> >>>>>         int err;
> >>>>>
> >>>>> -       if (data->busy_cmd != MMC_BUSY_IO && host->ops->card_busy) {
> >>>>> +       if (data->busy_cmd != MMC_BUSY_IO &&
> >>>>> +           data->busy_cmd != MMC_BUSY_ERASE && host->ops->card_busy) {
> >>>>>                 *busy = host->ops->card_busy(host);
> >>>>>                 return 0;
> >>>>>         }
> >>>>
> >>>> So it seems like the ->card_busy() callback is broken in for your mmc
> >>>> host-driver and platform. Can you perhaps provide the information
> >>>> about what HW/driver you are using?
> >>>>
> >>
> >> Using the sdhci-brcmstb driver on a Broadcom Settop based SoC.
> >>
> >>>> The point with using the ->card_busy() callback, is to avoid sending
> >>>> the CMD13. Ideally it should be cheaper/faster and in most cases it
> >>>> translates to a read of a register. For larger erases, we would
> >>>> probably end up sending the CMD13 periodically every 32-64 ms, which
> >>>> shouldn't be a problem. However, for smaller erases and discards, we
> >>>> may want the benefit the ->card_busy() callback provides us.
> >>>>
> >>
> >> I have tested two scenarios. One is like the mkfs.f2fs app that calls :
> >> ioctl(fd, BLKSECDISCARD, &range);
> >>
> >> This has the following CMD and completion sequence:
> >> {CMD35->CMD36->CMD38} and poll for  DAT0  signal via card->busy .
> >> CMD38 has a response of R1b. The DAT0 (Busy line) will be driven by the device.
> >> Busy (DAT0  = 0) is asserted by a device for  erasing blocks. Stays
> >> busy in brcmstb sdhci controller.
> >
> > How big is the "range"?
> >

The task just blocks here. I have attached the stack.

> > Just so I get this right, it stays busy and we are waiting for the
> > timeout to fire? And ideally you think we should not be busy for that
> > long, right?
> >
The timeout does not fire.

> >>
> >> With the additional change followed by CMD13 (response of R1), which
> >> returns the device status, the
> >> DAT0 will be pulled-up and next time we read the BUSY status it will
> >> indicate it is not busy.
> >
> > So you are referring to read the BUSY status with you ->card_busy()
> > callback? Or did you actually verify that this is true from an
> > electrical point of view, by monitoring the DAT0 signal?
> >
> > If the latter, perhaps it's the card that is failing and simply
> > requires CMD13 to be used to poll for busy. What card is this?
> >
> > Have you tried different cards with the same platform/driver?
> >
Yes, I have tried Micron and Sandisk 8GB cards with the same platform
with the same results.

> >>
> >> Also I tried the mmc util and that does not show the same issue with
> >> exactly the same ranges, however in that case there are some
> >> differences in the way the CMD sequence is sent for the entire discard
> >> operation.
> >> # mmc erase discard 0x000087a4 0x002effff /dev/mmcblk1
> >> /* send erase cmd with multi-cmd */
> >> ret = ioctl(dev_fd, MMC_IOC_MULTI_CMD, multi_cmd);
> >>
> >> CMD35->CMD13->CMD36->CMD13->CMD38->CMD13
> >> I do not see any hang in all the erase options discard, legacy, trim, trim2,
> >> secure-trim used here with mmc util .
> >
> > So CMD13 seems to do the trick for you. Although, I think we need to
> > figure out if this a special "broken" card or if the problem is with
> > the ->card_busy() implementation for your platform.
> >
> >>
> >> Also looking at JEDEC Standard No. 84-B51 Page 276, 277
> >> "Once the erase groups are selected the host will send an ERASE
> >> (CMD38) command. It is recom-
> >> mended that the host terminates the sequence with a SEND_STATUS
> >> (CMD13) to poll any additional
> >> status information the Device may have (e.g., WP_ERASE_SKIP, etc.)."
> >
> > Isn't that exactly what is being done? After the card has stopped
> > signaling busy, we send a CMD13 in mmc_busy_cb() to read the
> > additional status information.
>
Yes, that is what the fix does. However in the original if
->card->busy() is being used we do not send the CMD13.

> Agreed with your interpretation FWIW.
>
> >
> > I don't get it, why should the card stop signal busy, just because we
> > send a CMD13. If so, the card should probably be considered broken.
> > For broken cards, we can try to use a card-quirk instead - to enforce
> > CMD13 polling.
>
Yes  would be one of the solutions that can be considered. Would open
to any suggestion on what card-quirk to use.

> I'll mention it here, I've seen some broken IP out there where the card's
> FSM (including busy-signalling) was dependent on the host providing the
> CLK, can't remember which one it was, though.
> Anyway for Kamal, it might be interesting to know if your host controller
> autostops the CLK (which it is allowed to) during the busy-signalling and
> if not stopping it also works around the problem.
>
The host controller is not stopping the clock. Also stopping the clock
does not work around the problem. host is expecting the device FSM
pull up the busy line.

At this point I can either do this when  adding host:
+       /* we dont use busy signal */
+       host->mmc_host_ops.card_busy = NULL;

Or introduce a card-quirk for this situation.

Kamal

Ulf Hansson May 8, 2024, 8:51 a.m. UTC | #6

On Mon, 6 May 2024 at 20:39, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
>
> I was testing with different.
>
> On Fri, May 3, 2024 at 9:55 AM Christian Loehle
> <christian.loehle@arm.com> wrote:
> >
> > On 03/05/2024 14:09, Ulf Hansson wrote:
> > > On Fri, 26 Apr 2024 at 17:11, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
> > >>
> > >> On Fri, Apr 26, 2024 at 2:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >>>
> > >>> On 25/04/24 19:18, Ulf Hansson wrote:
> > >>>> + Wolfram, Adrian (to see if they have some input)
> > >>>>
> > >>>> On Tue, 23 Apr 2024 at 22:02, Kamal Dasu <kamal.dasu@broadcom.com> wrote:
> > >>>>>
> > >>>>> When erase/trim/discard completion was converted to mmc_poll_for_busy(),
> > >>>>> optional ->card_busy() host ops support was added. sdhci card->busy()
> > >>>>> could return busy for long periods to cause mmc_do_erase() to block during
> > >>>>> discard operation as shown below during mkfs.f2fs :
> > >>>>>
> > >>>>> Info: [/dev/mmcblk1p9] Discarding device
> > >>>>> [   39.597258] sysrq: Show Blocked State
> > >>>>> [   39.601183] task:mkfs.f2fs       state:D stack:0     pid:1561  tgid:1561  ppid:1542   flags:0x0000000d
> > >>>>> [   39.610609] Call trace:
> > >>>>> [   39.613098]  __switch_to+0xd8/0xf4
> > >>>>> [   39.616582]  __schedule+0x440/0x4f4
> > >>>>> [   39.620137]  schedule+0x2c/0x48
> > >>>>> [   39.623341]  schedule_hrtimeout_range_clock+0xe0/0x114
> > >>>>> [   39.628562]  schedule_hrtimeout_range+0x10/0x18
> > >>>>> [   39.633169]  usleep_range_state+0x5c/0x90
> > >>>>> [   39.637253]  __mmc_poll_for_busy+0xec/0x128
> > >>>>> [   39.641514]  mmc_poll_for_busy+0x48/0x70
> > >>>>> [   39.645511]  mmc_do_erase+0x1ec/0x210
> > >>>>> [   39.649237]  mmc_erase+0x1b4/0x1d4
> > >>>>> [   39.652701]  mmc_blk_mq_issue_rq+0x35c/0x6ac
> > >>>>> [   39.657037]  mmc_mq_queue_rq+0x18c/0x214
> > >>>>> [   39.661022]  blk_mq_dispatch_rq_list+0x3a8/0x528
> > >>>>> [   39.665722]  __blk_mq_sched_dispatch_requests+0x3a0/0x4ac
> > >>>>> [   39.671198]  blk_mq_sched_dispatch_requests+0x28/0x5c
> > >>>>> [   39.676322]  blk_mq_run_hw_queue+0x11c/0x12c
> > >>>>> [   39.680668]  blk_mq_flush_plug_list+0x200/0x33c
> > >>>>> [   39.685278]  blk_add_rq_to_plug+0x68/0xd8
> > >>>>> [   39.689365]  blk_mq_submit_bio+0x3a4/0x458
> > >>>>> [   39.693539]  __submit_bio+0x1c/0x80
> > >>>>> [   39.697096]  submit_bio_noacct_nocheck+0x94/0x174
> > >>>>> [   39.701875]  submit_bio_noacct+0x1b0/0x22c
> > >>>>> [   39.706042]  submit_bio+0xac/0xe8
> > >>>>> [   39.709424]  blk_next_bio+0x4c/0x5c
> > >>>>> [   39.712973]  blkdev_issue_secure_erase+0x118/0x170
> > >>>>> [   39.717835]  blkdev_common_ioctl+0x374/0x728
> > >>>>> [   39.722175]  blkdev_ioctl+0x8c/0x2b0
> > >>>>> [   39.725816]  vfs_ioctl+0x24/0x40
> > >>>>> [   39.729117]  __arm64_sys_ioctl+0x5c/0x8c
> > >>>>> [   39.733114]  invoke_syscall+0x68/0xec
> > >>>>> [   39.736839]  el0_svc_common.constprop.0+0x70/0xd8
> > >>>>> [   39.741609]  do_el0_svc+0x18/0x20
> > >>>>> [   39.744981]  el0_svc+0x68/0x94
> > >>>>> [   39.748107]  el0t_64_sync_handler+0x88/0x124
> > >>>>> [   39.752455]  el0t_64_sync+0x168/0x16c
> > >>>>
> > >>>> Thanks for the detailed log!
> > >>>>
> > >>>>>
> > >>>>> Fix skips the card->busy() and uses MMC_SEND_STATUS and R1_STATUS
> > >>>>> check for MMC_ERASE_BUSY busy_cmd case in the mmc_busy_cb() function.
> > >>>>>
> > >>>>> Fixes: 0d84c3e6a5b2 ("mmc: core: Convert to mmc_poll_for_busy() for erase/trim/discard")
> > >>>>> Signed-off-by: Kamal Dasu <kamal.dasu@broadcom.com>
> > >>>>> ---
> > >>>>>  drivers/mmc/core/mmc_ops.c | 3 ++-
> > >>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
> > >>>>>
> > >>>>> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> > >>>>> index 3b3adbddf664..603fbd78c342 100644
> > >>>>> --- a/drivers/mmc/core/mmc_ops.c
> > >>>>> +++ b/drivers/mmc/core/mmc_ops.c
> > >>>>> @@ -464,7 +464,8 @@ static int mmc_busy_cb(void *cb_data, bool *busy)
> > >>>>>         u32 status = 0;
> > >>>>>         int err;
> > >>>>>
> > >>>>> -       if (data->busy_cmd != MMC_BUSY_IO && host->ops->card_busy) {
> > >>>>> +       if (data->busy_cmd != MMC_BUSY_IO &&
> > >>>>> +           data->busy_cmd != MMC_BUSY_ERASE && host->ops->card_busy) {
> > >>>>>                 *busy = host->ops->card_busy(host);
> > >>>>>                 return 0;
> > >>>>>         }
> > >>>>
> > >>>> So it seems like the ->card_busy() callback is broken in for your mmc
> > >>>> host-driver and platform. Can you perhaps provide the information
> > >>>> about what HW/driver you are using?
> > >>>>
> > >>
> > >> Using the sdhci-brcmstb driver on a Broadcom Settop based SoC.
> > >>
> > >>>> The point with using the ->card_busy() callback, is to avoid sending
> > >>>> the CMD13. Ideally it should be cheaper/faster and in most cases it
> > >>>> translates to a read of a register. For larger erases, we would
> > >>>> probably end up sending the CMD13 periodically every 32-64 ms, which
> > >>>> shouldn't be a problem. However, for smaller erases and discards, we
> > >>>> may want the benefit the ->card_busy() callback provides us.
> > >>>>
> > >>
> > >> I have tested two scenarios. One is like the mkfs.f2fs app that calls :
> > >> ioctl(fd, BLKSECDISCARD, &range);
> > >>
> > >> This has the following CMD and completion sequence:
> > >> {CMD35->CMD36->CMD38} and poll for  DAT0  signal via card->busy .
> > >> CMD38 has a response of R1b. The DAT0 (Busy line) will be driven by the device.
> > >> Busy (DAT0  = 0) is asserted by a device for  erasing blocks. Stays
> > >> busy in brcmstb sdhci controller.
> > >
> > > How big is the "range"?
> > >
>
> The task just blocks here. I have attached the stack.
>
> > > Just so I get this right, it stays busy and we are waiting for the
> > > timeout to fire? And ideally you think we should not be busy for that
> > > long, right?
> > >
> The timeout does not fire.
>
> > >>
> > >> With the additional change followed by CMD13 (response of R1), which
> > >> returns the device status, the
> > >> DAT0 will be pulled-up and next time we read the BUSY status it will
> > >> indicate it is not busy.
> > >
> > > So you are referring to read the BUSY status with you ->card_busy()
> > > callback? Or did you actually verify that this is true from an
> > > electrical point of view, by monitoring the DAT0 signal?
> > >
> > > If the latter, perhaps it's the card that is failing and simply
> > > requires CMD13 to be used to poll for busy. What card is this?
> > >
> > > Have you tried different cards with the same platform/driver?
> > >
> Yes, I have tried Micron and Sandisk 8GB cards with the same platform
> with the same results.

Okay, so it sounds like it's the ->card_busy() callback that may not
work as expected after all.

Do we have an SD card slot on this platform that we can try to run
similar tests for but with an SD card instead?

>
> > >>
> > >> Also I tried the mmc util and that does not show the same issue with
> > >> exactly the same ranges, however in that case there are some
> > >> differences in the way the CMD sequence is sent for the entire discard
> > >> operation.
> > >> # mmc erase discard 0x000087a4 0x002effff /dev/mmcblk1
> > >> /* send erase cmd with multi-cmd */
> > >> ret = ioctl(dev_fd, MMC_IOC_MULTI_CMD, multi_cmd);
> > >>
> > >> CMD35->CMD13->CMD36->CMD13->CMD38->CMD13
> > >> I do not see any hang in all the erase options discard, legacy, trim, trim2,
> > >> secure-trim used here with mmc util .
> > >
> > > So CMD13 seems to do the trick for you. Although, I think we need to
> > > figure out if this a special "broken" card or if the problem is with
> > > the ->card_busy() implementation for your platform.
> > >
> > >>
> > >> Also looking at JEDEC Standard No. 84-B51 Page 276, 277
> > >> "Once the erase groups are selected the host will send an ERASE
> > >> (CMD38) command. It is recom-
> > >> mended that the host terminates the sequence with a SEND_STATUS
> > >> (CMD13) to poll any additional
> > >> status information the Device may have (e.g., WP_ERASE_SKIP, etc.)."
> > >
> > > Isn't that exactly what is being done? After the card has stopped
> > > signaling busy, we send a CMD13 in mmc_busy_cb() to read the
> > > additional status information.
> >
> Yes, that is what the fix does. However in the original if
> ->card->busy() is being used we do not send the CMD13.

We are not sending CMD13 to poll for busy, but we are indeed sending a
CMD13 *after* the card has stopped signal busy, to read the additional
status information.

This seems to be according to the spec.

>
> > Agreed with your interpretation FWIW.
> >
> > >
> > > I don't get it, why should the card stop signal busy, just because we
> > > send a CMD13. If so, the card should probably be considered broken.
> > > For broken cards, we can try to use a card-quirk instead - to enforce
> > > CMD13 polling.
> >
> Yes  would be one of the solutions that can be considered. Would open
> to any suggestion on what card-quirk to use.

As you have tested different cards, it's more likely to be a problem
with the platform/HW and the ->card_busy() callback.

>
> > I'll mention it here, I've seen some broken IP out there where the card's
> > FSM (including busy-signalling) was dependent on the host providing the
> > CLK, can't remember which one it was, though.
> > Anyway for Kamal, it might be interesting to know if your host controller
> > autostops the CLK (which it is allowed to) during the busy-signalling and
> > if not stopping it also works around the problem.
> >
> The host controller is not stopping the clock. Also stopping the clock
> does not work around the problem. host is expecting the device FSM
> pull up the busy line.
>
> At this point I can either do this when  adding host:
> +       /* we dont use busy signal */
> +       host->mmc_host_ops.card_busy = NULL;

Looks like this is the best option, unless we can figure out why the
->card_busy() callback doesn't work as expected.

>
> Or introduce a card-quirk for this situation.
>
> Kamal

Kind regards
Uffe

[v1] mmc: core: check R1_STATUS for erase/trim/discard

Commit Message

Comments

Patch