[PATH,v3,0/2] Two UFS bug fixes

Message ID	20210106214109.44041-1-jaegeuk@kernel.org
Headers	show Return-Path: <linux-scsi-owner@kernel.org> From: Jaegeuk Kim <jaegeuk@kernel.org> To: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org, kernel-team@android.com Cc: cang@codeaurora.org, alim.akhtar@samsung.com, avri.altman@wdc.com, bvanassche@acm.org, martin.petersen@oracle.com, stanley.chu@mediatek.com Subject: [PATH v3 0/2] Two UFS bug fixes Date: Wed, 6 Jan 2021 13:41:07 -0800 Message-Id: <20210106214109.44041-1-jaegeuk@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Two UFS bug fixes \| expand [PATH,v3,0/2] Two UFS bug fixes [v3,1/2] scsi: ufs: fix livelock of ufshcd_clear_ua_wluns

Jaegeuk Kim Jan. 6, 2021, 9:41 p.m. UTC

Change log from v2:
 - fix build warning

Can Guo Jan. 7, 2021, 6:07 a.m. UTC | #1

Hi Jaegeuk,

On 2021-01-07 05:41, Jaegeuk Kim wrote:
> From: Jaegeuk Kim <jaegeuk@google.com>

> 

> This fixes a warning caused by wrong reserve tag usage in 

> __ufshcd_issue_tm_cmd.

> 

> WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 

> blk_get_request+0x68/0x70

> WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 

> blk_mq_get_tag+0x438/0x46c

> 

> And, in ufshcd_err_handler(), we can avoid to send tm_cmd before 

> aborting

> outstanding commands by waiting a bit for IO completion like this.

> 

> __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

> 

> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

> ---

>  drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++----

>  1 file changed, 32 insertions(+), 4 deletions(-)

> 

> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> index 1678cec08b51..47fc8da3cbf9 100644

> --- a/drivers/scsi/ufs/ufshcd.c

> +++ b/drivers/scsi/ufs/ufshcd.c

> @@ -44,6 +44,9 @@

>  /* Query request timeout */

>  #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */

> 

> +/* LINERESET TIME OUT */

> +#define LINERESET_IO_TIMEOUT_MS			(30000) /* 30 sec */

> +

>  /* Task management command timeout */

>  #define TM_CMD_TIMEOUT	100 /* msecs */

> 

> @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct 

> *work)

>  	 * check if power mode restore is needed.

>  	 */

>  	if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) {

> +		ktime_t start = ktime_get();


I don't see the connection btw line-reset and following tmf cmd.
My point is that line-reset is not the only non-fatal error which
leads us to the following tmf cmd. So the wait should be outside
of this check - just put it right before clearing outstanding reqs.

Thanks,
Can Guo.

> +

>  		hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR;

>  		if (!hba->saved_uic_err)

>  			hba->saved_err &= ~UIC_ERROR;

> @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct 

> work_struct *work)

>  		if (ufshcd_is_pwr_mode_restore_needed(hba))

>  			needs_restore = true;

>  		spin_lock_irqsave(hba->host->host_lock, flags);

> +		/* Wait for IO completion to avoid aborting IOs */

> +		while (hba->outstanding_reqs) {

> +			ufshcd_complete_requests(hba);

> +			spin_unlock_irqrestore(hba->host->host_lock, flags);

> +			schedule();

> +			spin_lock_irqsave(hba->host->host_lock, flags);

> +			if (ktime_to_ms(ktime_sub(ktime_get(), start)) >

> +						LINERESET_IO_TIMEOUT_MS) {

> +				dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n",

> +					__func__, hba->outstanding_reqs);

> +				break;

> +			}

> +		}

> +

>  		if (!hba->saved_err && !needs_restore)

>  			goto skip_err_handling;

>  	}

> @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void 

> *__hba)

>  		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);

>  	}

> 

> -	if (enabled_intr_status && retval == IRQ_NONE) {

> -		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",

> -					__func__, intr_status);

> +	if (enabled_intr_status && retval == IRQ_NONE &&

> +				!ufshcd_eh_in_progress(hba)) {

> +		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, 

> 0x%08x)\n",

> +					__func__,

> +					intr_status,

> +					hba->ufs_stats.last_intr_status,

> +					enabled_intr_status);

>  		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");

>  	}

> 

> @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba 

> *hba,

>  	 * Even though we use wait_event() which sleeps indefinitely,

>  	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.

>  	 */

> -	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);

> +	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED |

> +						BLK_MQ_REQ_NOWAIT);

> +	if (IS_ERR(req))

> +		return PTR_ERR(req);

> +

>  	req->end_io_data = &wait;

>  	free_slot = req->tag;

>  	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);

> @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void

> __iomem *mmio_base, unsigned int irq)

> 

>  	hba->tmf_tag_set = (struct blk_mq_tag_set) {

>  		.nr_hw_queues	= 1,

> +		.reserved_tags	= 1,

>  		.queue_depth	= hba->nutmrs,

>  		.ops		= &ufshcd_tmf_ops,

>  		.flags		= BLK_MQ_F_NO_SCHED,

Can Guo Jan. 7, 2021, 6:38 a.m. UTC | #2

On 2021-01-07 05:41, Jaegeuk Kim wrote:
> From: Jaegeuk Kim <jaegeuk@google.com>

> 

> This fixes a warning caused by wrong reserve tag usage in 

> __ufshcd_issue_tm_cmd.

> 

> WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 

> blk_get_request+0x68/0x70

> WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 

> blk_mq_get_tag+0x438/0x46c

> 

> And, in ufshcd_err_handler(), we can avoid to send tm_cmd before 

> aborting

> outstanding commands by waiting a bit for IO completion like this.

> 

> __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

> 


Would you mind add a Fixes tag?

> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

> ---

>  drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++----

>  1 file changed, 32 insertions(+), 4 deletions(-)

> 

> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> index 1678cec08b51..47fc8da3cbf9 100644

> --- a/drivers/scsi/ufs/ufshcd.c

> +++ b/drivers/scsi/ufs/ufshcd.c

> @@ -44,6 +44,9 @@

>  /* Query request timeout */

>  #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */

> 

> +/* LINERESET TIME OUT */

> +#define LINERESET_IO_TIMEOUT_MS			(30000) /* 30 sec */

> +

>  /* Task management command timeout */

>  #define TM_CMD_TIMEOUT	100 /* msecs */

> 

> @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct 

> *work)

>  	 * check if power mode restore is needed.

>  	 */

>  	if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) {

> +		ktime_t start = ktime_get();

> +

>  		hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR;

>  		if (!hba->saved_uic_err)

>  			hba->saved_err &= ~UIC_ERROR;

> @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct 

> work_struct *work)

>  		if (ufshcd_is_pwr_mode_restore_needed(hba))

>  			needs_restore = true;

>  		spin_lock_irqsave(hba->host->host_lock, flags);

> +		/* Wait for IO completion to avoid aborting IOs */

> +		while (hba->outstanding_reqs) {

> +			ufshcd_complete_requests(hba);

> +			spin_unlock_irqrestore(hba->host->host_lock, flags);

> +			schedule();

> +			spin_lock_irqsave(hba->host->host_lock, flags);

> +			if (ktime_to_ms(ktime_sub(ktime_get(), start)) >

> +						LINERESET_IO_TIMEOUT_MS) {

> +				dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n",

> +					__func__, hba->outstanding_reqs);

> +				break;

> +			}

> +		}

> +

>  		if (!hba->saved_err && !needs_restore)

>  			goto skip_err_handling;

>  	}

> @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void 

> *__hba)

>  		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);

>  	}

> 

> -	if (enabled_intr_status && retval == IRQ_NONE) {

> -		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",

> -					__func__, intr_status);

> +	if (enabled_intr_status && retval == IRQ_NONE &&

> +				!ufshcd_eh_in_progress(hba)) {

> +		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x, 

> 0x%08x)\n",

> +					__func__,

> +					intr_status,

> +					hba->ufs_stats.last_intr_status,

> +					enabled_intr_status);

>  		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");

>  	}

> 

> @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba 

> *hba,

>  	 * Even though we use wait_event() which sleeps indefinitely,

>  	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.

>  	 */

> -	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);

> +	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED |

> +						BLK_MQ_REQ_NOWAIT);


Sorry that I didn't pay much attention to this part of code before.
May I know why must we use the BLK_MQ_REQ_RESERVED flag?

Thanks,
Can Guo.

> +	if (IS_ERR(req))

> +		return PTR_ERR(req);

> +

>  	req->end_io_data = &wait;

>  	free_slot = req->tag;

>  	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);

> @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void

> __iomem *mmio_base, unsigned int irq)

> 

>  	hba->tmf_tag_set = (struct blk_mq_tag_set) {

>  		.nr_hw_queues	= 1,

> +		.reserved_tags	= 1,


If we give reserved_tags as 1 and always ask for a tm requst with
BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated
from the reserved sbitmap_queue, whose depth is set to 1 here.
UFS supports tm queue depth as 8, but here is allowing only one tm
req at a time. Why? Please correct me if my understanding is wrong.

Thanks,
Can Guo.

>  		.queue_depth	= hba->nutmrs,

>  		.ops		= &ufshcd_tmf_ops,

>  		.flags		= BLK_MQ_F_NO_SCHED,

Jaegeuk Kim Jan. 7, 2021, 6:51 a.m. UTC | #3

On 01/07, Can Guo wrote:
> On 2021-01-07 05:41, Jaegeuk Kim wrote:

> > From: Jaegeuk Kim <jaegeuk@google.com>

> > 

> > This fixes a warning caused by wrong reserve tag usage in

> > __ufshcd_issue_tm_cmd.

> > 

> > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70

> > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82

> > blk_mq_get_tag+0x438/0x46c

> > 

> > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before

> > aborting

> > outstanding commands by waiting a bit for IO completion like this.

> > 

> > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

> > 

> 

> Would you mind add a Fixes tag?


Ok.

> 

> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

> > ---

> >  drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++----

> >  1 file changed, 32 insertions(+), 4 deletions(-)

> > 

> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> > index 1678cec08b51..47fc8da3cbf9 100644

> > --- a/drivers/scsi/ufs/ufshcd.c

> > +++ b/drivers/scsi/ufs/ufshcd.c

> > @@ -44,6 +44,9 @@

> >  /* Query request timeout */

> >  #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */

> > 

> > +/* LINERESET TIME OUT */

> > +#define LINERESET_IO_TIMEOUT_MS			(30000) /* 30 sec */

> > +

> >  /* Task management command timeout */

> >  #define TM_CMD_TIMEOUT	100 /* msecs */

> > 

> > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct

> > *work)

> >  	 * check if power mode restore is needed.

> >  	 */

> >  	if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) {

> > +		ktime_t start = ktime_get();

> > +

> >  		hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR;

> >  		if (!hba->saved_uic_err)

> >  			hba->saved_err &= ~UIC_ERROR;

> > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct

> > *work)

> >  		if (ufshcd_is_pwr_mode_restore_needed(hba))

> >  			needs_restore = true;

> >  		spin_lock_irqsave(hba->host->host_lock, flags);

> > +		/* Wait for IO completion to avoid aborting IOs */

> > +		while (hba->outstanding_reqs) {

> > +			ufshcd_complete_requests(hba);

> > +			spin_unlock_irqrestore(hba->host->host_lock, flags);

> > +			schedule();

> > +			spin_lock_irqsave(hba->host->host_lock, flags);

> > +			if (ktime_to_ms(ktime_sub(ktime_get(), start)) >

> > +						LINERESET_IO_TIMEOUT_MS) {

> > +				dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n",

> > +					__func__, hba->outstanding_reqs);

> > +				break;

> > +			}

> > +		}

> > +

> >  		if (!hba->saved_err && !needs_restore)

> >  			goto skip_err_handling;

> >  	}

> > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void

> > *__hba)

> >  		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);

> >  	}

> > 

> > -	if (enabled_intr_status && retval == IRQ_NONE) {

> > -		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",

> > -					__func__, intr_status);

> > +	if (enabled_intr_status && retval == IRQ_NONE &&

> > +				!ufshcd_eh_in_progress(hba)) {

> > +		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x,

> > 0x%08x)\n",

> > +					__func__,

> > +					intr_status,

> > +					hba->ufs_stats.last_intr_status,

> > +					enabled_intr_status);

> >  		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");

> >  	}

> > 

> > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba

> > *hba,

> >  	 * Even though we use wait_event() which sleeps indefinitely,

> >  	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.

> >  	 */

> > -	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);

> > +	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED |

> > +						BLK_MQ_REQ_NOWAIT);

> 

> Sorry that I didn't pay much attention to this part of code before.

> May I know why must we use the BLK_MQ_REQ_RESERVED flag?


What I understood is the reserved tag is used when aborting outstanding
IOs when all the 32 tags were used.

> 

> Thanks,

> Can Guo.

> 

> > +	if (IS_ERR(req))

> > +		return PTR_ERR(req);

> > +

> >  	req->end_io_data = &wait;

> >  	free_slot = req->tag;

> >  	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);

> > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void

> > __iomem *mmio_base, unsigned int irq)

> > 

> >  	hba->tmf_tag_set = (struct blk_mq_tag_set) {

> >  		.nr_hw_queues	= 1,

> > +		.reserved_tags	= 1,

> 

> If we give reserved_tags as 1 and always ask for a tm requst with

> BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated

> from the reserved sbitmap_queue, whose depth is set to 1 here.

> UFS supports tm queue depth as 8, but here is allowing only one tm

> req at a time. Why? Please correct me if my understanding is wrong.


I couldn't find tm can be issued in parallel, so thought it was issued
one at a time. If we set 8, then we can use 24 for IOs, IIUC.

Please correct me as well. I'm still trying to understand the flow.

> 

> Thanks,

> Can Guo.

> 

> >  		.queue_depth	= hba->nutmrs,

> >  		.ops		= &ufshcd_tmf_ops,

> >  		.flags		= BLK_MQ_F_NO_SCHED,

Jaegeuk Kim Jan. 7, 2021, 6:51 a.m. UTC | #4

On 01/07, Can Guo wrote:
> Hi Jaegeuk,

> 

> On 2021-01-07 05:41, Jaegeuk Kim wrote:

> > From: Jaegeuk Kim <jaegeuk@google.com>

> > 

> > This fixes a warning caused by wrong reserve tag usage in

> > __ufshcd_issue_tm_cmd.

> > 

> > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70

> > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82

> > blk_mq_get_tag+0x438/0x46c

> > 

> > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before

> > aborting

> > outstanding commands by waiting a bit for IO completion like this.

> > 

> > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

> > 

> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

> > ---

> >  drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++----

> >  1 file changed, 32 insertions(+), 4 deletions(-)

> > 

> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> > index 1678cec08b51..47fc8da3cbf9 100644

> > --- a/drivers/scsi/ufs/ufshcd.c

> > +++ b/drivers/scsi/ufs/ufshcd.c

> > @@ -44,6 +44,9 @@

> >  /* Query request timeout */

> >  #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */

> > 

> > +/* LINERESET TIME OUT */

> > +#define LINERESET_IO_TIMEOUT_MS			(30000) /* 30 sec */

> > +

> >  /* Task management command timeout */

> >  #define TM_CMD_TIMEOUT	100 /* msecs */

> > 

> > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct

> > *work)

> >  	 * check if power mode restore is needed.

> >  	 */

> >  	if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) {

> > +		ktime_t start = ktime_get();

> 

> I don't see the connection btw line-reset and following tmf cmd.

> My point is that line-reset is not the only non-fatal error which

> leads us to the following tmf cmd. So the wait should be outside

> of this check - just put it right before clearing outstanding reqs.


Ok. Let me move it in v4.

> 

> Thanks,

> Can Guo.

> 

> > +

> >  		hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR;

> >  		if (!hba->saved_uic_err)

> >  			hba->saved_err &= ~UIC_ERROR;

> > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct

> > *work)

> >  		if (ufshcd_is_pwr_mode_restore_needed(hba))

> >  			needs_restore = true;

> >  		spin_lock_irqsave(hba->host->host_lock, flags);

> > +		/* Wait for IO completion to avoid aborting IOs */

> > +		while (hba->outstanding_reqs) {

> > +			ufshcd_complete_requests(hba);

> > +			spin_unlock_irqrestore(hba->host->host_lock, flags);

> > +			schedule();

> > +			spin_lock_irqsave(hba->host->host_lock, flags);

> > +			if (ktime_to_ms(ktime_sub(ktime_get(), start)) >

> > +						LINERESET_IO_TIMEOUT_MS) {

> > +				dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n",

> > +					__func__, hba->outstanding_reqs);

> > +				break;

> > +			}

> > +		}

> > +

> >  		if (!hba->saved_err && !needs_restore)

> >  			goto skip_err_handling;

> >  	}

> > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void

> > *__hba)

> >  		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);

> >  	}

> > 

> > -	if (enabled_intr_status && retval == IRQ_NONE) {

> > -		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",

> > -					__func__, intr_status);

> > +	if (enabled_intr_status && retval == IRQ_NONE &&

> > +				!ufshcd_eh_in_progress(hba)) {

> > +		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x,

> > 0x%08x)\n",

> > +					__func__,

> > +					intr_status,

> > +					hba->ufs_stats.last_intr_status,

> > +					enabled_intr_status);

> >  		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");

> >  	}

> > 

> > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba

> > *hba,

> >  	 * Even though we use wait_event() which sleeps indefinitely,

> >  	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.

> >  	 */

> > -	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);

> > +	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED |

> > +						BLK_MQ_REQ_NOWAIT);

> > +	if (IS_ERR(req))

> > +		return PTR_ERR(req);

> > +

> >  	req->end_io_data = &wait;

> >  	free_slot = req->tag;

> >  	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);

> > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void

> > __iomem *mmio_base, unsigned int irq)

> > 

> >  	hba->tmf_tag_set = (struct blk_mq_tag_set) {

> >  		.nr_hw_queues	= 1,

> > +		.reserved_tags	= 1,

> >  		.queue_depth	= hba->nutmrs,

> >  		.ops		= &ufshcd_tmf_ops,

> >  		.flags		= BLK_MQ_F_NO_SCHED,

Can Guo Jan. 7, 2021, 7:03 a.m. UTC | #5

On 2021-01-07 14:51, Jaegeuk Kim wrote:
> On 01/07, Can Guo wrote:

>> On 2021-01-07 05:41, Jaegeuk Kim wrote:

>> > From: Jaegeuk Kim <jaegeuk@google.com>

>> >

>> > This fixes a warning caused by wrong reserve tag usage in

>> > __ufshcd_issue_tm_cmd.

>> >

>> > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70

>> > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82

>> > blk_mq_get_tag+0x438/0x46c

>> >

>> > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before

>> > aborting

>> > outstanding commands by waiting a bit for IO completion like this.

>> >

>> > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

>> >

>> 

>> Would you mind add a Fixes tag?

> 

> Ok.

> 

>> 

>> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

>> > ---

>> >  drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++----

>> >  1 file changed, 32 insertions(+), 4 deletions(-)

>> >

>> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

>> > index 1678cec08b51..47fc8da3cbf9 100644

>> > --- a/drivers/scsi/ufs/ufshcd.c

>> > +++ b/drivers/scsi/ufs/ufshcd.c

>> > @@ -44,6 +44,9 @@

>> >  /* Query request timeout */

>> >  #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */

>> >

>> > +/* LINERESET TIME OUT */

>> > +#define LINERESET_IO_TIMEOUT_MS			(30000) /* 30 sec */

>> > +

>> >  /* Task management command timeout */

>> >  #define TM_CMD_TIMEOUT	100 /* msecs */

>> >

>> > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct

>> > *work)

>> >  	 * check if power mode restore is needed.

>> >  	 */

>> >  	if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) {

>> > +		ktime_t start = ktime_get();

>> > +

>> >  		hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR;

>> >  		if (!hba->saved_uic_err)

>> >  			hba->saved_err &= ~UIC_ERROR;

>> > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct

>> > *work)

>> >  		if (ufshcd_is_pwr_mode_restore_needed(hba))

>> >  			needs_restore = true;

>> >  		spin_lock_irqsave(hba->host->host_lock, flags);

>> > +		/* Wait for IO completion to avoid aborting IOs */

>> > +		while (hba->outstanding_reqs) {

>> > +			ufshcd_complete_requests(hba);

>> > +			spin_unlock_irqrestore(hba->host->host_lock, flags);

>> > +			schedule();

>> > +			spin_lock_irqsave(hba->host->host_lock, flags);

>> > +			if (ktime_to_ms(ktime_sub(ktime_get(), start)) >

>> > +						LINERESET_IO_TIMEOUT_MS) {

>> > +				dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n",

>> > +					__func__, hba->outstanding_reqs);

>> > +				break;

>> > +			}

>> > +		}

>> > +

>> >  		if (!hba->saved_err && !needs_restore)

>> >  			goto skip_err_handling;

>> >  	}

>> > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void

>> > *__hba)

>> >  		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);

>> >  	}

>> >

>> > -	if (enabled_intr_status && retval == IRQ_NONE) {

>> > -		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",

>> > -					__func__, intr_status);

>> > +	if (enabled_intr_status && retval == IRQ_NONE &&

>> > +				!ufshcd_eh_in_progress(hba)) {

>> > +		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x,

>> > 0x%08x)\n",

>> > +					__func__,

>> > +					intr_status,

>> > +					hba->ufs_stats.last_intr_status,

>> > +					enabled_intr_status);

>> >  		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");

>> >  	}

>> >

>> > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba

>> > *hba,

>> >  	 * Even though we use wait_event() which sleeps indefinitely,

>> >  	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.

>> >  	 */

>> > -	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);

>> > +	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED |

>> > +						BLK_MQ_REQ_NOWAIT);

>> 

>> Sorry that I didn't pay much attention to this part of code before.

>> May I know why must we use the BLK_MQ_REQ_RESERVED flag?

> 

> What I understood is the reserved tag is used when aborting outstanding

> IOs when all the 32 tags were used.

> 


No, the tm requests and I/O requests are on two different tag sets:
tm requests come from hba->tmf_tag_set, while I/O requests come from
hba->shost->tag_set. Meaning they don't share tags with each other.

>> 

>> Thanks,

>> Can Guo.

>> 

>> > +	if (IS_ERR(req))

>> > +		return PTR_ERR(req);

>> > +

>> >  	req->end_io_data = &wait;

>> >  	free_slot = req->tag;

>> >  	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);

>> > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void

>> > __iomem *mmio_base, unsigned int irq)

>> >

>> >  	hba->tmf_tag_set = (struct blk_mq_tag_set) {

>> >  		.nr_hw_queues	= 1,

>> > +		.reserved_tags	= 1,

>> 

>> If we give reserved_tags as 1 and always ask for a tm requst with

>> BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated

>> from the reserved sbitmap_queue, whose depth is set to 1 here.

>> UFS supports tm queue depth as 8, but here is allowing only one tm

>> req at a time. Why? Please correct me if my understanding is wrong.

> 

> I couldn't find tm can be issued in parallel, so thought it was issued

> one at a time. If we set 8, then we can use 24 for IOs, IIUC.

> 

> Please correct me as well. I'm still trying to understand the flow.

> 


UFS allows a queue depth as 8, which means it support sending multiple
tm requests at the same time. You can check commit 69a6c269c097d780a2 -
before this change, we used to use below func to allocate tags for
tm reqs, which can tell you the true story.

So I am thinking why don't we just we remove the BLK_MQ_REQ_RESERVED 
flag?
Removing it can also fix the warning I suppose. What do you think?

-static bool ufshcd_get_tm_free_slot(struct ufs_hba *hba, int 
*free_slot)
-{
-       int tag;
-       bool ret = false;
-
-       if (!free_slot)
-               goto out;
-
-       do {
-               tag = find_first_zero_bit(&hba->tm_slots_in_use, 
hba->nutmrs);
-               if (tag >= hba->nutmrs)
-                       goto out;
-       } while (test_and_set_bit_lock(tag, &hba->tm_slots_in_use));
-
-       *free_slot = tag;
-       ret = true;
-out:
-       return ret;
-}

Thanks,
Can Guo.

>> 

>> Thanks,

>> Can Guo.

>> 

>> >  		.queue_depth	= hba->nutmrs,

>> >  		.ops		= &ufshcd_tmf_ops,

>> >  		.flags		= BLK_MQ_F_NO_SCHED,

Can Guo Jan. 7, 2021, 7:20 a.m. UTC | #6

On 2021-01-07 15:03, Can Guo wrote:
> On 2021-01-07 14:51, Jaegeuk Kim wrote:

>> On 01/07, Can Guo wrote:

>>> On 2021-01-07 05:41, Jaegeuk Kim wrote:

>>> > From: Jaegeuk Kim <jaegeuk@google.com>

>>> >

>>> > This fixes a warning caused by wrong reserve tag usage in

>>> > __ufshcd_issue_tm_cmd.

>>> >

>>> > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70

>>> > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82

>>> > blk_mq_get_tag+0x438/0x46c

>>> >

>>> > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before

>>> > aborting

>>> > outstanding commands by waiting a bit for IO completion like this.

>>> >

>>> > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

>>> >

>>> 

>>> Would you mind add a Fixes tag?

>> 

>> Ok.

>> 

>>> 

>>> > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

>>> > ---

>>> >  drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++----

>>> >  1 file changed, 32 insertions(+), 4 deletions(-)

>>> >

>>> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

>>> > index 1678cec08b51..47fc8da3cbf9 100644

>>> > --- a/drivers/scsi/ufs/ufshcd.c

>>> > +++ b/drivers/scsi/ufs/ufshcd.c

>>> > @@ -44,6 +44,9 @@

>>> >  /* Query request timeout */

>>> >  #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */

>>> >

>>> > +/* LINERESET TIME OUT */

>>> > +#define LINERESET_IO_TIMEOUT_MS			(30000) /* 30 sec */

>>> > +

>>> >  /* Task management command timeout */

>>> >  #define TM_CMD_TIMEOUT	100 /* msecs */

>>> >

>>> > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct

>>> > *work)

>>> >  	 * check if power mode restore is needed.

>>> >  	 */

>>> >  	if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) {

>>> > +		ktime_t start = ktime_get();

>>> > +

>>> >  		hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR;

>>> >  		if (!hba->saved_uic_err)

>>> >  			hba->saved_err &= ~UIC_ERROR;

>>> > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct

>>> > *work)

>>> >  		if (ufshcd_is_pwr_mode_restore_needed(hba))

>>> >  			needs_restore = true;

>>> >  		spin_lock_irqsave(hba->host->host_lock, flags);

>>> > +		/* Wait for IO completion to avoid aborting IOs */

>>> > +		while (hba->outstanding_reqs) {

>>> > +			ufshcd_complete_requests(hba);

>>> > +			spin_unlock_irqrestore(hba->host->host_lock, flags);

>>> > +			schedule();

>>> > +			spin_lock_irqsave(hba->host->host_lock, flags);

>>> > +			if (ktime_to_ms(ktime_sub(ktime_get(), start)) >

>>> > +						LINERESET_IO_TIMEOUT_MS) {

>>> > +				dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n",

>>> > +					__func__, hba->outstanding_reqs);

>>> > +				break;

>>> > +			}

>>> > +		}

>>> > +

>>> >  		if (!hba->saved_err && !needs_restore)

>>> >  			goto skip_err_handling;

>>> >  	}

>>> > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void

>>> > *__hba)

>>> >  		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);

>>> >  	}

>>> >

>>> > -	if (enabled_intr_status && retval == IRQ_NONE) {

>>> > -		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",

>>> > -					__func__, intr_status);

>>> > +	if (enabled_intr_status && retval == IRQ_NONE &&

>>> > +				!ufshcd_eh_in_progress(hba)) {

>>> > +		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x,

>>> > 0x%08x)\n",

>>> > +					__func__,

>>> > +					intr_status,

>>> > +					hba->ufs_stats.last_intr_status,

>>> > +					enabled_intr_status);

>>> >  		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");

>>> >  	}

>>> >

>>> > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba

>>> > *hba,

>>> >  	 * Even though we use wait_event() which sleeps indefinitely,

>>> >  	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.

>>> >  	 */

>>> > -	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);

>>> > +	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED |

>>> > +						BLK_MQ_REQ_NOWAIT);

>>> 

>>> Sorry that I didn't pay much attention to this part of code before.

>>> May I know why must we use the BLK_MQ_REQ_RESERVED flag?

>> 

>> What I understood is the reserved tag is used when aborting 

>> outstanding

>> IOs when all the 32 tags were used.

>> 

> 

> No, the tm requests and I/O requests are on two different tag sets:

> tm requests come from hba->tmf_tag_set, while I/O requests come from

> hba->shost->tag_set. Meaning they don't share tags with each other.


Add they are issued on two different HW queues - one for tm reqs,
one for I/O reqs, which is why two different tag sets are created.

> 

>>> 

>>> Thanks,

>>> Can Guo.

>>> 

>>> > +	if (IS_ERR(req))

>>> > +		return PTR_ERR(req);

>>> > +

>>> >  	req->end_io_data = &wait;

>>> >  	free_slot = req->tag;

>>> >  	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);

>>> > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void

>>> > __iomem *mmio_base, unsigned int irq)

>>> >

>>> >  	hba->tmf_tag_set = (struct blk_mq_tag_set) {

>>> >  		.nr_hw_queues	= 1,

>>> > +		.reserved_tags	= 1,

>>> 

>>> If we give reserved_tags as 1 and always ask for a tm requst with

>>> BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated

>>> from the reserved sbitmap_queue, whose depth is set to 1 here.

>>> UFS supports tm queue depth as 8, but here is allowing only one tm

>>> req at a time. Why? Please correct me if my understanding is wrong.

>> 

>> I couldn't find tm can be issued in parallel, so thought it was issued

>> one at a time. If we set 8, then we can use 24 for IOs, IIUC.

>> 

>> Please correct me as well. I'm still trying to understand the flow.

>> 

> 

> UFS allows a queue depth as 8, which means it support sending multiple

> tm requests at the same time. You can check commit 69a6c269c097d780a2 -

> before this change, we used to use below func to allocate tags for

> tm reqs, which can tell you the true story.

> 

> So I am thinking why don't we just we remove the BLK_MQ_REQ_RESERVED 

> flag?

> Removing it can also fix the warning I suppose. What do you think?

> 

> -static bool ufshcd_get_tm_free_slot(struct ufs_hba *hba, int 

> *free_slot)

> -{

> -       int tag;

> -       bool ret = false;

> -

> -       if (!free_slot)

> -               goto out;

> -

> -       do {

> -               tag = find_first_zero_bit(&hba->tm_slots_in_use, 

> hba->nutmrs);

> -               if (tag >= hba->nutmrs)

> -                       goto out;

> -       } while (test_and_set_bit_lock(tag, &hba->tm_slots_in_use));

> -

> -       *free_slot = tag;

> -       ret = true;

> -out:

> -       return ret;

> -}

> 

> Thanks,

> Can Guo.

> 

>>> 

>>> Thanks,

>>> Can Guo.

>>> 

>>> >  		.queue_depth	= hba->nutmrs,

>>> >  		.ops		= &ufshcd_tmf_ops,

>>> >  		.flags		= BLK_MQ_F_NO_SCHED,

Jaegeuk Kim Jan. 7, 2021, 7:23 a.m. UTC | #7

On 01/07, Can Guo wrote:
> On 2021-01-07 14:51, Jaegeuk Kim wrote:

> > On 01/07, Can Guo wrote:

> > > On 2021-01-07 05:41, Jaegeuk Kim wrote:

> > > > From: Jaegeuk Kim <jaegeuk@google.com>

> > > >

> > > > This fixes a warning caused by wrong reserve tag usage in

> > > > __ufshcd_issue_tm_cmd.

> > > >

> > > > WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70

> > > > WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82

> > > > blk_mq_get_tag+0x438/0x46c

> > > >

> > > > And, in ufshcd_err_handler(), we can avoid to send tm_cmd before

> > > > aborting

> > > > outstanding commands by waiting a bit for IO completion like this.

> > > >

> > > > __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out

> > > >

> > > 

> > > Would you mind add a Fixes tag?

> > 

> > Ok.

> > 

> > > 

> > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

> > > > ---

> > > >  drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++----

> > > >  1 file changed, 32 insertions(+), 4 deletions(-)

> > > >

> > > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> > > > index 1678cec08b51..47fc8da3cbf9 100644

> > > > --- a/drivers/scsi/ufs/ufshcd.c

> > > > +++ b/drivers/scsi/ufs/ufshcd.c

> > > > @@ -44,6 +44,9 @@

> > > >  /* Query request timeout */

> > > >  #define QUERY_REQ_TIMEOUT 1500 /* 1.5 seconds */

> > > >

> > > > +/* LINERESET TIME OUT */

> > > > +#define LINERESET_IO_TIMEOUT_MS			(30000) /* 30 sec */

> > > > +

> > > >  /* Task management command timeout */

> > > >  #define TM_CMD_TIMEOUT	100 /* msecs */

> > > >

> > > > @@ -5899,6 +5902,8 @@ static void ufshcd_err_handler(struct work_struct

> > > > *work)

> > > >  	 * check if power mode restore is needed.

> > > >  	 */

> > > >  	if (hba->saved_uic_err & UFSHCD_UIC_PA_GENERIC_ERROR) {

> > > > +		ktime_t start = ktime_get();

> > > > +

> > > >  		hba->saved_uic_err &= ~UFSHCD_UIC_PA_GENERIC_ERROR;

> > > >  		if (!hba->saved_uic_err)

> > > >  			hba->saved_err &= ~UIC_ERROR;

> > > > @@ -5906,6 +5911,20 @@ static void ufshcd_err_handler(struct work_struct

> > > > *work)

> > > >  		if (ufshcd_is_pwr_mode_restore_needed(hba))

> > > >  			needs_restore = true;

> > > >  		spin_lock_irqsave(hba->host->host_lock, flags);

> > > > +		/* Wait for IO completion to avoid aborting IOs */

> > > > +		while (hba->outstanding_reqs) {

> > > > +			ufshcd_complete_requests(hba);

> > > > +			spin_unlock_irqrestore(hba->host->host_lock, flags);

> > > > +			schedule();

> > > > +			spin_lock_irqsave(hba->host->host_lock, flags);

> > > > +			if (ktime_to_ms(ktime_sub(ktime_get(), start)) >

> > > > +						LINERESET_IO_TIMEOUT_MS) {

> > > > +				dev_err(hba->dev, "%s: timeout, outstanding=0x%lx\n",

> > > > +					__func__, hba->outstanding_reqs);

> > > > +				break;

> > > > +			}

> > > > +		}

> > > > +

> > > >  		if (!hba->saved_err && !needs_restore)

> > > >  			goto skip_err_handling;

> > > >  	}

> > > > @@ -6302,9 +6321,13 @@ static irqreturn_t ufshcd_intr(int irq, void

> > > > *__hba)

> > > >  		intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);

> > > >  	}

> > > >

> > > > -	if (enabled_intr_status && retval == IRQ_NONE) {

> > > > -		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x\n",

> > > > -					__func__, intr_status);

> > > > +	if (enabled_intr_status && retval == IRQ_NONE &&

> > > > +				!ufshcd_eh_in_progress(hba)) {

> > > > +		dev_err(hba->dev, "%s: Unhandled interrupt 0x%08x (0x%08x,

> > > > 0x%08x)\n",

> > > > +					__func__,

> > > > +					intr_status,

> > > > +					hba->ufs_stats.last_intr_status,

> > > > +					enabled_intr_status);

> > > >  		ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");

> > > >  	}

> > > >

> > > > @@ -6348,7 +6371,11 @@ static int __ufshcd_issue_tm_cmd(struct ufs_hba

> > > > *hba,

> > > >  	 * Even though we use wait_event() which sleeps indefinitely,

> > > >  	 * the maximum wait time is bounded by %TM_CMD_TIMEOUT.

> > > >  	 */

> > > > -	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);

> > > > +	req = blk_get_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED |

> > > > +						BLK_MQ_REQ_NOWAIT);

> > > 

> > > Sorry that I didn't pay much attention to this part of code before.

> > > May I know why must we use the BLK_MQ_REQ_RESERVED flag?

> > 

> > What I understood is the reserved tag is used when aborting outstanding

> > IOs when all the 32 tags were used.

> > 

> 

> No, the tm requests and I/O requests are on two different tag sets:

> tm requests come from hba->tmf_tag_set, while I/O requests come from

> hba->shost->tag_set. Meaning they don't share tags with each other.


I see. :)

> 

> > > 

> > > Thanks,

> > > Can Guo.

> > > 

> > > > +	if (IS_ERR(req))

> > > > +		return PTR_ERR(req);

> > > > +

> > > >  	req->end_io_data = &wait;

> > > >  	free_slot = req->tag;

> > > >  	WARN_ON_ONCE(free_slot < 0 || free_slot >= hba->nutmrs);

> > > > @@ -9355,6 +9382,7 @@ int ufshcd_init(struct ufs_hba *hba, void

> > > > __iomem *mmio_base, unsigned int irq)

> > > >

> > > >  	hba->tmf_tag_set = (struct blk_mq_tag_set) {

> > > >  		.nr_hw_queues	= 1,

> > > > +		.reserved_tags	= 1,

> > > 

> > > If we give reserved_tags as 1 and always ask for a tm requst with

> > > BLK_MQ_REQ_RESERVED flag set, then the tag shall only be allocated

> > > from the reserved sbitmap_queue, whose depth is set to 1 here.

> > > UFS supports tm queue depth as 8, but here is allowing only one tm

> > > req at a time. Why? Please correct me if my understanding is wrong.

> > 

> > I couldn't find tm can be issued in parallel, so thought it was issued

> > one at a time. If we set 8, then we can use 24 for IOs, IIUC.

> > 

> > Please correct me as well. I'm still trying to understand the flow.

> > 

> 

> UFS allows a queue depth as 8, which means it support sending multiple

> tm requests at the same time. You can check commit 69a6c269c097d780a2 -

> before this change, we used to use below func to allocate tags for

> tm reqs, which can tell you the true story.

> 

> So I am thinking why don't we just we remove the BLK_MQ_REQ_RESERVED flag?

> Removing it can also fix the warning I suppose. What do you think?


Yeah, I believe it won't give a warning. Okay, let me check it out.

> 

> -static bool ufshcd_get_tm_free_slot(struct ufs_hba *hba, int *free_slot)

> -{

> -       int tag;

> -       bool ret = false;

> -

> -       if (!free_slot)

> -               goto out;

> -

> -       do {

> -               tag = find_first_zero_bit(&hba->tm_slots_in_use,

> hba->nutmrs);

> -               if (tag >= hba->nutmrs)

> -                       goto out;

> -       } while (test_and_set_bit_lock(tag, &hba->tm_slots_in_use));

> -

> -       *free_slot = tag;

> -       ret = true;

> -out:

> -       return ret;

> -}

> 

> Thanks,

> Can Guo.

> 

> > > 

> > > Thanks,

> > > Can Guo.

> > > 

> > > >  		.queue_depth	= hba->nutmrs,

> > > >  		.ops		= &ufshcd_tmf_ops,

> > > >  		.flags		= BLK_MQ_F_NO_SCHED,

[PATH,v3,0/2] Two UFS bug fixes

Message

Comments