mbox series

[v2,0/9] Rework runtime suspend and SCSI domain validation

Message ID 20201116030459.13963-1-bvanassche@acm.org
Headers show
Series Rework runtime suspend and SCSI domain validation | expand

Message

Bart Van Assche Nov. 16, 2020, 3:04 a.m. UTC
Hi Martin,

The SCSI runtime suspend and domain validation mechanisms both use
scsi_device_quiesce(). scsi_device_quiesce() restricts blk_queue_enter() to
BLK_MQ_REQ_PREEMPT requests. There is a conflict between the requirements
of runtime suspend and SCSI domain validation: no requests must be sent to
runtime suspended devices that are in the state RPM_SUSPENDED while
BLK_MQ_REQ_PREEMPT requests must be processed during SCSI domain
validation. This conflict is resolved by reworking the SCSI domain
validation implementation.

Hybernation, runtime suspend and SCSI domain validation have been retested.

Please consider this patch series for kernel v5.11.

Thanks,

Bart.

Changes compared to v1:
- Added Tested-by tags for the SCSI SPI patches.
- Rebased on top of mkp-scsi/for-next.

Alan Stern (1):
  block: Do not accept any requests while suspended

Bart Van Assche (8):
  block: Fix a race in the runtime power management code
  ide: Do not set the RQF_PREEMPT flag for sense requests
  scsi: Pass a request queue pointer to __scsi_execute()
  scsi: Rework scsi_mq_alloc_queue()
  scsi: Do not wait for a request in scsi_eh_lock_door()
  scsi_transport_spi: Make spi_execute() accept a request queue pointer
  scsi_transport_spi: Freeze request queues instead of quiescing
  block, scsi, ide: Only process PM requests if rpm_status != RPM_ACTIVE

 block/blk-core.c                  |  12 +--
 block/blk-mq-debugfs.c            |   1 -
 block/blk-mq.c                    |   4 +-
 block/blk-pm.c                    |  15 ++--
 block/blk-pm.h                    |  14 +--
 drivers/ide/ide-atapi.c           |   1 -
 drivers/ide/ide-io.c              |   3 +-
 drivers/ide/ide-pm.c              |   2 +-
 drivers/scsi/scsi_error.c         |   7 +-
 drivers/scsi/scsi_lib.c           |  74 ++++++++--------
 drivers/scsi/scsi_priv.h          |   2 +
 drivers/scsi/scsi_transport_spi.c | 139 +++++++++++++++++-------------
 include/linux/blk-mq.h            |   4 +-
 include/linux/blkdev.h            |   6 +-
 include/scsi/scsi_device.h        |  14 ++-
 15 files changed, 163 insertions(+), 135 deletions(-)

Comments

Can Guo Nov. 18, 2020, 1:13 a.m. UTC | #1
On 2020-11-16 11:04, Bart Van Assche wrote:
> Instead of submitting all SCSI commands submitted with scsi_execute() 

> to

> a SCSI device if rpm_status != RPM_ACTIVE, only submit RQF_PM (power

> management requests) if rpm_status != RPM_ACTIVE. Remove flag

> RQF_PREEMPT since it is no longer necessary.

> 

> Cc: Martin K. Petersen <martin.petersen@oracle.com>

> Cc: Alan Stern <stern@rowland.harvard.edu>

> Cc: Can Guo <cang@codeaurora.org>

> Cc: Stanley Chu <stanley.chu@mediatek.com>

> Cc: Ming Lei <ming.lei@redhat.com>

> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>

> Signed-off-by: Bart Van Assche <bvanassche@acm.org>


Reviewed-by: Can Guo <cang@codeaurora.org>


> ---

>  block/blk-core.c        |  6 +++---

>  block/blk-mq-debugfs.c  |  1 -

>  block/blk-mq.c          |  4 ++--

>  drivers/ide/ide-io.c    |  3 +--

>  drivers/ide/ide-pm.c    |  2 +-

>  drivers/scsi/scsi_lib.c | 27 ++++++++++++++-------------

>  include/linux/blk-mq.h  |  4 ++--

>  include/linux/blkdev.h  |  6 +-----

>  8 files changed, 24 insertions(+), 29 deletions(-)

> 

> diff --git a/block/blk-core.c b/block/blk-core.c

> index 2db8bda43b6e..a00bce9f46d8 100644

> --- a/block/blk-core.c

> +++ b/block/blk-core.c

> @@ -424,11 +424,11 @@ EXPORT_SYMBOL(blk_cleanup_queue);

>  /**

>   * blk_queue_enter() - try to increase q->q_usage_counter

>   * @q: request queue pointer

> - * @flags: BLK_MQ_REQ_NOWAIT and/or BLK_MQ_REQ_PREEMPT

> + * @flags: BLK_MQ_REQ_NOWAIT and/or BLK_MQ_REQ_PM

>   */

>  int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)

>  {

> -	const bool pm = flags & BLK_MQ_REQ_PREEMPT;

> +	const bool pm = flags & BLK_MQ_REQ_PM;

> 

>  	while (true) {

>  		bool success = false;

> @@ -630,7 +630,7 @@ struct request *blk_get_request(struct

> request_queue *q, unsigned int op,

>  	struct request *req;

> 

>  	WARN_ON_ONCE(op & REQ_NOWAIT);

> -	WARN_ON_ONCE(flags & ~(BLK_MQ_REQ_NOWAIT | BLK_MQ_REQ_PREEMPT));

> +	WARN_ON_ONCE(flags & ~(BLK_MQ_REQ_NOWAIT | BLK_MQ_REQ_PM));

> 

>  	req = blk_mq_alloc_request(q, op, flags);

>  	if (!IS_ERR(req) && q->mq_ops->initialize_rq_fn)

> diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c

> index 3094542e12ae..9336a6f8d6ef 100644

> --- a/block/blk-mq-debugfs.c

> +++ b/block/blk-mq-debugfs.c

> @@ -297,7 +297,6 @@ static const char *const rqf_name[] = {

>  	RQF_NAME(MIXED_MERGE),

>  	RQF_NAME(MQ_INFLIGHT),

>  	RQF_NAME(DONTPREP),

> -	RQF_NAME(PREEMPT),

>  	RQF_NAME(FAILED),

>  	RQF_NAME(QUIET),

>  	RQF_NAME(ELVPRIV),

> diff --git a/block/blk-mq.c b/block/blk-mq.c

> index 1b25ec2fe9be..d50504888b68 100644

> --- a/block/blk-mq.c

> +++ b/block/blk-mq.c

> @@ -292,8 +292,8 @@ static struct request *blk_mq_rq_ctx_init(struct

> blk_mq_alloc_data *data,

>  	rq->mq_hctx = data->hctx;

>  	rq->rq_flags = 0;

>  	rq->cmd_flags = data->cmd_flags;

> -	if (data->flags & BLK_MQ_REQ_PREEMPT)

> -		rq->rq_flags |= RQF_PREEMPT;

> +	if (data->flags & BLK_MQ_REQ_PM)

> +		rq->rq_flags |= RQF_PM;

>  	if (blk_queue_io_stat(data->q))

>  		rq->rq_flags |= RQF_IO_STAT;

>  	INIT_LIST_HEAD(&rq->queuelist);

> diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c

> index 1a53c7a75224..beb850679fa9 100644

> --- a/drivers/ide/ide-io.c

> +++ b/drivers/ide/ide-io.c

> @@ -522,8 +522,7 @@ blk_status_t ide_issue_rq(ide_drive_t *drive,

> struct request *rq,

>  		 * state machine.

>  		 */

>  		if ((drive->dev_flags & IDE_DFLAG_BLOCKED) &&

> -		    ata_pm_request(rq) == 0 &&

> -		    (rq->rq_flags & RQF_PREEMPT) == 0) {

> +		    ata_pm_request(rq) == 0) {

>  			/* there should be no pending command at this point */

>  			ide_unlock_port(hwif);

>  			goto plug_device;

> diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c

> index 192e6c65d34e..82ab308f1aaf 100644

> --- a/drivers/ide/ide-pm.c

> +++ b/drivers/ide/ide-pm.c

> @@ -77,7 +77,7 @@ int generic_ide_resume(struct device *dev)

>  	}

> 

>  	memset(&rqpm, 0, sizeof(rqpm));

> -	rq = blk_get_request(drive->queue, REQ_OP_DRV_IN, 

> BLK_MQ_REQ_PREEMPT);

> +	rq = blk_get_request(drive->queue, REQ_OP_DRV_IN, BLK_MQ_REQ_PM);

>  	ide_req(rq)->type = ATA_PRIV_PM_RESUME;

>  	ide_req(rq)->special = &rqpm;

>  	rqpm.pm_step = IDE_PM_START_RESUME;

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c

> index df1f22b32964..fd8d2f4d71f8 100644

> --- a/drivers/scsi/scsi_lib.c

> +++ b/drivers/scsi/scsi_lib.c

> @@ -248,7 +248,8 @@ int __scsi_execute(struct request_queue *q, const

> unsigned char *cmd,

>  	int ret = DRIVER_ERROR << 24;

> 

>  	req = blk_get_request(q, data_direction == DMA_TO_DEVICE ?

> -			REQ_OP_SCSI_OUT : REQ_OP_SCSI_IN, BLK_MQ_REQ_PREEMPT);

> +			      REQ_OP_SCSI_OUT : REQ_OP_SCSI_IN,

> +			      rq_flags & RQF_PM ? BLK_MQ_REQ_PM : 0);

>  	if (IS_ERR(req))

>  		return ret;

>  	rq = scsi_req(req);

> @@ -1204,6 +1205,8 @@ static blk_status_t

>  scsi_device_state_check(struct scsi_device *sdev, struct request *req)

>  {

>  	switch (sdev->sdev_state) {

> +	case SDEV_CREATED:

> +		return BLK_STS_OK;

>  	case SDEV_OFFLINE:

>  	case SDEV_TRANSPORT_OFFLINE:

>  		/*

> @@ -1230,18 +1233,18 @@ scsi_device_state_check(struct scsi_device

> *sdev, struct request *req)

>  		return BLK_STS_RESOURCE;

>  	case SDEV_QUIESCE:

>  		/*

> -		 * If the devices is blocked we defer normal commands.

> +		 * If the device is blocked we only accept power management

> +		 * commands.

>  		 */

> -		if (req && !(req->rq_flags & RQF_PREEMPT))

> +		if (req && WARN_ON_ONCE(!(req->rq_flags & RQF_PM)))

>  			return BLK_STS_RESOURCE;

>  		return BLK_STS_OK;

>  	default:

>  		/*

>  		 * For any other not fully online state we only allow

> -		 * special commands.  In particular any user initiated

> -		 * command is not allowed.

> +		 * power management commands.

>  		 */

> -		if (req && !(req->rq_flags & RQF_PREEMPT))

> +		if (req && !(req->rq_flags & RQF_PM))

>  			return BLK_STS_IOERR;

>  		return BLK_STS_OK;

>  	}

> @@ -2517,15 +2520,13 @@ void sdev_evt_send_simple(struct scsi_device 

> *sdev,

>  EXPORT_SYMBOL_GPL(sdev_evt_send_simple);

> 

>  /**

> - *	scsi_device_quiesce - Block user issued commands.

> + *	scsi_device_quiesce - Block all commands except power management.

>   *	@sdev:	scsi device to quiesce.

>   *

>   *	This works by trying to transition to the SDEV_QUIESCE state

>   *	(which must be a legal transition).  When the device is in this

> - *	state, only special requests will be accepted, all others will

> - *	be deferred.  Since special requests may also be requeued requests,

> - *	a successful return doesn't guarantee the device will be

> - *	totally quiescent.

> + *	state, only power management requests will be accepted, all others 

> will

> + *	be deferred.

>   *

>   *	Must be called with user context, may sleep.

>   *

> @@ -2586,12 +2587,12 @@ void scsi_device_resume(struct scsi_device 

> *sdev)

>  	 * device deleted during suspend)

>  	 */

>  	mutex_lock(&sdev->state_mutex);

> +	if (sdev->sdev_state == SDEV_QUIESCE)

> +		scsi_device_set_state(sdev, SDEV_RUNNING);

>  	if (sdev->quiesced_by) {

>  		sdev->quiesced_by = NULL;

>  		blk_clear_pm_only(sdev->request_queue);

>  	}

> -	if (sdev->sdev_state == SDEV_QUIESCE)

> -		scsi_device_set_state(sdev, SDEV_RUNNING);

>  	mutex_unlock(&sdev->state_mutex);

>  }

>  EXPORT_SYMBOL(scsi_device_resume);

> diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h

> index b23eeca4d677..1fa350592830 100644

> --- a/include/linux/blk-mq.h

> +++ b/include/linux/blk-mq.h

> @@ -444,8 +444,8 @@ enum {

>  	BLK_MQ_REQ_NOWAIT	= (__force blk_mq_req_flags_t)(1 << 0),

>  	/* allocate from reserved pool */

>  	BLK_MQ_REQ_RESERVED	= (__force blk_mq_req_flags_t)(1 << 1),

> -	/* set RQF_PREEMPT */

> -	BLK_MQ_REQ_PREEMPT	= (__force blk_mq_req_flags_t)(1 << 3),

> +	/* set RQF_PM */

> +	BLK_MQ_REQ_PM		= (__force blk_mq_req_flags_t)(1 << 3),

>  };

> 

>  struct request *blk_mq_alloc_request(struct request_queue *q, unsigned 

> int op,

> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h

> index 639cae2c158b..7d4b746f7e6a 100644

> --- a/include/linux/blkdev.h

> +++ b/include/linux/blkdev.h

> @@ -79,9 +79,6 @@ typedef __u32 __bitwise req_flags_t;

>  #define RQF_MQ_INFLIGHT		((__force req_flags_t)(1 << 6))

>  /* don't call prep for this one */

>  #define RQF_DONTPREP		((__force req_flags_t)(1 << 7))

> -/* set for "ide_preempt" requests and also for requests for which the 

> SCSI

> -   "quiesce" state must be ignored. */

> -#define RQF_PREEMPT		((__force req_flags_t)(1 << 8))

>  /* vaguely specified driver internal error.  Ignored by the block 

> layer */

>  #define RQF_FAILED		((__force req_flags_t)(1 << 10))

>  /* don't warn about errors */

> @@ -430,8 +427,7 @@ struct request_queue {

>  	unsigned long		queue_flags;

>  	/*

>  	 * Number of contexts that have called blk_set_pm_only(). If this

> -	 * counter is above zero then only RQF_PM and RQF_PREEMPT requests 

> are

> -	 * processed.

> +	 * counter is above zero then only RQF_PM requests are processed.

>  	 */

>  	atomic_t		pm_only;
Can Guo Nov. 18, 2020, 1:15 a.m. UTC | #2
On 2020-11-16 11:04, Bart Van Assche wrote:
> Do not modify sdev->request_queue. Remove the sdev->request_queue

> assignment. That assignment is superfluous because 

> scsi_mq_alloc_queue()

> only has one caller and that caller calls scsi_mq_alloc_queue() as 

> follows:

> 

> 	sdev->request_queue = scsi_mq_alloc_queue(sdev);

> 

> Cc: Martin K. Petersen <martin.petersen@oracle.com>

> Reviewed-by: Alan Stern <stern@rowland.harvard.edu>

> Cc: Can Guo <cang@codeaurora.org>

> Cc: Stanley Chu <stanley.chu@mediatek.com>

> Cc: Ming Lei <ming.lei@redhat.com>

> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> Signed-off-by: Bart Van Assche <bvanassche@acm.org>


Reviewed-by: Can Guo <cang@codeaurora.org>


> ---

>  drivers/scsi/scsi_lib.c | 13 +++++++------

>  1 file changed, 7 insertions(+), 6 deletions(-)

> 

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c

> index e4f9ed355be6..ff480fa6261e 100644

> --- a/drivers/scsi/scsi_lib.c

> +++ b/drivers/scsi/scsi_lib.c

> @@ -1883,14 +1883,15 @@ static const struct blk_mq_ops scsi_mq_ops = {

> 

>  struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev)

>  {

> -	sdev->request_queue = blk_mq_init_queue(&sdev->host->tag_set);

> -	if (IS_ERR(sdev->request_queue))

> +	struct request_queue *q = blk_mq_init_queue(&sdev->host->tag_set);

> +

> +	if (IS_ERR(q))

>  		return NULL;

> 

> -	sdev->request_queue->queuedata = sdev;

> -	__scsi_init_queue(sdev->host, sdev->request_queue);

> -	blk_queue_flag_set(QUEUE_FLAG_SCSI_PASSTHROUGH, sdev->request_queue);

> -	return sdev->request_queue;

> +	q->queuedata = sdev;

> +	__scsi_init_queue(sdev->host, q);

> +	blk_queue_flag_set(QUEUE_FLAG_SCSI_PASSTHROUGH, q);

> +	return q;

>  }

> 

>  int scsi_mq_setup_tags(struct Scsi_Host *shost)
Stanley Chu Nov. 18, 2020, 9:08 a.m. UTC | #3
On Mon, 2020-11-16 at 11:04 +0800, Bart Van Assche wrote:
> Instead of submitting all SCSI commands submitted with scsi_execute() to

> a SCSI device if rpm_status != RPM_ACTIVE, only submit RQF_PM (power

> management requests) if rpm_status != RPM_ACTIVE. Remove flag

> RQF_PREEMPT since it is no longer necessary.

> 

> Cc: Martin K. Petersen <martin.petersen@oracle.com>

> Cc: Alan Stern <stern@rowland.harvard.edu>

> Cc: Can Guo <cang@codeaurora.org>

> Cc: Stanley Chu <stanley.chu@mediatek.com>

> Cc: Ming Lei <ming.lei@redhat.com>

> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>

> Signed-off-by: Bart Van Assche <bvanassche@acm.org>


Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Bart Van Assche Nov. 20, 2020, 1:36 a.m. UTC | #4
On 11/16/20 10:01 AM, Bart Van Assche wrote:
> On 11/16/20 9:17 AM, Christoph Hellwig wrote:

>> On Sun, Nov 15, 2020 at 07:04:54PM -0800, Bart Van Assche wrote:

>>> Do not modify sdev->request_queue. Remove the sdev->request_queue

>>> assignment. That assignment is superfluous because scsi_mq_alloc_queue()

>>> only has one caller and that caller calls scsi_mq_alloc_queue() as

>>> follows:

>>>

>>>     sdev->request_queue = scsi_mq_alloc_queue(sdev);

>>

>> This looks ok to me.  But is there any good to keep scsi_mq_alloc_queue

>> around at all?  It is so trivial that it can be open coded in the

>> currently only caller, as well as a new one if added.

> 

> Hi Christoph,

> 

> A later patch in this series introduces a second call to

> scsi_mq_alloc_queue(). Do we really want to have multiple functions that

> set QUEUE_FLAG_SCSI_PASSTHROUGH? I'm concerned that if the logic for

> creating a SCSI queue would ever be changed that only the copy in the

> SCSI core would be updated but not the copy in the SPI code.


(replying to my own email)

Hi Christoph,

Is this something that you feel strongly about? I can make this change
but that would require reaching out again to someone who owns an SPI
setup for testing this patch series since I do not own an SPI setup
myself ...

Thanks,

Bart.