
nvme-fc: clear q_live at beginning of association teardown

Message ID 20210511045635.12494-1-jsmart2021@gmail.com
State Accepted
Commit a7d139145a6640172516b193abf6d2398620aa14
Series nvme-fc: clear q_live at beginning of association teardown

Commit Message

James Smart May 11, 2021, 4:56 a.m. UTC
The __nvmf_check_ready() routine used to bounce all filesystem io if
the controller state isn't LIVE. However, a later patch changed the
logic so that the rejection ends up being based on the Q live check.
The fc transport has a slightly different sequence from rdma and tcp
for shutting down queues/marking them non-live. FC marks its queue
non-live after aborting all ios and waiting for their termination,
leaving a rather large window for filesystem io to continue to hit the
transport. Unfortunately this resulted in filesystem io or applications
seeing I/O errors.

Change the fc transport to mark the queues non-live at the first
sign of teardown for the association (when i/o is initially terminated).

Fixes: 73a5379937ec ("nvme-fabrics: allow to queue requests for live queues")
Cc: <stable@vger.kernel.org> # v5.8+
Signed-off-by: James Smart <jsmart2021@gmail.com>

---
stable trees for 5.8 and 5.9 will require a slightly modified patch
---
 drivers/nvme/host/fc.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
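
For context on the "Q live check" mentioned above, the following is a small self-contained C model of the gating behaviour the commit message describes. It is an illustrative sketch only, not the kernel's __nvmf_check_ready() implementation, and the names used (check_ready_model and friends) are invented for the example: once the controller is no longer LIVE, whether filesystem I/O is bounced hinges on the per-queue live flag, so a queue that stays flagged live during teardown keeps passing I/O to the transport.

/*
 * Illustrative userspace model of the readiness gating described in the
 * commit message; not the kernel implementation.
 */
#include <stdbool.h>
#include <stdio.h>

enum verdict { ACCEPT, BOUNCE };

/*
 * Once the controller is no longer LIVE, the decision hinges on the
 * per-queue live flag: a queue still flagged live keeps accepting
 * filesystem I/O, while a non-live queue only lets connect/setup
 * commands through.
 */
static enum verdict check_ready_model(bool ctrl_live, bool queue_live,
				      bool is_connect_cmd)
{
	if (ctrl_live)
		return ACCEPT;
	if (queue_live)
		return ACCEPT;
	return is_connect_cmd ? ACCEPT : BOUNCE;
}

int main(void)
{
	/* Controller already tearing down, but "live" not yet cleared. */
	printf("fs write, queue still flagged live: %s\n",
	       check_ready_model(false, true, false) == ACCEPT ?
	       "reaches transport" : "bounced");

	/* With this patch the flag is cleared at the start of teardown. */
	printf("fs write, queue marked non-live:    %s\n",
	       check_ready_model(false, false, false) == ACCEPT ?
	       "reaches transport" : "bounced");
	return 0;
}

Run as a plain C program, it shows that a filesystem write still "reaches transport" while the queue is flagged live, which is the window the patch closes by clearing the flag at the first sign of teardown.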

Comments

Sagi Grimberg May 12, 2021, 11:03 p.m. UTC | #1
> The __nvmf_check_ready() routine used to bounce all filesystem io if
> the controller state isn't LIVE. However, a later patch changed the
> logic so that the rejection ends up being based on the Q live check.
> The fc transport has a slightly different sequence from rdma and tcp
> for shutting down queues/marking them non-live. FC marks its queue
> non-live after aborting all ios and waiting for their termination,
> leaving a rather large window for filesystem io to continue to hit the
> transport. Unfortunately this resulted in filesystem io or applications
> seeing I/O errors.
> 
> Change the fc transport to mark the queues non-live at the first
> sign of teardown for the association (when i/o is initially terminated).

Sounds like the correct behavior to me, what is the motivation for doing
that only after all I/O was aborted?

And,
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Himanshu Madhani May 13, 2021, 2:42 p.m. UTC | #2
On 5/10/21 11:56 PM, James Smart wrote:
> The __nvmf_check_ready() routine used to bounce all filesystem io if
> the controller state isn't LIVE. However, a later patch changed the
> logic so that the rejection ends up being based on the Q live check.
> The fc transport has a slightly different sequence from rdma and tcp
> for shutting down queues/marking them non-live. FC marks its queue
> non-live after aborting all ios and waiting for their termination,
> leaving a rather large window for filesystem io to continue to hit the
> transport. Unfortunately this resulted in filesystem io or applications
> seeing I/O errors.
> 
> Change the fc transport to mark the queues non-live at the first
> sign of teardown for the association (when i/o is initially terminated).
> 
> Fixes: 73a5379937ec ("nvme-fabrics: allow to queue requests for live queues")
> Cc: <stable@vger.kernel.org> # v5.8+
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> 
> ---
> stable trees for 5.8 and 5.9 will require a slightly modified patch
> ---
>   drivers/nvme/host/fc.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index d9ab9e7871d0..256e87721a01 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2461,6 +2461,18 @@ nvme_fc_terminate_exchange(struct request *req, void *data, bool reserved)
>   static void
>   __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
>   {
> +	int q;
> +
> +	/*
> +	 * if aborting io, the queues are no longer good, mark them
> +	 * all as not live.
> +	 */
> +	if (ctrl->ctrl.queue_count > 1) {
> +		for (q = 1; q < ctrl->ctrl.queue_count; q++)
> +			clear_bit(NVME_FC_Q_LIVE, &ctrl->queues[q].flags);
> +	}
> +	clear_bit(NVME_FC_Q_LIVE, &ctrl->queues[0].flags);
> +
>   	/*
>   	 * If io queues are present, stop them and terminate all outstanding
>   	 * ios on them. As FC allocates FC exchange for each io, the
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>


-- 
Himanshu Madhani                               Oracle Linux Engineering
James Smart May 13, 2021, 6:47 p.m. UTC | #3
On 5/12/2021 4:03 PM, Sagi Grimberg wrote:
> 
>> The __nvmf_check_ready() routine used to bounce all filesystem io if
>> the controller state isn't LIVE. However, a later patch changed the
>> logic so that the rejection ends up being based on the Q live check.
>> The fc transport has a slightly different sequence from rdma and tcp
>> for shutting down queues/marking them non-live. FC marks its queue
>> non-live after aborting all ios and waiting for their termination,
>> leaving a rather large window for filesystem io to continue to hit the
>> transport. Unfortunately this resulted in filesystem io or applications
>> seeing I/O errors.
>>
>> Change the fc transport to mark the queues non-live at the first
>> sign of teardown for the association (when i/o is initially terminated).
> 
> Sounds like the correct behavior to me, what is the motivation for doing
> that only after all I/O was aborted?
> 
> And,
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

source evolution over time (rdma/tcp changed how they worked) and the 
need didn't show up earlier based on the earlier checks.

-- james
Sagi Grimberg May 13, 2021, 7:59 p.m. UTC | #4
>>> The __nvmf_check_ready() routine used to bounce all filesystem io if
>>> the controller state isn't LIVE. However, a later patch changed the
>>> logic so that the rejection ends up being based on the Q live check.
>>> The fc transport has a slightly different sequence from rdma and tcp
>>> for shutting down queues/marking them non-live. FC marks its queue
>>> non-live after aborting all ios and waiting for their termination,
>>> leaving a rather large window for filesystem io to continue to hit the
>>> transport. Unfortunately this resulted in filesystem io or applications
>>> seeing I/O errors.
>>>
>>> Change the fc transport to mark the queues non-live at the first
>>> sign of teardown for the association (when i/o is initially terminated).
>>
>> Sounds like the correct behavior to me, what is the motivation for doing
>> that only after all I/O was aborted?
>>
>> And,
>> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> 
> source evolution over time (rdma/tcp changed how they worked) and the
> need didn't show up earlier based on the earlier checks.

Makes sense...
Hannes Reinecke May 14, 2021, 9:20 a.m. UTC | #5
On 5/11/21 6:56 AM, James Smart wrote:
> The __nvmf_check_ready() routine used to bounce all filesystem io if
> the controller state isn't LIVE. However, a later patch changed the
> logic so that the rejection ends up being based on the Q live check.
> The fc transport has a slightly different sequence from rdma and tcp
> for shutting down queues/marking them non-live. FC marks its queue
> non-live after aborting all ios and waiting for their termination,
> leaving a rather large window for filesystem io to continue to hit the
> transport. Unfortunately this resulted in filesystem io or applications
> seeing I/O errors.
> 
> Change the fc transport to mark the queues non-live at the first
> sign of teardown for the association (when i/o is initially terminated).
> 
> Fixes: 73a5379937ec ("nvme-fabrics: allow to queue requests for live queues")
> Cc: <stable@vger.kernel.org> # v5.8+
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> 
> ---
> stable trees for 5.8 and 5.9 will require a slightly modified patch
> ---
>  drivers/nvme/host/fc.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index d9ab9e7871d0..256e87721a01 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2461,6 +2461,18 @@ nvme_fc_terminate_exchange(struct request *req, void *data, bool reserved)
>  static void
>  __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
>  {
> +	int q;
> +
> +	/*
> +	 * if aborting io, the queues are no longer good, mark them
> +	 * all as not live.
> +	 */
> +	if (ctrl->ctrl.queue_count > 1) {
> +		for (q = 1; q < ctrl->ctrl.queue_count; q++)
> +			clear_bit(NVME_FC_Q_LIVE, &ctrl->queues[q].flags);
> +	}
> +	clear_bit(NVME_FC_Q_LIVE, &ctrl->queues[0].flags);
> +
>  	/*
>  	 * If io queues are present, stop them and terminate all outstanding
>  	 * ios on them. As FC allocates FC exchange for each io, the
> 

Reviewed-by: Hannes Reinecke <hare@suse.de>


Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
Christoph Hellwig May 19, 2021, 6:42 a.m. UTC | #6
Thanks,

applied to nvme-5.13.

Patch

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index d9ab9e7871d0..256e87721a01 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2461,6 +2461,18 @@ nvme_fc_terminate_exchange(struct request *req, void *data, bool reserved)
 static void
 __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
 {
+	int q;
+
+	/*
+	 * if aborting io, the queues are no longer good, mark them
+	 * all as not live.
+	 */
+	if (ctrl->ctrl.queue_count > 1) {
+		for (q = 1; q < ctrl->ctrl.queue_count; q++)
+			clear_bit(NVME_FC_Q_LIVE, &ctrl->queues[q].flags);
+	}
+	clear_bit(NVME_FC_Q_LIVE, &ctrl->queues[0].flags);
+
 	/*
 	 * If io queues are present, stop them and terminate all outstanding
 	 * ios on them. As FC allocates FC exchange for each io, the
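
For readers less familiar with the driver, here is a small self-contained model of what the added hunk does. It is illustrative only: the names below are invented for the example, and the real code lives in drivers/nvme/host/fc.c. Queue 0 is the admin queue and queues 1..queue_count-1 are the I/O queues; the patch clears the same NVME_FC_Q_LIVE bit on all of them at the very start of teardown, before exchanges are aborted, so the submission path's live check starts bouncing new I/O immediately.

/* Illustrative model of the added hunk; not kernel code. */
#include <stdio.h>

#define MODEL_Q_LIVE		(1UL << 0)	/* stands in for NVME_FC_Q_LIVE */
#define MODEL_QUEUE_COUNT	4		/* queue 0 = admin, 1..3 = I/O */

struct model_queue {
	unsigned long flags;
};

int main(void)
{
	struct model_queue queues[MODEL_QUEUE_COUNT];
	int q;

	/* Start with every queue marked live. */
	for (q = 0; q < MODEL_QUEUE_COUNT; q++)
		queues[q].flags = MODEL_Q_LIVE;

	/*
	 * Mirror of the patch: at the first sign of association teardown,
	 * clear "live" on the I/O queues (1..count-1) and then on the
	 * admin queue (0), before any exchanges are aborted.
	 */
	if (MODEL_QUEUE_COUNT > 1) {
		for (q = 1; q < MODEL_QUEUE_COUNT; q++)
			queues[q].flags &= ~MODEL_Q_LIVE;
	}
	queues[0].flags &= ~MODEL_Q_LIVE;

	for (q = 0; q < MODEL_QUEUE_COUNT; q++)
		printf("queue %d live: %s\n", q,
		       (queues[q].flags & MODEL_Q_LIVE) ? "yes" : "no");
	return 0;
}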