From patchwork Tue Sep 15 14:12:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "gregkh@linuxfoundation.org" X-Patchwork-Id: 263857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A7A6AC43461 for ; Wed, 16 Sep 2020 00:07:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7636920739 for ; Wed, 16 Sep 2020 00:07:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1600214824; bh=i7L02jWbdrrJ44cD2iixZ98rASUHpzgH9xnh3G5IAsE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=YG8O9UCCiWF/caPDZEppbW3l4AGMOwoumONqP7+v29sNeIZ7cZDe/2VnYSvYbkv/O /VcYERVG0/0nMLUdQTLW6txbb09CNdR3JEL4M/LQydCN3ekHiCjXpiTvZGSDeEQ4dg yXx5+hCKhiD97kdZx2vY/nNQIh+n7qM9OL5B21Gw= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726857AbgIPAHD (ORCPT ); Tue, 15 Sep 2020 20:07:03 -0400 Received: from mail.kernel.org ([198.145.29.99]:43006 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726964AbgIOO2h (ORCPT ); Tue, 15 Sep 2020 10:28:37 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 9EB2D224DE; Tue, 15 Sep 2020 14:20:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1600179626; bh=i7L02jWbdrrJ44cD2iixZ98rASUHpzgH9xnh3G5IAsE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=A0xtiEeTPIp3YYLNQoUEzzYbPzlyVIXikFNuVWmjIwsyzDz5nIWXFbpxIUhfqnDG0 lUmsrcc6Lk3luvwzC4YTHjlDr2kod3z/EmdKMei7zODdLO/ucqvJiSYh54V1HAWqsp XjqcBs/704gYOwxodvjhu1A6CfK1/sxO42/zz9rc= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Sagi Grimberg , Sasha Levin Subject: [PATCH 5.4 054/132] nvme-tcp: serialize controller teardown sequences Date: Tue, 15 Sep 2020 16:12:36 +0200 Message-Id: <20200915140646.825900915@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200915140644.037604909@linuxfoundation.org> References: <20200915140644.037604909@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Sagi Grimberg [ Upstream commit d4d61470ae48838f49e668503e840e1520b97162 ] In the timeout handler we may need to complete a request because the request that timed out may be an I/O that is a part of a serial sequence of controller teardown or initialization. In order to complete the request, we need to fence any other context that may compete with us and complete the request that is timing out. In this case, we could have a potential double completion in case a hard-irq or a different competing context triggered error recovery and is running inflight request cancellation concurrently with the timeout handler. Protect using a ctrl teardown_lock to serialize contexts that may complete a cancelled request due to error recovery or a reset. Signed-off-by: Sagi Grimberg Signed-off-by: Sasha Levin --- drivers/nvme/host/tcp.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 0166ff0e4738e..a94c80727de1e 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -110,6 +110,7 @@ struct nvme_tcp_ctrl { struct sockaddr_storage src_addr; struct nvme_ctrl ctrl; + struct mutex teardown_lock; struct work_struct err_work; struct delayed_work connect_work; struct nvme_tcp_request async_req; @@ -1438,7 +1439,6 @@ static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid) if (!test_and_clear_bit(NVME_TCP_Q_LIVE, &queue->flags)) return; - __nvme_tcp_stop_queue(queue); } @@ -1785,6 +1785,7 @@ out_free_queue: static void nvme_tcp_teardown_admin_queue(struct nvme_ctrl *ctrl, bool remove) { + mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock); blk_mq_quiesce_queue(ctrl->admin_q); nvme_tcp_stop_queue(ctrl, 0); if (ctrl->admin_tagset) { @@ -1795,13 +1796,16 @@ static void nvme_tcp_teardown_admin_queue(struct nvme_ctrl *ctrl, if (remove) blk_mq_unquiesce_queue(ctrl->admin_q); nvme_tcp_destroy_admin_queue(ctrl, remove); + mutex_unlock(&to_tcp_ctrl(ctrl)->teardown_lock); } static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl, bool remove) { + mutex_lock(&to_tcp_ctrl(ctrl)->teardown_lock); if (ctrl->queue_count <= 1) - return; + goto out; + blk_mq_quiesce_queue(ctrl->admin_q); nvme_start_freeze(ctrl); nvme_stop_queues(ctrl); nvme_tcp_stop_io_queues(ctrl); @@ -1813,6 +1817,8 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl, if (remove) nvme_start_queues(ctrl); nvme_tcp_destroy_io_queues(ctrl, remove); +out: + mutex_unlock(&to_tcp_ctrl(ctrl)->teardown_lock); } static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl) @@ -2311,6 +2317,7 @@ static struct nvme_ctrl *nvme_tcp_create_ctrl(struct device *dev, nvme_tcp_reconnect_ctrl_work); INIT_WORK(&ctrl->err_work, nvme_tcp_error_recovery_work); INIT_WORK(&ctrl->ctrl.reset_work, nvme_reset_ctrl_work); + mutex_init(&ctrl->teardown_lock); if (!(opts->mask & NVMF_OPT_TRSVCID)) { opts->trsvcid =