From patchwork Mon May 24 15:26:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg Kroah-Hartman X-Patchwork-Id: 447913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B645EC04FF3 for ; Mon, 24 May 2021 15:59:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8DD16611CD for ; Mon, 24 May 2021 15:59:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235782AbhEXQBI (ORCPT ); Mon, 24 May 2021 12:01:08 -0400 Received: from mail.kernel.org ([198.145.29.99]:46882 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235630AbhEXQAQ (ORCPT ); Mon, 24 May 2021 12:00:16 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 647EB6198B; Mon, 24 May 2021 15:46:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1621871161; bh=1thRWn/Ko8lZOPys+Us69FIXuUyvzd84Oklr9SZ8F9o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=R8QaMglqaSmombV6Qe+EHJvaDY1CHFrHLxlhS++eEk4s78/075ZNKuxE0zpjyuRbb cTvUXxJ+2Vc62yl/hNM/R+i7wKhWfNo6AajSwLu+OSX2kI4AONxRZ+kUT9gFu7seQC t7ufeA49OWGplK240lVbQvrm7eiWbtRr+882FX0s= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Narayan Ayalasomayajula , Anil Mishra , Keith Busch , Sagi Grimberg , Christoph Hellwig Subject: [PATCH 5.12 067/127] nvme-tcp: fix possible use-after-completion Date: Mon, 24 May 2021 17:26:24 +0200 Message-Id: <20210524152337.120308490@linuxfoundation.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210524152334.857620285@linuxfoundation.org> References: <20210524152334.857620285@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Sagi Grimberg commit 825619b09ad351894d2c6fb6705f5b3711d145c7 upstream. Commit db5ad6b7f8cd ("nvme-tcp: try to send request in queue_rq context") added a second context that may perform a network send. This means that now RX and TX are not serialized in nvme_tcp_io_work and can run concurrently. While there is correct mutual exclusion in the TX path (where the send_mutex protect the queue socket send activity) RX activity, and more specifically request completion may run concurrently. This means we must guarantee that any mutation of the request state related to its lifetime, bytes sent must not be accessed when a completion may have possibly arrived back (and processed). The race may trigger when a request completion arrives, processed _and_ reused as a fresh new request, exactly in the (relatively short) window between the last data payload sent and before the request iov_iter is advanced. Consider the following race: 1. 16K write request is queued 2. The nvme command and the data is sent to the controller (in-capsule or solicited by r2t) 3. After the last payload is sent but before the req.iter is advanced, the controller sends back a completion. 4. The completion is processed, the request is completed, and reused to transfer a new request (write or read) 5. The new request is queued, and the driver reset the request parameters (nvme_tcp_setup_cmd_pdu). 6. Now context in (2) resumes execution and advances the req.iter ==> use-after-completion as this is already a new request. Fix this by making sure the request is not advanced after the last data payload send, knowing that a completion may have arrived already. An alternative solution would have been to delay the request completion or state change waiting for reference counting on the TX path, but besides adding atomic operations to the hot-path, it may present challenges in multi-stage R2T scenarios where a r2t handler needs to be deferred to an async execution. Reported-by: Narayan Ayalasomayajula Tested-by: Anil Mishra Reviewed-by: Keith Busch Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Sagi Grimberg Signed-off-by: Christoph Hellwig Signed-off-by: Greg Kroah-Hartman --- drivers/nvme/host/tcp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -940,7 +940,6 @@ static int nvme_tcp_try_send_data(struct if (ret <= 0) return ret; - nvme_tcp_advance_req(req, ret); if (queue->data_digest) nvme_tcp_ddgst_update(queue->snd_hash, page, offset, ret); @@ -957,6 +956,7 @@ static int nvme_tcp_try_send_data(struct } return 1; } + nvme_tcp_advance_req(req, ret); } return -EAGAIN; }