From patchwork Tue Sep 29 23:19:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 304101 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D964BC47420 for ; Tue, 29 Sep 2020 23:23:41 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 418CE20773 for ; Tue, 29 Sep 2020 23:23:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 418CE20773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46932 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNOyB-0002lJ-RJ for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:23:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38976) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu7-0000UV-UN; Tue, 29 Sep 2020 19:19:27 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:56253) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu5-00007n-5V; Tue, 29 Sep 2020 19:19:27 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 426EBED0; Tue, 29 Sep 2020 19:19:22 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=1QikJUcbsmo2U P6Df439yCuIR8yF+wYrAAq6TthCldM=; b=eCvHv2wCVn0rZRbzcJscUBan3iOb+ gdrVnzcNOXJcnF3fUWr1cxyXdaJRCtpQW9Qqz8dz2ijvnOncY/P9aUkEyrYZh8ov qsgini1gVyrPbBESt3KJLZM2YduCHRRnnr56BFLQ1g3tV/ZimcSrexum63GQFjUY qHLU2NlB6dQPhP4QWgGsxyYA54ujtDpzfN0iJwRMionFDLWIn1TlBcO3FovSXiqy 4ECyDnxte5OGN0aRDBKIxec7lmtFmMROU0mNWSCj4msmSoWQaLnhNWLmyn7yh/bl q+aMeHUxPHOak7ig+Jy0KHPF623GEX6BHF7vFTCgYPbB27JeGLQE+5b1Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=1QikJUcbsmo2UP6Df439yCuIR8yF+wYrAAq6TthCldM=; b=TOOF0Mv/ ac9jPlZCfciJKOkWkWXwLMeiEGARZhhyroXN2VE25Jpl50QWivekyvQc9MtIHslL 8sFS6y50d1Gkz1vTfzGyxvTB8FzCu99TOGs/iE7ITOlYUbVYdkXFPB3G1381zyy/ sq4Xr2oNZkrJncr8cIgf/7bI9nmzy4wGVnQvePBOUhjFeQp5hKkY/1oQBw5pSq0H 9HdMGzbyrXCFxRPL6x6/tJNhvHW1WhKqQbibewNv8UTZhS2Ml/3WVyOLhmpc9SwP GPSo4cF2DGpyTDIEHGRtNXTuAPijMFBa1nAcEp1l4EpFhe9a5d+nEoVVE/XNNqsb 4v7tfzRYuNtRyA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepueelteegieeuhffgkeefgfevjeeigfetkeeitdfgtdeifefhtdfhfeeuffevgfek necukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihiivgeptdenuc frrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 64EBB328005D; Tue, 29 Sep 2020 19:19:20 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 01/14] hw/block/nvme: add nsid to get/setfeat trace events Date: Wed, 30 Sep 2020 01:19:04 +0200 Message-Id: <20200929231917.433586-2-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Include the namespace id in the pci_nvme_{get,set}feat trace events. Signed-off-by: Klaus Jensen --- hw/block/nvme.c | 4 ++-- hw/block/trace-events | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index da8344f196a8..84b6b516fa7b 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1660,7 +1660,7 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req) [NVME_ARBITRATION] = NVME_ARB_AB_NOLIMIT, }; - trace_pci_nvme_getfeat(nvme_cid(req), fid, sel, dw11); + trace_pci_nvme_getfeat(nvme_cid(req), nsid, fid, sel, dw11); if (!nvme_feature_support[fid]) { return NVME_INVALID_FIELD | NVME_DNR; @@ -1798,7 +1798,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) uint8_t fid = NVME_GETSETFEAT_FID(dw10); uint8_t save = NVME_SETFEAT_SAVE(dw10); - trace_pci_nvme_setfeat(nvme_cid(req), fid, save, dw11); + trace_pci_nvme_setfeat(nvme_cid(req), nsid, fid, save, dw11); if (save) { return NVME_FID_NOT_SAVEABLE | NVME_DNR; diff --git a/hw/block/trace-events b/hw/block/trace-events index 446cca08e9ab..5a239b80bf36 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -53,8 +53,8 @@ pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32"" pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32"" pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32"" pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64"" -pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32"" -pci_nvme_setfeat(uint16_t cid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32"" +pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32"" +pci_nvme_setfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32"" pci_nvme_getfeat_vwcache(const char* result) "get feature volatile write cache, result=%s" pci_nvme_getfeat_numq(int result) "get feature number of queues, result=%d" pci_nvme_setfeat_numq(int reqcq, int reqsq, int gotcq, int gotsq) "requested cq_count=%d sq_count=%d, responding with cq_count=%d sq_count=%d" From patchwork Tue Sep 29 23:19:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 272427 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 051D0C4727C for ; Tue, 29 Sep 2020 23:23:41 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 806C020773 for ; Tue, 29 Sep 2020 23:23:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 806C020773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46910 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNOyB-0002kn-2Y for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:23:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38988) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu8-0000Uz-Mq; Tue, 29 Sep 2020 19:19:28 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:41609) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu5-00007t-DD; Tue, 29 Sep 2020 19:19:28 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 57583E99; Tue, 29 Sep 2020 19:19:23 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=+Fjf+ehIwJaI4 71JRdnpqYu38GUIs8qKDBklTJPNZG4=; b=omkeBvp/cU8wrl9DoIYYWeRPxt3gL 5au+33JV72ekcm295rgS16ZUhUyJvD13ImDvayHoWrPSYy0zAhIup4oc8Ir9fU4b dfEpQCTp9X1MvwFV/B30CKrYnxzRRN5Jxwg3nnTAydwDKCMrBMP7vxWhq2hmfhTp AoOAlgeKcwJa32LEnqlmDkUf9K16iAldeAwM0d94ppVlv4o4ibUOhMfVbuyBt8ju pgkW3sd7OaXiKVkQndbAAo0kkZowqZOla/yaSwDBvK6VjCIy2FPfgA+01MwgypIy tguwAOGxG9bMCgoTrdUFf+LkEZTn7gKwU/tru+bs7gJmssh4kcwQvMlyw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=+Fjf+ehIwJaI471JRdnpqYu38GUIs8qKDBklTJPNZG4=; b=C+CTvKso ZixaoPz1LdsD2w4B5Cdw5qjpusxQmxd7dVW8ivY8HNFAHP8YPW4z2HIsN9FDr/Vp lc+KFZ8wDUi9YAaLAaYCkEnuwaqUOUq8it7vUAE9wORNUKpk2BntNLA+i/peZtg5 IWujGTGH3uNz/Mxk+e6Ggu/PbeG976Bxx4NBev/0wS/uvt5kxJR9vpD2CDVaVgor WWg4NGI4AduJD8LBzsdbCbdMPSNE7Cua1JL6FGrRYo+baUvkxUK5g1JW5LQh5u3O ATX0U3DSM1MvYXOgMRtd/i6rtLQOYppssjP2ALFwmykWnvpJ/mSblhhRs0+X1HSY uW4c3o9nraPcOA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepueelteegieeuhffgkeefgfevjeeigfetkeeitdfgtdeifefhtdfhfeeuffevgfek necukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihiivgeptdenuc frrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id A66563280066; Tue, 29 Sep 2020 19:19:21 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 02/14] hw/block/nvme: add trace event for requests with non-zero status code Date: Wed, 30 Sep 2020 01:19:05 +0200 Message-Id: <20200929231917.433586-3-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen If a command results in a non-zero status code, trace it. Signed-off-by: Klaus Jensen --- hw/block/nvme.c | 6 ++++++ hw/block/trace-events | 1 + 2 files changed, 7 insertions(+) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 84b6b516fa7b..3cbc3c7b75b1 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -777,6 +777,12 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req) assert(cq->cqid == req->sq->cqid); trace_pci_nvme_enqueue_req_completion(nvme_cid(req), cq->cqid, req->status); + + if (req->status) { + trace_pci_nvme_err_req_status(nvme_cid(req), nvme_nsid(req->ns), + req->status, req->cmd.opcode); + } + QTAILQ_REMOVE(&req->sq->out_req_list, req, entry); QTAILQ_INSERT_TAIL(&cq->req_list, req, entry); timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500); diff --git a/hw/block/trace-events b/hw/block/trace-events index 5a239b80bf36..9e7507c5abde 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -89,6 +89,7 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared" # nvme traces for error conditions pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" +pci_nvme_err_req_status(uint16_t cid, uint32_t nsid, uint16_t status, uint8_t opc) "cid %"PRIu16" nsid %"PRIu32" status 0x%"PRIx16" opc 0x%"PRIx8"" pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_err_cfs(void) "controller fatal status" From patchwork Tue Sep 29 23:19:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 304102 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1F18AC4727C for ; Tue, 29 Sep 2020 23:23:10 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7DCA920897 for ; Tue, 29 Sep 2020 23:23:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7DCA920897 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46346 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNOxg-0002Vz-Ck for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:23:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38994) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu8-0000VG-VN; Tue, 29 Sep 2020 19:19:28 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:46811) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu6-000086-SF; Tue, 29 Sep 2020 19:19:28 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id D1427E24; Tue, 29 Sep 2020 19:19:24 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=fm1; bh= qEKNIM+lPRA0c0RxRX/s4SVmBXPBiqyjUZhRX090gXI=; b=l1TchkFscnMD7Nt1 EybcOUdB1GWedUrAwj3GGAo3MXibBTpXmt0EYlh/tJilIvYw0pXf/h6kWPg7Ahu8 v1p8/r2q+QzsavrPJKVLkcS8dXgUSeMlyCb1jdo6CRzoRkwlbY4pxp8/BUtASsPx egdYG49M/GOeB8CwCt63WVqKm7vXlz3jkYVnpyEIwIjHLvzjKtOVYJrt3BPR+WcX ZvenpOMGZneoPR7ho81EcQcuhfJVlhUle30UxzN/4hrrpwaK+irbXtQLIQhV3j3P buqCbOeC3NgYrUxjTQeriSwbXMoOroDUfVhCqG5MdQQvLNJffYjKWnZxGaSg5PU5 iQTTkQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm3; bh=qEKNIM+lPRA0c0RxRX/s4SVmBXPBiqyjUZhRX090g XI=; b=RRC83CP6odk0QRqRYL7hcNeuVWO8wmkIuUVCwCOFo7rXMm8wfjOTL86Xc S9YIlEWpZrg+In64dO3O53+k3JSyJgJzwYYiJiBTblP9RhbLK6N1kZdwzCNkXTTO t1TwQFqT2wD++kTxpAJ2XttFZroXBSSl1Hfd3omPZ93he8IQye4/dTEtUnZPVQVL ozlCIjPz7TOfeaV5KMBRFgUcFmu4VGgdBncza0BNs4abUc8qBi6gBQxrrggGYEXm oJ+pezfe6EfL1gwn+9TYq5U6ZOFsxqNDUZ5LH+g4bQ+UZ2q8qoh4ih5NkRInt+g9 uVDf+lwEG/JaeOga3diamPVRbTRpQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhggtgfgsehtkeertdertdejnecuhfhrohhmpefmlhgruhhs ucflvghnshgvnhcuoehithhssehirhhrvghlvghvrghnthdrughkqeenucggtffrrghtth gvrhhnpeetveeuudegveeiheegieelueeftedvtdekteefleegheduhfejueelvdfhffdt geenucfkphepkedtrdduieejrdelkedrudeltdenucevlhhushhtvghrufhiiigvpedtne curfgrrhgrmhepmhgrihhlfhhrohhmpehithhssehirhhrvghlvghvrghnthdrughk X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id E9B253280068; Tue, 29 Sep 2020 19:19:22 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 03/14] hw/block/nvme: make lba data size configurable Date: Wed, 30 Sep 2020 01:19:06 +0200 Message-Id: <20200929231917.433586-4-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen , Maxim Levitsky , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Allos the LBA data size (lbads) to be set between 9 and 12. Signed-off-by: Klaus Jensen Acked-by: Keith Busch Reviewed-by: Maxim Levitsky Reviewed-by: Philippe Mathieu-Daudé --- docs/specs/nvme.txt | 11 ++++++++++- hw/block/nvme-ns.h | 1 + hw/block/nvme-ns.c | 8 +++++++- hw/block/nvme.c | 1 + 4 files changed, 19 insertions(+), 2 deletions(-) diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt index 56d393884e7a..438ca50d698c 100644 --- a/docs/specs/nvme.txt +++ b/docs/specs/nvme.txt @@ -1,7 +1,16 @@ NVM Express Controller ====================== -The nvme device (-device nvme) emulates an NVM Express Controller. +The nvme device (-device nvme) emulates an NVM Express Controller. It is used +together with nvme-ns devices (-device nvme-ns) which emulates an NVM Express +Namespace. + +nvme-ns Options +--------------- + + `lbads`; The "LBA Data Size (LBADS)" indicates the LBA data size used by the + namespace. It is specified in terms of a power of two. Only values between + 9 and 12 (both inclusive) are supported. Reference Specifications diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 83734f4606e1..78b0d1a00672 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -21,6 +21,7 @@ typedef struct NvmeNamespaceParams { uint32_t nsid; + uint8_t lbads; } NvmeNamespaceParams; typedef struct NvmeNamespace { diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 2ba0263ddaca..576c7486f45b 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -36,7 +36,7 @@ static void nvme_ns_init(NvmeNamespace *ns) ns->id_ns.dlfeat = 0x9; } - id_ns->lbaf[0].ds = BDRV_SECTOR_BITS; + id_ns->lbaf[0].ds = ns->params.lbads; id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns)); @@ -77,6 +77,11 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp) return -1; } + if (ns->params.lbads < 9 || ns->params.lbads > 12) { + error_setg(errp, "unsupported lbads (supported: 9-12)"); + return -1; + } + return 0; } @@ -125,6 +130,7 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp) static Property nvme_ns_props[] = { DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf), DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0), + DEFINE_PROP_UINT8("lbads", NvmeNamespace, params.lbads, BDRV_SECTOR_BITS), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 3cbc3c7b75b1..758f58c88026 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -2812,6 +2812,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) if (n->namespace.blkconf.blk) { ns = &n->namespace; ns->params.nsid = 1; + ns->params.lbads = BDRV_SECTOR_BITS; if (nvme_ns_setup(n, ns, errp)) { return; From patchwork Tue Sep 29 23:19:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 272425 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06B30C4727C for ; Tue, 29 Sep 2020 23:29:17 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7091C20773 for ; Tue, 29 Sep 2020 23:29:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7091C20773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56010 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNP3b-0006qx-63 for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:29:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39006) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu9-0000Vk-V1; Tue, 29 Sep 2020 19:19:29 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:35191) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu8-00008E-0k; Tue, 29 Sep 2020 19:19:29 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 1D8DCB7B; Tue, 29 Sep 2020 19:19:26 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=WbPXCy/OFaknh L9vreyycHOBgrvUUYu9WIgBvZGZ6+E=; b=I9LeHaDj1bT4y+MdxzVBm9ALVqGbF N7kOp4ftAucjVHgjGcuip/IZxFYmqpQXCeULkPlm7JB5jZplN8vJ3VGMIwU/s6jF kHSiDlzpWkd/07ihz9FGKa75410bc5pgA74xjjhstCw2lJ82Cmdy0FGURZES7A3y CzPYjMU+wgcSymfr9zhjxpfZXvTyPXPAtMbc8BzpWdz8dAXOgMsBeUigZWuhB0Vf SR+X+D9jIdf6sJQ3rYsIF7bSySNckH/8aI6tn+7U7FawYY7dUJahS3ByhA5IEOaR I3c/potZZnU/VRgj11guBEnkj/l2Rp3ln+RsYTe+h7KL/CS5t3CZTi4BA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=WbPXCy/OFaknhL9vreyycHOBgrvUUYu9WIgBvZGZ6+E=; b=KL2Mpv7u p1cDSxAxlmW5B5dlPNZoDfwnoiyK9UD504dwSAXqwYd0JPJz+9gXOEtMVvA0Ssnp iQmdZYY1GW/abY2jwnPnt5llBHg8yDbi79tyCnDNCaHVZ7z3aQC5FOHRSrkhxIWu bFR5MEcH/AD0csaMRRt+mEXCFhOJo38zJSm6YbAqbVFiPp89X8mLvMhONXyWzKgw BJVsav1TJuM0ZBiHa3AtkArCo3ljUtlZwn7Jou/D7wOA7U7OVPhmqD2G3mwkViFW 0fK4cuuSx9CIV6d9AiDowPJQO7xDAmt8uQdm8Iuf45ZxHJ9dkXswOcySEJ5EokB+ xtLAxP5AeNTTMw== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepueelteegieeuhffgkeefgfevjeeigfetkeeitdfgtdeifefhtdfhfeeuffevgfek necukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihiivgepvdenuc frrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 6BBE4328005D; Tue, 29 Sep 2020 19:19:24 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 04/14] hw/block/nvme: reject io commands if only admin command set selected Date: Wed, 30 Sep 2020 01:19:07 +0200 Message-Id: <20200929231917.433586-5-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen If the host sets CC.CSS to 111b, all commands submitted to I/O queues should be completed with status Invalid Command Opcode. Note that this is technically a v1.4 feature, but it does not hurt to implement before we finally bump the reported version implemented. Signed-off-by: Klaus Jensen --- include/block/nvme.h | 5 +++++ hw/block/nvme.c | 4 ++++ 2 files changed, 9 insertions(+) diff --git a/include/block/nvme.h b/include/block/nvme.h index 58647bcdad0b..7a30cf285ae0 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -110,6 +110,11 @@ enum NvmeCcMask { #define NVME_CC_IOSQES(cc) ((cc >> CC_IOSQES_SHIFT) & CC_IOSQES_MASK) #define NVME_CC_IOCQES(cc) ((cc >> CC_IOCQES_SHIFT) & CC_IOCQES_MASK) +enum NvmeCcCss { + NVME_CC_CSS_NVM = 0x0, + NVME_CC_CSS_ADMIN_ONLY = 0x7, +}; + enum NvmeCstsShift { CSTS_RDY_SHIFT = 0, CSTS_CFS_SHIFT = 1, diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 758f58c88026..27af2f0b38d5 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1065,6 +1065,10 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req), req->cmd.opcode, nvme_io_opc_str(req->cmd.opcode)); + if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_ADMIN_ONLY) { + return NVME_INVALID_OPCODE | NVME_DNR; + } + if (!nvme_nsid_valid(n, nsid)) { return NVME_INVALID_NSID | NVME_DNR; } From patchwork Tue Sep 29 23:19:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 304098 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DCC3C4727C for ; Tue, 29 Sep 2020 23:37:08 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A642620773 for ; Tue, 29 Sep 2020 23:37:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A642620773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:37590 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNPBC-00033Z-8T for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:37:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39022) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuB-0000Xj-NW; Tue, 29 Sep 2020 19:19:31 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:59917) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOu9-00008U-CO; Tue, 29 Sep 2020 19:19:31 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 7C4D7EB6; Tue, 29 Sep 2020 19:19:27 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:28 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=lLWIYy5Nac5ZM W2eRDZjjpMxYppJ2wzvImlA/zmiNkQ=; b=MBDGTxImkfb3yoTLuxaqu5xgDMJi3 zIqgKvxnBr1ys+bpZ1umONJKP8M1ewvv8JwG7pLChAG2kQCZsKVg6Ge8thSURqC/ v8Ne6Kzn6UZgHkJSP0oqeZA16mKOIxRECm/Olnk0PnalI2320K5NIpjprqp/evKS h4tTZEinHHiQCPseEJBToSZDT8RjKS8el31E9Tv+CC6V7SvKd0C0/CQ6tnykow9G b3x3Nr9AFGkFajtMX2otv2zhhN6TT8gkQmvyFRdsvWqUfAsEDaTxsBVNSkS4JZpU kDZiBe3IZ/68dcjWPuuJi9Ew/eSsm+x9oo1YWwQKI5KPwdK2QRP0A2UMQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=lLWIYy5Nac5ZMW2eRDZjjpMxYppJ2wzvImlA/zmiNkQ=; b=XT3vWRsO gNOz1KJ1s4T0FH48KKMBzUr/q2rR99lrP5BYnhaqffuOktqj1JF+7bfK2lIH0GoA g0Y9/akfITmZwDymr8BUUfB+D36/VNzzOA532vOtxADeQvyRSU+3e7712DwE1jgO igSw1Vo01UBLOFjiPqAcaRdXX1U+56To7SS/NwQnbDstyZtnahxmQlKstOiNaYof ADpMAsqtc25eawXRSe4RQPn7Ee9jgLQkI49vJ1T539raePODYRelvtkKHpknCLuZ F8Bkifu7WfG0r8MvCA/Aj85oFBhoN/z+Wnkti6E+Mrqmbp4qrAsCbGe1gzaVHjg/ ob0qhcYxCpDtRQ== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepueelteegieeuhffgkeefgfevjeeigfetkeeitdfgtdeifefhtdfhfeeuffevgfek necukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihiivgepvdenuc frrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id AF0823280064; Tue, 29 Sep 2020 19:19:25 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 05/14] hw/block/nvme: consolidate read, write and write zeroes Date: Wed, 30 Sep 2020 01:19:08 +0200 Message-Id: <20200929231917.433586-6-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Consolidate the read/write and write zeroes functions. Signed-off-by: Klaus Jensen --- hw/block/nvme.h | 11 ++++++++ include/block/nvme.h | 2 ++ hw/block/nvme.c | 65 +++++++++++++++---------------------------- hw/block/trace-events | 3 +- 4 files changed, 37 insertions(+), 44 deletions(-) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index e080a2318a50..ccf52ac7bb82 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -36,6 +36,17 @@ typedef struct NvmeRequest { QTAILQ_ENTRY(NvmeRequest)entry; } NvmeRequest; +static inline bool nvme_req_is_write(NvmeRequest *req) +{ + switch (req->cmd.opcode) { + case NVME_CMD_WRITE: + case NVME_CMD_WRITE_ZEROES: + return true; + default: + return false; + } +} + static inline const char *nvme_adm_opc_str(uint8_t opc) { switch (opc) { diff --git a/include/block/nvme.h b/include/block/nvme.h index 7a30cf285ae0..999b4f8ae0d4 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -438,6 +438,8 @@ typedef struct QEMU_PACKED NvmeCmd { uint32_t cdw15; } NvmeCmd; +#define NVME_CMD_OPCODE_DATA_TRANSFER_MASK 0x3 + #define NVME_CMD_FLAGS_FUSE(flags) (flags & 0x3) #define NVME_CMD_FLAGS_PSDT(flags) ((flags >> 6) & 0x3) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 27af2f0b38d5..795c7e7c529f 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -997,48 +997,19 @@ static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req) return nvme_do_aio(ns->blkconf.blk, 0, 0, req); } -static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; NvmeNamespace *ns = req->ns; + uint64_t slba = le64_to_cpu(rw->slba); uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; - uint64_t offset = nvme_l2b(ns, slba); - uint32_t count = nvme_l2b(ns, nlb); + size_t len = nvme_l2b(ns, nlb); + uint16_t status; - trace_pci_nvme_write_zeroes(nvme_cid(req), nvme_nsid(ns), slba, nlb); - - status = nvme_check_bounds(n, ns, slba, nlb); - if (status) { - trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze); - return status; - } - - return nvme_do_aio(ns->blkconf.blk, offset, count, req); -} - -static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req) -{ - NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; - NvmeNamespace *ns = req->ns; - uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; - uint64_t slba = le64_to_cpu(rw->slba); - - uint64_t data_size = nvme_l2b(ns, nlb); - uint64_t data_offset = nvme_l2b(ns, slba); - enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ? - BLOCK_ACCT_WRITE : BLOCK_ACCT_READ; - uint16_t status; - - trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode), - nvme_nsid(ns), nlb, data_size, slba); - - status = nvme_check_mdts(n, data_size); - if (status) { - trace_pci_nvme_err_mdts(nvme_cid(req), data_size); - goto invalid; - } + trace_pci_nvme_rwz(nvme_cid(req), nvme_io_opc_str(rw->opcode), + nvme_nsid(ns), nlb, len, slba); status = nvme_check_bounds(n, ns, slba, nlb); if (status) { @@ -1046,15 +1017,26 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req) goto invalid; } - status = nvme_map_dptr(n, data_size, req); - if (status) { - goto invalid; + if (req->cmd.opcode & NVME_CMD_OPCODE_DATA_TRANSFER_MASK) { + status = nvme_check_mdts(n, len); + if (status) { + trace_pci_nvme_err_mdts(nvme_cid(req), len); + goto invalid; + } + + status = nvme_map_dptr(n, len, req); + if (status) { + goto invalid; + } } - return nvme_do_aio(ns->blkconf.blk, data_offset, data_size, req); + return nvme_do_aio(ns->blkconf.blk, nvme_l2b(ns, slba), len, req); invalid: - block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct); + block_acct_invalid(blk_get_stats(ns->blkconf.blk), + nvme_req_is_write(req) ? BLOCK_ACCT_WRITE : + BLOCK_ACCT_READ); + return status; } @@ -1082,10 +1064,9 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_CMD_FLUSH: return nvme_flush(n, req); case NVME_CMD_WRITE_ZEROES: - return nvme_write_zeroes(n, req); case NVME_CMD_WRITE: case NVME_CMD_READ: - return nvme_rw(n, req); + return nvme_rwz(n, req); default: trace_pci_nvme_err_invalid_opc(req->cmd.opcode); return NVME_INVALID_OPCODE | NVME_DNR; diff --git a/hw/block/trace-events b/hw/block/trace-events index 9e7507c5abde..b18056c49836 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -40,9 +40,8 @@ pci_nvme_map_prp(uint64_t trans_len, uint32_t len, uint64_t prp1, uint64_t prp2, pci_nvme_map_sgl(uint16_t cid, uint8_t typ, uint64_t len) "cid %"PRIu16" type 0x%"PRIx8" len %"PRIu64"" pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" nsid %"PRIu32" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'" pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'" -pci_nvme_rw(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t count, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" count %"PRIu64" lba 0x%"PRIx64"" +pci_nvme_rwz(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t len, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" len %"PRIu64" lba 0x%"PRIx64"" pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'" -pci_nvme_write_zeroes(uint16_t cid, uint32_t nsid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" nsid %"PRIu32" slba %"PRIu64" nlb %"PRIu32"" pci_nvme_do_aio(uint16_t cid, uint8_t opc, const char *opname, const char *blkname, int64_t offset, size_t len) "cid %"PRIu16" opc 0x%"PRIx8" opname '%s' blk '%s' offset %"PRId64" len %zu" pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16"" pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d" From patchwork Tue Sep 29 23:19:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 304100 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92EBCC4727C for ; Tue, 29 Sep 2020 23:29:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F166420773 for ; Tue, 29 Sep 2020 23:29:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F166420773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55502 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNP3O-0006e2-Md for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:29:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39046) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuD-0000af-Sm; Tue, 29 Sep 2020 19:19:33 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:40333) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuA-00008m-Vn; Tue, 29 Sep 2020 19:19:33 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 1D437EB1; Tue, 29 Sep 2020 19:19:29 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=WVNRKZuD6uQ6i UCjGJsXw+ffVEycdwrhsJ8OcakOO1M=; b=eVZfvIY4rrir49khWJv6QNgiYByxT hS6t1tVv9Mbjj1NscqK1xgOLNiLBLU8y7Oy/YPgUhza8Rlu68zYIOZAh+wL+QAGh gAOUn60qaMPJEeHWb01kTOyFrmcsGoVI564arGe5AAGJfhaYuNxlzJXcBiBN4eDv AbahD0jQTg39mOjG/lPobzWxiDm0XHVOAsmDMJkHrEMSQBONAzFJb/GjdJ0CwWek XZFnVxZ99hrEOD4Dd6nhFBMu2jhuoH8hNbz8jmUI3uAwWDZOKiqYau7ylTnAFWtg FGrIlmyKGmPe62kmfCKQC5N9OOzHIHsFDeMHagDngDchP5HbYZFweOSRg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=WVNRKZuD6uQ6iUCjGJsXw+ffVEycdwrhsJ8OcakOO1M=; b=odTsv3n6 BFmv9Hngps2ceIxjtFFoevE88WVPspDq8CSsf25bY5rf4eYYvbBYP01RZb5Uj1lo zesF8bi74ldyQxHIoYy6MEjJMk87FM+5C9xFddomIA+DvaBiOsx2KrH/M18t2Q2p Ml037pqxebbAGwiTdRMhNe4Ef5VVagkxbym3DygePSAvgC1l62ORcR4ChJWuhg4R ddBaE2ObTzSjl7vFFr7h8jULxNHq9LRLJZZlKGZG5HIGdBYELVeY7td/7nVsoLDx TY/RA43REjGvBNdh/u1M55bswn0/VzcPkPysKfsdi5VaP/A648VpndJt9ix8zyDe gfL76qVRUnPZWg== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepkeduheeftefhvddvgeeiffehkeeugeehhfdugffhhffhveejjeffueehudeguddu necuffhomhgrihhnpehuthhilhhiiigrthhiohhnrdhmrghpnecukfhppeektddrudeije drleekrdduledtnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhf rhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id F269D328005D; Tue, 29 Sep 2020 19:19:26 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 06/14] hw/block/nvme: add support for dulbe and block utilization tracking Date: Wed, 30 Sep 2020 01:19:09 +0200 Message-Id: <20200929231917.433586-7-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen This adds support for reporting the Deallocated or Unwritten Logical Block error (DULBE). This requires tracking the allocated/deallocated status of all logical blocks. Introduce a bitmap that does this. The bitmap is persisted on the new 'pstate' blockdev that is associated with a namespace. If no such drive is attached, the controller will not indicate support for DULBE. Signed-off-by: Klaus Jensen --- docs/specs/nvme.txt | 19 +++++ hw/block/nvme-ns.h | 32 ++++++++ include/block/nvme.h | 5 ++ hw/block/nvme-ns.c | 186 ++++++++++++++++++++++++++++++++++++++++++ hw/block/nvme.c | 130 ++++++++++++++++++++++++++++- hw/block/trace-events | 2 + 6 files changed, 371 insertions(+), 3 deletions(-) diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt index 438ca50d698c..6d00ac064998 100644 --- a/docs/specs/nvme.txt +++ b/docs/specs/nvme.txt @@ -12,6 +12,25 @@ nvme-ns Options namespace. It is specified in terms of a power of two. Only values between 9 and 12 (both inclusive) are supported. + `pstate`; This parameter specifies another blockdev to be used for storing + persistent state such as logical block allocation tracking. Adding this + parameter enables various optional features of the device. + + -drive id=pstate,file=pstate.img,format=raw + -device nvme-ns,pstate=pstate,... + + To reset (or initialize) state, the blockdev image should be of zero size: + + qemu-img create -f raw pstate.img 0 + + The image will be intialized with a file format header and truncated to + the required size. + + If the pstate given is of non-zero size, it will be assumed to contain + previous saved state and will be checked for consistency. If any stored + values (such as lbads) differs from those specified for the nvme-ns + device an error is produced. + Reference Specifications ------------------------ diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 78b0d1a00672..0ad83910dde9 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -19,6 +19,20 @@ #define NVME_NS(obj) \ OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS) +#define NVME_PSTATE_MAGIC ((0x00 << 24) | ('S' << 16) | ('P' << 8) | 'N') +#define NVME_PSTATE_V1 1 + +typedef struct NvmePstateHeader { + uint32_t magic; + uint32_t version; + + int64_t blk_len; + + uint8_t lbads; + + uint8_t rsvd17[4079]; +} QEMU_PACKED NvmePstateHeader; + typedef struct NvmeNamespaceParams { uint32_t nsid; uint8_t lbads; @@ -31,7 +45,20 @@ typedef struct NvmeNamespace { int64_t size; NvmeIdNs id_ns; + struct { + BlockBackend *blk; + + struct { + unsigned long *map; + int64_t offset; + } utilization; + } pstate; + NvmeNamespaceParams params; + + struct { + uint32_t err_rec; + } features; } NvmeNamespace; static inline uint32_t nvme_nsid(NvmeNamespace *ns) @@ -72,4 +99,9 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp); void nvme_ns_drain(NvmeNamespace *ns); void nvme_ns_flush(NvmeNamespace *ns); +static inline void _nvme_ns_check_size(void) +{ + QEMU_BUILD_BUG_ON(sizeof(NvmePstateHeader) != 4096); +} + #endif /* NVME_NS_H */ diff --git a/include/block/nvme.h b/include/block/nvme.h index 999b4f8ae0d4..abd49d371e63 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -683,6 +683,7 @@ enum NvmeStatusCodes { NVME_E2E_REF_ERROR = 0x0284, NVME_CMP_FAILURE = 0x0285, NVME_ACCESS_DENIED = 0x0286, + NVME_DULB = 0x0287, NVME_MORE = 0x2000, NVME_DNR = 0x4000, NVME_NO_COMPLETE = 0xffff, @@ -898,6 +899,9 @@ enum NvmeIdCtrlLpa { #define NVME_AEC_NS_ATTR(aec) ((aec >> 8) & 0x1) #define NVME_AEC_FW_ACTIVATION(aec) ((aec >> 9) & 0x1) +#define NVME_ERR_REC_TLER(err_rec) (err_rec & 0xffff) +#define NVME_ERR_REC_DULBE(err_rec) (err_rec & 0x10000) + enum NvmeFeatureIds { NVME_ARBITRATION = 0x1, NVME_POWER_MANAGEMENT = 0x2, @@ -1018,6 +1022,7 @@ enum NvmeNsIdentifierType { #define NVME_ID_NS_NSFEAT_THIN(nsfeat) ((nsfeat & 0x1)) +#define NVME_ID_NS_NSFEAT_DULBE(nsfeat) ((nsfeat >> 2) & 0x1) #define NVME_ID_NS_FLBAS_EXTENDED(flbas) ((flbas >> 4) & 0x1) #define NVME_ID_NS_FLBAS_INDEX(flbas) ((flbas & 0xf)) #define NVME_ID_NS_MC_SEPARATE(mc) ((mc >> 1) & 0x1) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 576c7486f45b..5e24b1a5dacd 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -25,9 +25,36 @@ #include "hw/qdev-properties.h" #include "hw/qdev-core.h" +#include "trace.h" + #include "nvme.h" #include "nvme-ns.h" +static int nvme_blk_truncate(BlockBackend *blk, size_t len, Error **errp) +{ + int ret; + uint64_t perm, shared_perm; + + blk_get_perm(blk, &perm, &shared_perm); + + ret = blk_set_perm(blk, perm | BLK_PERM_RESIZE, shared_perm, errp); + if (ret < 0) { + return ret; + } + + ret = blk_truncate(blk, len, false, PREALLOC_MODE_OFF, 0, errp); + if (ret < 0) { + return ret; + } + + ret = blk_set_perm(blk, perm, shared_perm, errp); + if (ret < 0) { + return ret; + } + + return 0; +} + static void nvme_ns_init(NvmeNamespace *ns) { NvmeIdNs *id_ns = &ns->id_ns; @@ -45,6 +72,143 @@ static void nvme_ns_init(NvmeNamespace *ns) id_ns->nuse = id_ns->ncap; } +static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) +{ + BlockBackend *blk = ns->pstate.blk; + NvmePstateHeader header; + uint64_t nlbas = nvme_ns_nlbas(ns); + size_t bitmap_len, pstate_len; + int ret; + + ret = nvme_blk_truncate(blk, sizeof(NvmePstateHeader), errp); + if (ret < 0) { + return ret; + } + + header = (NvmePstateHeader) { + .magic = cpu_to_le32(NVME_PSTATE_MAGIC), + .version = cpu_to_le32(NVME_PSTATE_V1), + .blk_len = cpu_to_le64(ns->size), + .lbads = ns->params.lbads, + }; + + ret = blk_pwrite(blk, 0, &header, sizeof(header), 0); + if (ret < 0) { + error_setg_errno(errp, -ret, "could not write pstate header"); + return ret; + } + + bitmap_len = DIV_ROUND_UP(nlbas, sizeof(unsigned long)); + pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len, + BDRV_SECTOR_SIZE); + + ret = nvme_blk_truncate(blk, pstate_len, errp); + if (ret < 0) { + return ret; + } + + ns->pstate.utilization.map = bitmap_new(nlbas); + + return 0; +} + +static int nvme_ns_pstate_load(NvmeNamespace *ns, size_t len, Error **errp) +{ + BlockBackend *blk = ns->pstate.blk; + NvmePstateHeader header; + uint64_t nlbas = nvme_ns_nlbas(ns); + size_t bitmap_len, pstate_len; + unsigned long *map; + int ret; + + ret = blk_pread(blk, 0, &header, sizeof(header)); + if (ret < 0) { + error_setg_errno(errp, -ret, "could not read pstate header"); + return ret; + } + + if (le32_to_cpu(header.magic) != NVME_PSTATE_MAGIC) { + error_setg(errp, "invalid pstate header"); + return -1; + } else if (le32_to_cpu(header.version) > NVME_PSTATE_V1) { + error_setg(errp, "unsupported pstate version"); + return -1; + } + + if (le64_to_cpu(header.blk_len) != ns->size) { + error_setg(errp, "invalid drive size"); + return -1; + } + + if (header.lbads != ns->params.lbads) { + error_setg(errp, "lbads parameter inconsistent with pstate " + "(pstate %u; parameter %u)", + header.lbads, ns->params.lbads); + return -1; + } + + bitmap_len = DIV_ROUND_UP(nlbas, sizeof(unsigned long)); + pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len, + BDRV_SECTOR_SIZE); + + if (len != pstate_len) { + error_setg(errp, "pstate size mismatch " + "(expected %zd bytes; was %zu bytes)", + pstate_len, len); + return -1; + } + + map = bitmap_new(nlbas); + ret = blk_pread(blk, ns->pstate.utilization.offset, map, bitmap_len); + if (ret < 0) { + error_setg_errno(errp, -ret, + "could not read pstate allocation bitmap"); + g_free(map); + return ret; + } + +#ifdef HOST_WORDS_BIGENDIAN + ns->pstate.utilization.map = bitmap_new(nlbas); + bitmap_from_le(ns->pstate.utilization.map, map, nlbas); + g_free(map); +#else + ns->pstate.utilization.map = map; +#endif + + return 0; + +} + +static int nvme_ns_setup_blk_pstate(NvmeNamespace *ns, Error **errp) +{ + BlockBackend *blk = ns->pstate.blk; + uint64_t perm, shared_perm; + ssize_t len; + int ret; + + perm = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE; + shared_perm = BLK_PERM_ALL; + + ret = blk_set_perm(blk, perm, shared_perm, errp); + if (ret) { + return ret; + } + + ns->pstate.utilization.offset = sizeof(NvmePstateHeader); + + len = blk_getlength(blk); + if (len < 0) { + error_setg_errno(errp, -len, "could not determine pstate size"); + return len; + } + + if (!len) { + return nvme_ns_pstate_init(ns, errp); + } + + return nvme_ns_pstate_load(ns, len, errp); +} + static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) { if (!blkconf_blocksizes(&ns->blkconf, errp)) { @@ -96,6 +260,19 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) } nvme_ns_init(ns); + + if (ns->pstate.blk) { + if (nvme_ns_setup_blk_pstate(ns, errp)) { + return -1; + } + + /* + * With a pstate file in place we can enable the Deallocated or + * Unwritten Logical Block Error feature. + */ + ns->id_ns.nsfeat |= 0x4; + } + if (nvme_register_namespace(n, ns, errp)) { return -1; } @@ -106,11 +283,19 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) void nvme_ns_drain(NvmeNamespace *ns) { blk_drain(ns->blkconf.blk); + + if (ns->pstate.blk) { + blk_drain(ns->pstate.blk); + } } void nvme_ns_flush(NvmeNamespace *ns) { blk_flush(ns->blkconf.blk); + + if (ns->pstate.blk) { + blk_flush(ns->pstate.blk); + } } static void nvme_ns_realize(DeviceState *dev, Error **errp) @@ -131,6 +316,7 @@ static Property nvme_ns_props[] = { DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf), DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0), DEFINE_PROP_UINT8("lbads", NvmeNamespace, params.lbads, BDRV_SECTOR_BITS), + DEFINE_PROP_DRIVE("pstate", NvmeNamespace, pstate.blk), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 795c7e7c529f..be5a0a7dfa09 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -105,6 +105,7 @@ static const bool nvme_feature_support[NVME_FID_MAX] = { static const uint32_t nvme_feature_cap[NVME_FID_MAX] = { [NVME_TEMPERATURE_THRESHOLD] = NVME_FEAT_CAP_CHANGE, + [NVME_ERROR_RECOVERY] = NVME_FEAT_CAP_CHANGE | NVME_FEAT_CAP_NS, [NVME_VOLATILE_WRITE_CACHE] = NVME_FEAT_CAP_CHANGE, [NVME_NUMBER_OF_QUEUES] = NVME_FEAT_CAP_CHANGE, [NVME_ASYNCHRONOUS_EVENT_CONF] = NVME_FEAT_CAP_CHANGE, @@ -888,6 +889,78 @@ static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns, return NVME_SUCCESS; } +static inline uint16_t nvme_check_dulbe(NvmeNamespace *ns, uint64_t slba, + uint32_t nlb) +{ + uint64_t elba = slba + nlb; + + if (find_next_zero_bit(ns->pstate.utilization.map, elba, slba) < elba) { + return NVME_DULB; + } + + return NVME_SUCCESS; +} + +static int nvme_allocate(NvmeNamespace *ns, uint64_t slba, uint32_t nlb) +{ + int nlongs, idx; + int64_t offset; + unsigned long *map, *src; + int ret; + + if (!(ns->pstate.blk && nvme_check_dulbe(ns, slba, nlb))) { + return 0; + } + + trace_pci_nvme_allocate(nvme_nsid(ns), slba, nlb); + + bitmap_set(ns->pstate.utilization.map, slba, nlb); + + /* + * The bitmap is an array of unsigned longs, so calculate the index given + * the size of a long. + */ +#if HOST_LONG_BITS == 64 + idx = slba >> 6; +#else /* == 32 */ + idx = slba >> 5; +#endif + + nlongs = BITS_TO_LONGS(nlb); + + /* Unaligned modification (not staring at a long boundary) may modify the + * following long, so account for that. + */ + if (((slba % BITS_PER_LONG) + nlb) > BITS_PER_LONG) { + nlongs += 1; + } + + offset = ns->pstate.utilization.offset + idx * sizeof(unsigned long); + src = ns->pstate.utilization.map; + +#ifdef HOST_WORDS_BIGENDIAN + map = g_new(nlongs, sizeof(unsigned long)); + for (int i = idx; i < idx + nlongs; i++) { +# if HOST_LONG_BITS == 64 + map[i] = bswap64(src[i]); +# else + map[i] = bswap32(src[i]); +# endif + } +#else + map = src; +#endif + + ret = blk_pwrite(ns->pstate.blk, offset, &map[idx], + nlongs * sizeof(unsigned long), 0); + +#ifdef HOST_WORDS_BIGENDIAN + g_free(map); +#endif + return ret; +} + + static void nvme_rw_cb(void *opaque, int ret) { NvmeRequest *req = opaque; @@ -1006,6 +1079,7 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; size_t len = nvme_l2b(ns, nlb); + bool is_write = nvme_req_is_write(req); uint16_t status; trace_pci_nvme_rwz(nvme_cid(req), nvme_io_opc_str(rw->opcode), @@ -1017,6 +1091,16 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) goto invalid; } + if (!is_write) { + if (NVME_ERR_REC_DULBE(ns->features.err_rec)) { + status = nvme_check_dulbe(ns, slba, nlb); + if (status) { + trace_pci_nvme_err_dulbe(nvme_cid(req), slba, nlb); + goto invalid; + } + } + } + if (req->cmd.opcode & NVME_CMD_OPCODE_DATA_TRANSFER_MASK) { status = nvme_check_mdts(n, len); if (status) { @@ -1030,12 +1114,18 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) } } + if (is_write) { + if (nvme_allocate(ns, slba, nlb) < 0) { + status = NVME_INTERNAL_DEV_ERROR; + goto invalid; + } + } + return nvme_do_aio(ns->blkconf.blk, nvme_l2b(ns, slba), len, req); invalid: block_acct_invalid(blk_get_stats(ns->blkconf.blk), - nvme_req_is_write(req) ? BLOCK_ACCT_WRITE : - BLOCK_ACCT_READ); + is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ); return status; } @@ -1638,6 +1728,8 @@ static uint16_t nvme_get_feature_timestamp(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req) { + NvmeNamespace *ns; + NvmeCmd *cmd = &req->cmd; uint32_t dw10 = le32_to_cpu(cmd->cdw10); uint32_t dw11 = le32_to_cpu(cmd->cdw11); @@ -1708,6 +1800,18 @@ static uint16_t nvme_get_feature(NvmeCtrl *n, NvmeRequest *req) } return NVME_INVALID_FIELD | NVME_DNR; + case NVME_ERROR_RECOVERY: + if (!nvme_nsid_valid(n, nsid)) { + return NVME_INVALID_NSID | NVME_DNR; + } + + ns = nvme_ns(n, nsid); + if (unlikely(!ns)) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + result = ns->features.err_rec; + goto out; case NVME_VOLATILE_WRITE_CACHE: result = n->features.vwc; trace_pci_nvme_getfeat_vwcache(result ? "enabled" : "disabled"); @@ -1780,7 +1884,7 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) { - NvmeNamespace *ns; + NvmeNamespace *ns = NULL; NvmeCmd *cmd = &req->cmd; uint32_t dw10 = le32_to_cpu(cmd->cdw10); @@ -1847,6 +1951,26 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) NVME_LOG_SMART_INFO); } + break; + case NVME_ERROR_RECOVERY: + if (nsid == NVME_NSID_BROADCAST) { + for (int i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + + if (!ns) { + continue; + } + + if (NVME_ID_NS_NSFEAT_DULBE(ns->id_ns.nsfeat)) { + ns->features.err_rec = dw11; + } + } + + break; + } + + assert(ns); + ns->features.err_rec = dw11; break; case NVME_VOLATILE_WRITE_CACHE: n->features.vwc = dw11 & 0x1; diff --git a/hw/block/trace-events b/hw/block/trace-events index b18056c49836..774513469274 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -42,6 +42,7 @@ pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode, cons pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'" pci_nvme_rwz(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t len, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" len %"PRIu64" lba 0x%"PRIx64"" pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'" +pci_nvme_allocate(uint32_t ns, uint64_t slba, uint32_t nlb) "nsid %"PRIu32" slba 0x%"PRIx64" nlb %"PRIu32"" pci_nvme_do_aio(uint16_t cid, uint8_t opc, const char *opname, const char *blkname, int64_t offset, size_t len) "cid %"PRIu16" opc 0x%"PRIx8" opname '%s' blk '%s' offset %"PRId64" len %zu" pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16"" pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d" @@ -89,6 +90,7 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared" # nvme traces for error conditions pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" pci_nvme_err_req_status(uint16_t cid, uint32_t nsid, uint16_t status, uint8_t opc) "cid %"PRIu16" nsid %"PRIu32" status 0x%"PRIx16" opc 0x%"PRIx8"" +pci_nvme_err_dulbe(uint16_t cid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" slba 0x%"PRIx64" nlb %"PRIu32"" pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_err_cfs(void) "controller fatal status" From patchwork Tue Sep 29 23:19:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 272426 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93734C4727C for ; Tue, 29 Sep 2020 23:29:00 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EBDEB20773 for ; Tue, 29 Sep 2020 23:28:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EBDEB20773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55452 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNP3K-0006ct-Gq for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:28:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39054) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuE-0000ba-Bs; Tue, 29 Sep 2020 19:19:34 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:36637) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuC-00008v-7L; Tue, 29 Sep 2020 19:19:34 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 3E465F56; Tue, 29 Sep 2020 19:19:30 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=bQBaNVsviUuNq Z4cWVhi6pXtTyURdFG4wBBhMYKvQ84=; b=P6Ywob8A1079YPpErF0Ma2a+PXuBo NDYIpBypFzHkArPBMYDCW/6asqMNMBh6rpoJAju0R+w+2aykZFIEp82C/Ii3asm2 Xj9eeckoM45V1ppx9b6GZLQNVjWZSStZJVfB443LPxk7xhm1BPsspBHO5FyTixsY XZiw+R5JKTGmKmCOO2wgZQRCqL0oWsMlQxYKREafcIMHHfjvnajHcZ4yn7pX6Utb eMqLk1T39ED++MkVSNjb9EXIIb7LzZ+wR58lE0f/l5bLJqvHfEEB79Iu4mvVp5Z7 XERd+Rwr0dfxy41XlmpLyRCUOZC0lsIlLvnOoCGOv/t6nGKyJeNkcRGTQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=bQBaNVsviUuNqZ4cWVhi6pXtTyURdFG4wBBhMYKvQ84=; b=tUBeeAd7 4Fxf5E07j1MiGrDW1cAXHj4CAPk8h9Xmv+PRyf8GcOVWl3qLpNrOi1JbvwosN3zO /+TWIBFNBw3UrWajhgS9oO9e6LkUfjk6sR3vBF6MjvES1RCZhBzcKEWVvPBjkGCi 9VlvwFabJEPykW71XZxTC8Kdzxr+U1zYCjqXh4kslTyb9y+dEiPptcOS8PFNZf8O NuuFnzEVepsuP5A6+VC0dMY0X47ttap6b81Wvn9QS4XUqBeNqA9I45ow2sYjnvgN HeH1OfPXfjWfYKyrc6S94OntjJOj7eK/9JShS8ixNKeplGbzMjMcwNq75oEKaKxA g64/I728iE6rNA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepueelteegieeuhffgkeefgfevjeeigfetkeeitdfgtdeifefhtdfhfeeuffevgfek necukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihiivgepvdenuc frrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 5DA113280064; Tue, 29 Sep 2020 19:19:28 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 07/14] hw/block/nvme: add commands supported and effects log page Date: Wed, 30 Sep 2020 01:19:10 +0200 Message-Id: <20200929231917.433586-8-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Gollu Appalanaidu , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Gollu Appalanaidu This is to support for the Commands Supported and Effects log page. See NVM Express Spec 1.3d, sec. 5.14.1.5 ("Commands Supported and Effects") Signed-off-by: Gollu Appalanaidu Signed-off-by: Klaus Jensen --- include/block/nvme.h | 21 +++++++++++++ hw/block/nvme.c | 75 +++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 95 insertions(+), 1 deletion(-) diff --git a/include/block/nvme.h b/include/block/nvme.h index abd49d371e63..5a5e19f6bedc 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -734,6 +734,24 @@ typedef struct QEMU_PACKED NvmeSmartLog { uint8_t reserved2[320]; } NvmeSmartLog; +typedef struct QEMU_PACKED NvmeEffectsLog { + uint32_t acs[256]; + uint32_t iocs[256]; + uint8_t rsvd2048[2048]; +} NvmeEffectsLog; + +enum { + NVME_EFFECTS_CSUPP = 1 << 0, + NVME_EFFECTS_LBCC = 1 << 1, + NVME_EFFECTS_NCC = 1 << 2, + NVME_EFFECTS_NIC = 1 << 3, + NVME_EFFECTS_CCC = 1 << 4, + NVME_EFFECTS_CSE_SINGLE = 1 << 16, + NVME_EFFECTS_CSE_MULTI = 1 << 17, + NVME_EFFECTS_CSE_MASK = 3 << 16, + NVME_EFFECTS_UUID_SEL = 1 << 19, +}; + enum NvmeSmartWarn { NVME_SMART_SPARE = 1 << 0, NVME_SMART_TEMPERATURE = 1 << 1, @@ -746,6 +764,7 @@ enum NvmeLogIdentifier { NVME_LOG_ERROR_INFO = 0x01, NVME_LOG_SMART_INFO = 0x02, NVME_LOG_FW_SLOT_INFO = 0x03, + NVME_LOG_EFFECTS = 0x05, }; typedef struct QEMU_PACKED NvmePSD { @@ -857,6 +876,7 @@ enum NvmeIdCtrlFrmw { }; enum NvmeIdCtrlLpa { + NVME_LPA_EFFECTS_LOG = 1 << 1, NVME_LPA_EXTENDED = 1 << 2, }; @@ -1064,5 +1084,6 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); + QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); } #endif diff --git a/hw/block/nvme.c b/hw/block/nvme.c index be5a0a7dfa09..0bc19a1e3688 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -81,6 +81,7 @@ #define NVME_TEMPERATURE_WARNING 0x157 #define NVME_TEMPERATURE_CRITICAL 0x175 #define NVME_NUM_FW_SLOTS 1 +#define NVME_MAX_ADM_IO_CMDS 0xFF #define NVME_GUEST_ERR(trace, fmt, ...) \ do { \ @@ -112,6 +113,46 @@ static const uint32_t nvme_feature_cap[NVME_FID_MAX] = { [NVME_TIMESTAMP] = NVME_FEAT_CAP_CHANGE, }; +#define NVME_EFFECTS_ADMIN_INITIALIZER \ + [NVME_ADM_CMD_DELETE_SQ] = NVME_EFFECTS_CSUPP, \ + [NVME_ADM_CMD_CREATE_SQ] = NVME_EFFECTS_CSUPP, \ + [NVME_ADM_CMD_GET_LOG_PAGE] = NVME_EFFECTS_CSUPP, \ + [NVME_ADM_CMD_DELETE_CQ] = NVME_EFFECTS_CSUPP, \ + [NVME_ADM_CMD_CREATE_CQ] = NVME_EFFECTS_CSUPP, \ + [NVME_ADM_CMD_IDENTIFY] = NVME_EFFECTS_CSUPP, \ + [NVME_ADM_CMD_ABORT] = NVME_EFFECTS_CSUPP, \ + [NVME_ADM_CMD_SET_FEATURES] = NVME_EFFECTS_CSUPP | \ + NVME_EFFECTS_CCC | \ + NVME_EFFECTS_NIC | \ + NVME_EFFECTS_NCC, \ + [NVME_ADM_CMD_GET_FEATURES] = NVME_EFFECTS_CSUPP, \ + [NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_EFFECTS_CSUPP + +#define NVME_EFFECTS_NVM_INITIALIZER \ + [NVME_CMD_FLUSH] = NVME_EFFECTS_CSUPP | \ + NVME_EFFECTS_LBCC, \ + [NVME_CMD_WRITE] = NVME_EFFECTS_CSUPP | \ + NVME_EFFECTS_LBCC, \ + [NVME_CMD_READ] = NVME_EFFECTS_CSUPP, \ + [NVME_CMD_WRITE_ZEROES] = NVME_EFFECTS_CSUPP | \ + NVME_EFFECTS_LBCC + +static const NvmeEffectsLog nvme_effects_admin_only = { + .acs = { + NVME_EFFECTS_ADMIN_INITIALIZER, + }, +}; + +static const NvmeEffectsLog nvme_effects = { + .acs = { + NVME_EFFECTS_ADMIN_INITIALIZER, + }, + + .iocs = { + NVME_EFFECTS_NVM_INITIALIZER, + }, +}; + static void nvme_process_sq(void *opaque); static uint16_t nvme_cid(NvmeRequest *req) @@ -1382,6 +1423,36 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len, DMA_DIRECTION_FROM_DEVICE, req); } +static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off, + NvmeRequest *req) +{ + const NvmeEffectsLog *effects; + + uint32_t trans_len; + + if (off > sizeof(NvmeEffectsLog)) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + switch (NVME_CC_CSS(n->bar.cc)) { + case NVME_CC_CSS_ADMIN_ONLY: + effects = &nvme_effects_admin_only; + break; + + case NVME_CC_CSS_NVM: + effects = &nvme_effects; + break; + + default: + return NVME_INTERNAL_DEV_ERROR | NVME_DNR; + } + + trans_len = MIN(sizeof(NvmeEffectsLog) - off, buf_len); + + return nvme_dma(n, (uint8_t *)effects + off, trans_len, + DMA_DIRECTION_FROM_DEVICE, req); +} + static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) { NvmeCmd *cmd = &req->cmd; @@ -1425,6 +1496,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) return nvme_smart_info(n, rae, len, off, req); case NVME_LOG_FW_SLOT_INFO: return nvme_fw_log_info(n, len, off, req); + case NVME_LOG_EFFECTS: + return nvme_effects_log(n, len, off, req); default: trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid); return NVME_INVALID_FIELD | NVME_DNR; @@ -2858,7 +2931,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) id->acl = 3; id->aerl = n->params.aerl; id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO; - id->lpa = NVME_LPA_EXTENDED; + id->lpa = NVME_LPA_EFFECTS_LOG | NVME_LPA_EXTENDED; /* recommended default value (~70 C) */ id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING); From patchwork Tue Sep 29 23:19:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 272424 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB33EC4727C for ; Tue, 29 Sep 2020 23:36:53 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2600520773 for ; Tue, 29 Sep 2020 23:36:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2600520773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:37052 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNPAx-0002mT-TP for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:36:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39082) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuH-0000gE-CH; Tue, 29 Sep 2020 19:19:37 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:52227) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuD-000095-Ve; Tue, 29 Sep 2020 19:19:37 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id ECD0EE24; Tue, 29 Sep 2020 19:19:31 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:32 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=vaaO2kwzF5DTy G52WHJcQ/CxLNWStLFy0M1KFAMYffw=; b=r2tYY8+/eO2FCPJNkbDT+X/QS7lKg RkmOU88sQT/8FwO3bFS7pCrpPUdsBwuGw411+Kgm+mr4Ql0NMa/Go32eNyMvEpmI n7S9DLIh8XA/PfnFp7U3V/fRKiFigc3NUx9+rkyAwhIWOgbOfdiba9G3Vpkjtj1u e7IefmzMBXDN5PtvJPoY99STjM7LFsLSmiGLICRkwkisphaw4PkI530jZf/vn3Nr nS6D+aaCMEPxyikTUUwUlRKv8EotNiygILr3YtsoXDZZ100nwo69znbiR5rGvSKE 7UtLXWCg7FwhxuieB9ewULRCBUnGUb8Rg4nS91OM3s3gBwIuxmlDB15Gg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=vaaO2kwzF5DTyG52WHJcQ/CxLNWStLFy0M1KFAMYffw=; b=exUKagB1 OYqDgv7aAl/Eh41UYngZdzAnnalKN67ZpAHRZR47P8xyq+BK837qCje7m3r9OBkJ HrL+5EZfyzNj3DSDZlssEAkf8BoopdGkTI9wN2cvQNXfcz6JYWOQXgVSjDBrDlVM IOjjgdb4u947jNspUuBdAf+D725jBV1LuMPF9ZBk+x2a6TIKy6OQxHnpZWy4/ofy DaPCvevu8xT/RP/V3VHud+gwQ6rl8Dh8iwIfPhF7AJIB2ILq6zyt+ZcbjzIbKnSI xWkIR00/m9zm/Ze5wdPoq+Iw9pH+2pPlRRcRq2WjJh4U0KvRYnJ6N4KIX7zTfjXU aabulb6JAuKX1A== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepueelteegieeuhffgkeefgfevjeeigfetkeeitdfgtdeifefhtdfhfeeuffevgfek necukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihiivgepheenuc frrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id BBDA73280068; Tue, 29 Sep 2020 19:19:29 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 08/14] hw/block/nvme: support namespace types Date: Wed, 30 Sep 2020 01:19:11 +0200 Message-Id: <20200929231917.433586-9-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Implement support for TP 4056 ("Namespace Types"). This adds the 'iocs' (I/O Command Set) device parameter to the nvme-ns device. Signed-off-by: Klaus Jensen --- docs/specs/nvme.txt | 3 + hw/block/nvme-ns.h | 14 ++- hw/block/nvme.h | 3 + include/block/nvme.h | 58 +++++++++-- block/nvme.c | 4 +- hw/block/nvme-ns.c | 29 +++++- hw/block/nvme.c | 228 ++++++++++++++++++++++++++++++++++-------- hw/block/trace-events | 6 +- 8 files changed, 285 insertions(+), 60 deletions(-) diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt index 6d00ac064998..a13c7a5dbe86 100644 --- a/docs/specs/nvme.txt +++ b/docs/specs/nvme.txt @@ -12,6 +12,9 @@ nvme-ns Options namespace. It is specified in terms of a power of two. Only values between 9 and 12 (both inclusive) are supported. + `iocs`; The "I/O Command Set" associated with the namespace. E.g. 0x0 for the + NVM Command Set (the default), or 0x2 for the Zoned Namespace Command Set. + `pstate`; This parameter specifies another blockdev to be used for storing persistent state such as logical block allocation tracking. Adding this parameter enables various optional features of the device. diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 0ad83910dde9..d39fa79fa682 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -29,12 +29,14 @@ typedef struct NvmePstateHeader { int64_t blk_len; uint8_t lbads; + uint8_t iocs; - uint8_t rsvd17[4079]; + uint8_t rsvd18[4078]; } QEMU_PACKED NvmePstateHeader; typedef struct NvmeNamespaceParams { uint32_t nsid; + uint8_t iocs; uint8_t lbads; } NvmeNamespaceParams; @@ -43,7 +45,8 @@ typedef struct NvmeNamespace { BlockConf blkconf; int32_t bootindex; int64_t size; - NvmeIdNs id_ns; + uint8_t iocs; + void *id_ns[NVME_IOCS_MAX]; struct { BlockBackend *blk; @@ -70,9 +73,14 @@ static inline uint32_t nvme_nsid(NvmeNamespace *ns) return -1; } +static inline NvmeIdNsNvm *nvme_ns_id_nvm(NvmeNamespace *ns) +{ + return ns->id_ns[NVME_IOCS_NVM]; +} + static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns) { - NvmeIdNs *id_ns = &ns->id_ns; + NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)]; } diff --git a/hw/block/nvme.h b/hw/block/nvme.h index ccf52ac7bb82..f66ed9ab7eff 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -123,6 +123,7 @@ typedef struct NvmeFeatureVal { }; uint32_t async_config; uint32_t vwc; + uint32_t iocsci; } NvmeFeatureVal; typedef struct NvmeCtrl { @@ -150,6 +151,7 @@ typedef struct NvmeCtrl { uint64_t timestamp_set_qemu_clock_ms; /* QEMU clock time */ uint64_t starttime_ms; uint16_t temperature; + uint64_t iocscs[512]; HostMemoryBackend *pmrdev; @@ -165,6 +167,7 @@ typedef struct NvmeCtrl { NvmeSQueue admin_sq; NvmeCQueue admin_cq; NvmeIdCtrl id_ctrl; + void *id_ctrl_iocss[NVME_IOCS_MAX]; NvmeFeatureVal features; } NvmeCtrl; diff --git a/include/block/nvme.h b/include/block/nvme.h index 5a5e19f6bedc..792fccf8c81f 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -82,6 +82,11 @@ enum NvmeCapMask { #define NVME_CAP_SET_PMRS(cap, val) (cap |= (uint64_t)(val & CAP_PMR_MASK)\ << CAP_PMR_SHIFT) +enum NvmeCapCss { + NVME_CAP_CSS_NVM = 1 << 0, + NVME_CAP_CSS_CSI = 1 << 6, +}; + enum NvmeCcShift { CC_EN_SHIFT = 0, CC_CSS_SHIFT = 4, @@ -112,6 +117,7 @@ enum NvmeCcMask { enum NvmeCcCss { NVME_CC_CSS_NVM = 0x0, + NVME_CC_CSS_ALL = 0x6, NVME_CC_CSS_ADMIN_ONLY = 0x7, }; @@ -383,6 +389,11 @@ enum NvmePmrmscMask { #define NVME_PMRMSC_SET_CBA(pmrmsc, val) \ (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT) +enum NvmeCommandSet { + NVME_IOCS_NVM = 0x0, + NVME_IOCS_MAX = 0x1, +}; + enum NvmeSglDescriptorType { NVME_SGL_DESCR_TYPE_DATA_BLOCK = 0x0, NVME_SGL_DESCR_TYPE_BIT_BUCKET = 0x1, @@ -531,8 +542,13 @@ typedef struct QEMU_PACKED NvmeIdentify { uint64_t rsvd2[2]; uint64_t prp1; uint64_t prp2; - uint32_t cns; - uint32_t rsvd11[5]; + uint8_t cns; + uint8_t rsvd3; + uint16_t cntid; + uint16_t nvmsetid; + uint8_t rsvd4; + uint8_t csi; + uint32_t rsvd11[4]; } NvmeIdentify; typedef struct QEMU_PACKED NvmeRwCmd { @@ -624,8 +640,15 @@ typedef struct QEMU_PACKED NvmeAerResult { } NvmeAerResult; typedef struct QEMU_PACKED NvmeCqe { - uint32_t result; - uint32_t rsvd; + union { + struct { + uint32_t dw0; + uint32_t dw1; + }; + + uint64_t qw0; + }; + uint16_t sq_head; uint16_t sq_id; uint16_t cid; @@ -673,6 +696,10 @@ enum NvmeStatusCodes { NVME_FEAT_NOT_CHANGEABLE = 0x010e, NVME_FEAT_NOT_NS_SPEC = 0x010f, NVME_FW_REQ_SUSYSTEM_RESET = 0x0110, + NVME_IOCS_NOT_SUPPORTED = 0x0127, + NVME_IOCS_NOT_ENABLED = 0x0128, + NVME_IOCS_COMB_REJECTED = 0x0129, + NVME_INVALID_IOCS = 0x0126, NVME_CONFLICTING_ATTRS = 0x0180, NVME_INVALID_PROT_INFO = 0x0181, NVME_WRITE_TO_RO = 0x0182, @@ -734,6 +761,8 @@ typedef struct QEMU_PACKED NvmeSmartLog { uint8_t reserved2[320]; } NvmeSmartLog; +#define NVME_ACS_MAX 256 + typedef struct QEMU_PACKED NvmeEffectsLog { uint32_t acs[256]; uint32_t iocs[256]; @@ -782,10 +811,14 @@ typedef struct QEMU_PACKED NvmePSD { #define NVME_IDENTIFY_DATA_SIZE 4096 enum { - NVME_ID_CNS_NS = 0x0, - NVME_ID_CNS_CTRL = 0x1, - NVME_ID_CNS_NS_ACTIVE_LIST = 0x2, - NVME_ID_CNS_NS_DESCR_LIST = 0x3, + NVME_ID_CNS_NS = 0x00, + NVME_ID_CNS_CTRL = 0x01, + NVME_ID_CNS_NS_ACTIVE_LIST = 0x02, + NVME_ID_CNS_NS_DESCR_LIST = 0x03, + NVME_ID_CNS_NS_IOCS = 0x05, + NVME_ID_CNS_CTRL_IOCS = 0x06, + NVME_ID_CNS_NS_ACTIVE_LIST_IOCS = 0x07, + NVME_ID_CNS_IOCS = 0x1c, }; typedef struct QEMU_PACKED NvmeIdCtrl { @@ -935,6 +968,7 @@ enum NvmeFeatureIds { NVME_WRITE_ATOMICITY = 0xa, NVME_ASYNCHRONOUS_EVENT_CONF = 0xb, NVME_TIMESTAMP = 0xe, + NVME_COMMAND_SET_PROFILE = 0x19, NVME_SOFTWARE_PROGRESS_MARKER = 0x80, NVME_FID_MAX = 0x100, }; @@ -983,7 +1017,7 @@ typedef struct QEMU_PACKED NvmeLBAF { #define NVME_NSID_BROADCAST 0xffffffff -typedef struct QEMU_PACKED NvmeIdNs { +typedef struct QEMU_PACKED NvmeIdNsNvm { uint64_t nsze; uint64_t ncap; uint64_t nuse; @@ -1011,7 +1045,7 @@ typedef struct QEMU_PACKED NvmeIdNs { NvmeLBAF lbaf[16]; uint8_t rsvd192[192]; uint8_t vs[3712]; -} NvmeIdNs; +} NvmeIdNsNvm; typedef struct QEMU_PACKED NvmeIdNsDescr { uint8_t nidt; @@ -1023,12 +1057,14 @@ enum { NVME_NIDT_EUI64_LEN = 8, NVME_NIDT_NGUID_LEN = 16, NVME_NIDT_UUID_LEN = 16, + NVME_NIDT_CSI_LEN = 1, }; enum NvmeNsIdentifierType { NVME_NIDT_EUI64 = 0x1, NVME_NIDT_NGUID = 0x2, NVME_NIDT_UUID = 0x3, + NVME_NIDT_CSI = 0x4, }; /*Deallocate Logical Block Features*/ @@ -1081,7 +1117,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); - QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsNvm) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); diff --git a/block/nvme.c b/block/nvme.c index f4f27b6da7d8..b5cc0185c298 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -336,7 +336,7 @@ static inline int nvme_translate_error(const NvmeCqe *c) { uint16_t status = (le16_to_cpu(c->status) >> 1) & 0xFF; if (status) { - trace_nvme_error(le32_to_cpu(c->result), + trace_nvme_error(le32_to_cpu(c->dw0), le16_to_cpu(c->sq_head), le16_to_cpu(c->sq_id), le16_to_cpu(c->cid), @@ -503,7 +503,7 @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp) BDRVNVMeState *s = bs->opaque; union { NvmeIdCtrl ctrl; - NvmeIdNs ns; + NvmeIdNsNvm ns; } *id; NvmeLBAF *lbaf; uint16_t oncs; diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 5e24b1a5dacd..9eabab29f32c 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -57,10 +57,15 @@ static int nvme_blk_truncate(BlockBackend *blk, size_t len, Error **errp) static void nvme_ns_init(NvmeNamespace *ns) { - NvmeIdNs *id_ns = &ns->id_ns; + NvmeIdNsNvm *id_ns; + + ns->id_ns[NVME_IOCS_NVM] = g_new0(NvmeIdNsNvm, 1); + id_ns = nvme_ns_id_nvm(ns); + + ns->iocs = ns->params.iocs; if (blk_get_flags(ns->blkconf.blk) & BDRV_O_UNMAP) { - ns->id_ns.dlfeat = 0x9; + id_ns->dlfeat = 0x9; } id_ns->lbaf[0].ds = ns->params.lbads; @@ -90,6 +95,7 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) .version = cpu_to_le32(NVME_PSTATE_V1), .blk_len = cpu_to_le64(ns->size), .lbads = ns->params.lbads, + .iocs = ns->params.iocs, }; ret = blk_pwrite(blk, 0, &header, sizeof(header), 0); @@ -147,6 +153,13 @@ static int nvme_ns_pstate_load(NvmeNamespace *ns, size_t len, Error **errp) return -1; } + if (header.iocs != ns->params.iocs) { + error_setg(errp, "iocs parameter inconsistent with pstate " + "(pstate %u; parameter %u)", + header.iocs, ns->params.iocs); + return -1; + } + bitmap_len = DIV_ROUND_UP(nlbas, sizeof(unsigned long)); pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len, BDRV_SECTOR_SIZE); @@ -246,6 +259,14 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp) return -1; } + switch (ns->params.iocs) { + case NVME_IOCS_NVM: + break; + default: + error_setg(errp, "unsupported iocs"); + return -1; + } + return 0; } @@ -270,7 +291,8 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) * With a pstate file in place we can enable the Deallocated or * Unwritten Logical Block Error feature. */ - ns->id_ns.nsfeat |= 0x4; + NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); + id_ns->nsfeat |= 0x4; } if (nvme_register_namespace(n, ns, errp)) { @@ -317,6 +339,7 @@ static Property nvme_ns_props[] = { DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0), DEFINE_PROP_UINT8("lbads", NvmeNamespace, params.lbads, BDRV_SECTOR_BITS), DEFINE_PROP_DRIVE("pstate", NvmeNamespace, pstate.blk), + DEFINE_PROP_UINT8("iocs", NvmeNamespace, params.iocs, NVME_IOCS_NVM), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 0bc19a1e3688..53cb8c2ed4a7 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -102,6 +102,7 @@ static const bool nvme_feature_support[NVME_FID_MAX] = { [NVME_WRITE_ATOMICITY] = true, [NVME_ASYNCHRONOUS_EVENT_CONF] = true, [NVME_TIMESTAMP] = true, + [NVME_COMMAND_SET_PROFILE] = true, }; static const uint32_t nvme_feature_cap[NVME_FID_MAX] = { @@ -111,6 +112,7 @@ static const uint32_t nvme_feature_cap[NVME_FID_MAX] = { [NVME_NUMBER_OF_QUEUES] = NVME_FEAT_CAP_CHANGE, [NVME_ASYNCHRONOUS_EVENT_CONF] = NVME_FEAT_CAP_CHANGE, [NVME_TIMESTAMP] = NVME_FEAT_CAP_CHANGE, + [NVME_COMMAND_SET_PROFILE] = NVME_FEAT_CAP_CHANGE, }; #define NVME_EFFECTS_ADMIN_INITIALIZER \ @@ -143,13 +145,15 @@ static const NvmeEffectsLog nvme_effects_admin_only = { }, }; -static const NvmeEffectsLog nvme_effects = { - .acs = { - NVME_EFFECTS_ADMIN_INITIALIZER, - }, +static const NvmeEffectsLog nvme_effects[NVME_IOCS_MAX] = { + [NVME_IOCS_NVM] = { + .acs = { + NVME_EFFECTS_ADMIN_INITIALIZER, + }, - .iocs = { - NVME_EFFECTS_NVM_INITIALIZER, + .iocs = { + NVME_EFFECTS_NVM_INITIALIZER, + }, }, }; @@ -861,7 +865,7 @@ static void nvme_process_aers(void *opaque) req = n->aer_reqs[n->outstanding_aers]; - result = (NvmeAerResult *) &req->cqe.result; + result = (NvmeAerResult *) &req->cqe.dw0; result->event_type = event->result.event_type; result->event_info = event->result.event_info; result->log_page = event->result.log_page; @@ -921,7 +925,8 @@ static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len) static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns, uint64_t slba, uint32_t nlb) { - uint64_t nsze = le64_to_cpu(ns->id_ns.nsze); + NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); + uint64_t nsze = le64_to_cpu(id_ns->nsze); if (unlikely(UINT64_MAX - slba < nlb || slba + nlb > nsze)) { return NVME_LBA_RANGE | NVME_DNR; @@ -1128,7 +1133,8 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) status = nvme_check_bounds(n, ns, slba, nlb); if (status) { - trace_pci_nvme_err_invalid_lba_range(slba, nlb, ns->id_ns.nsze); + NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); + trace_pci_nvme_err_invalid_lba_range(slba, nlb, id_ns->nsze); goto invalid; } @@ -1174,9 +1180,10 @@ invalid: static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) { uint32_t nsid = le32_to_cpu(req->cmd.nsid); + uint8_t opc = req->cmd.opcode; trace_pci_nvme_io_cmd(nvme_cid(req), nsid, nvme_sqid(req), - req->cmd.opcode, nvme_io_opc_str(req->cmd.opcode)); + opc, nvme_io_opc_str(req->cmd.opcode)); if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_ADMIN_ONLY) { return NVME_INVALID_OPCODE | NVME_DNR; @@ -1191,6 +1198,10 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } + if (!(nvme_effects[req->ns->iocs].iocs[opc] & NVME_EFFECTS_CSUPP)) { + return NVME_INVALID_OPCODE | NVME_DNR; + } + switch (req->cmd.opcode) { case NVME_CMD_FLUSH: return nvme_flush(n, req); @@ -1429,6 +1440,7 @@ static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off, const NvmeEffectsLog *effects; uint32_t trans_len; + uint8_t csi = le32_to_cpu(req->cmd.cdw14) >> 24; if (off > sizeof(NvmeEffectsLog)) { return NVME_INVALID_FIELD | NVME_DNR; @@ -1440,7 +1452,17 @@ static uint16_t nvme_effects_log(NvmeCtrl *n, uint32_t buf_len, uint64_t off, break; case NVME_CC_CSS_NVM: - effects = &nvme_effects; + effects = &nvme_effects[NVME_IOCS_NVM]; + break; + + case NVME_CC_CSS_ALL: + if (!(n->iocscs[n->features.iocsci] & (1 << csi))) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + assert(csi < NVME_IOCS_MAX); + + effects = &nvme_effects[csi]; break; default: @@ -1610,39 +1632,94 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req) return NVME_SUCCESS; } -static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_ctrl(NvmeCtrl *n, uint8_t cns, uint8_t csi, + NvmeRequest *req) { + NvmeIdCtrl empty = { 0 }; + NvmeIdCtrl *id_ctrl = ∅ + trace_pci_nvme_identify_ctrl(); - return nvme_dma(n, (uint8_t *)&n->id_ctrl, sizeof(n->id_ctrl), + switch (cns) { + case NVME_ID_CNS_CTRL: + id_ctrl = &n->id_ctrl; + + break; + + case NVME_ID_CNS_CTRL_IOCS: + if (!(n->iocscs[n->features.iocsci] & (1 << csi))) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + assert(csi < NVME_IOCS_MAX); + + if (n->id_ctrl_iocss[csi]) { + id_ctrl = n->id_ctrl_iocss[csi]; + } + + break; + + default: + return NVME_INVALID_FIELD | NVME_DNR; + } + + return nvme_dma(n, (uint8_t *)id_ctrl, sizeof(*id_ctrl), DMA_DIRECTION_FROM_DEVICE, req); } -static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_ns(NvmeCtrl *n, uint8_t cns, uint8_t csi, + NvmeRequest *req) { + NvmeIdNsNvm empty = { 0 }; + void *id_ns = ∅ NvmeNamespace *ns; NvmeIdentify *c = (NvmeIdentify *)&req->cmd; - NvmeIdNs *id_ns, inactive = { 0 }; uint32_t nsid = le32_to_cpu(c->nsid); - trace_pci_nvme_identify_ns(nsid); + trace_pci_nvme_identify_ns(nsid, csi); if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) { return NVME_INVALID_NSID | NVME_DNR; } ns = nvme_ns(n, nsid); - if (unlikely(!ns)) { - id_ns = &inactive; - } else { - id_ns = &ns->id_ns; + if (ns) { + switch (cns) { + case NVME_ID_CNS_NS: + id_ns = ns->id_ns[NVME_IOCS_NVM]; + if (!id_ns) { + return NVME_INVALID_IOCS | NVME_DNR; + } + + break; + + case NVME_ID_CNS_NS_IOCS: + if (csi == NVME_IOCS_NVM) { + break; + } + + if (csi >= NVME_IOCS_MAX) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + id_ns = ns->id_ns[csi]; + if (!id_ns) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + break; + + default: + return NVME_INVALID_FIELD | NVME_DNR; + } } - return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs), + return nvme_dma(n, (uint8_t *)id_ns, NVME_IDENTIFY_DATA_SIZE, DMA_DIRECTION_FROM_DEVICE, req); } -static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) +static uint16_t nvme_identify_nslist(NvmeCtrl *n, uint8_t cns, uint8_t csi, + NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; static const int data_len = NVME_IDENTIFY_DATA_SIZE; @@ -1651,7 +1728,7 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) uint16_t ret; int j = 0; - trace_pci_nvme_identify_nslist(min_nsid); + trace_pci_nvme_identify_nslist(min_nsid, csi); /* * Both 0xffffffff (NVME_NSID_BROADCAST) and 0xfffffffe are invalid values @@ -1663,11 +1740,21 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_NSID | NVME_DNR; } + if (cns == NVME_ID_CNS_NS_ACTIVE_LIST_IOCS && !csi) { + return NVME_INVALID_FIELD | NVME_DNR; + } + list = g_malloc0(data_len); for (int i = 1; i <= n->num_namespaces; i++) { - if (i <= min_nsid || !nvme_ns(n, i)) { + NvmeNamespace *ns = nvme_ns(n, i); + if (i <= min_nsid || !ns) { continue; } + + if (cns == NVME_ID_CNS_NS_ACTIVE_LIST_IOCS && csi && csi != ns->iocs) { + continue; + } + list[j++] = cpu_to_le32(i); if (j == data_len / sizeof(uint32_t)) { break; @@ -1683,6 +1770,7 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; uint32_t nsid = le32_to_cpu(c->nsid); + NvmeNamespace *ns; uint8_t list[NVME_IDENTIFY_DATA_SIZE]; struct data { @@ -1690,6 +1778,11 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) NvmeIdNsDescr hdr; uint8_t v[16]; } uuid; + + struct { + NvmeIdNsDescr hdr; + uint8_t v; + } iocs; }; struct data *ns_descrs = (struct data *)list; @@ -1700,7 +1793,8 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_NSID | NVME_DNR; } - if (unlikely(!nvme_ns(n, nsid))) { + ns = nvme_ns(n, nsid); + if (unlikely(!ns)) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1716,25 +1810,45 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN; stl_be_p(&ns_descrs->uuid.v, nsid); + ns_descrs->iocs.hdr.nidt = NVME_NIDT_CSI; + ns_descrs->iocs.hdr.nidl = NVME_NIDT_CSI_LEN; + stb_p(&ns_descrs->iocs.v, ns->iocs); + return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE, DMA_DIRECTION_FROM_DEVICE, req); } +static uint16_t nvme_identify_iocs(NvmeCtrl *n, uint16_t cntid, + NvmeRequest *req) +{ + return nvme_dma(n, (uint8_t *) n->iocscs, sizeof(n->iocscs), + DMA_DIRECTION_FROM_DEVICE, req); +} + static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + trace_pci_nvme_identify(nvme_cid(req), le32_to_cpu(c->nsid), + le16_to_cpu(c->cntid), c->cns, c->csi, + le16_to_cpu(c->nvmsetid)); + switch (le32_to_cpu(c->cns)) { case NVME_ID_CNS_NS: - return nvme_identify_ns(n, req); + case NVME_ID_CNS_NS_IOCS: + return nvme_identify_ns(n, c->cns, c->csi, req); case NVME_ID_CNS_CTRL: - return nvme_identify_ctrl(n, req); + case NVME_ID_CNS_CTRL_IOCS: + return nvme_identify_ctrl(n, c->cns, c->csi, req); case NVME_ID_CNS_NS_ACTIVE_LIST: - return nvme_identify_nslist(n, req); + case NVME_ID_CNS_NS_ACTIVE_LIST_IOCS: + return nvme_identify_nslist(n, c->cns, c->csi, req); case NVME_ID_CNS_NS_DESCR_LIST: return nvme_identify_ns_descr_list(n, req); + case NVME_ID_CNS_IOCS: + return nvme_identify_iocs(n, c->cntid, req); default: - trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns)); + trace_pci_nvme_err_invalid_identify_cns(c->cns); return NVME_INVALID_FIELD | NVME_DNR; } } @@ -1743,7 +1857,7 @@ static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req) { uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff; - req->cqe.result = 1; + req->cqe.dw0 = 1; if (nvme_check_sqid(n, sqid)) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1928,6 +2042,9 @@ defaults: result |= NVME_INTVC_NOCOALESCING; } + break; + case NVME_COMMAND_SET_PROFILE: + result = cpu_to_le32(n->features.iocsci & 0x1ff); break; default: result = nvme_feature_default[fid]; @@ -1935,7 +2052,8 @@ defaults: } out: - req->cqe.result = cpu_to_le32(result); + req->cqe.dw0 = cpu_to_le32(result); + return NVME_SUCCESS; } @@ -1958,6 +2076,7 @@ static uint16_t nvme_set_feature_timestamp(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) { NvmeNamespace *ns = NULL; + NvmeIdNsNvm *id_ns; NvmeCmd *cmd = &req->cmd; uint32_t dw10 = le32_to_cpu(cmd->cdw10); @@ -2034,7 +2153,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) continue; } - if (NVME_ID_NS_NSFEAT_DULBE(ns->id_ns.nsfeat)) { + id_ns = nvme_ns_id_nvm(ns); + if (NVME_ID_NS_NSFEAT_DULBE(id_ns->nsfeat)) { ns->features.err_rec = dw11; } } @@ -2050,6 +2170,7 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) for (int i = 1; i <= n->num_namespaces; i++) { ns = nvme_ns(n, i); + if (!ns) { continue; } @@ -2080,14 +2201,34 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) ((dw11 >> 16) & 0xFFFF) + 1, n->params.max_ioqpairs, n->params.max_ioqpairs); - req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) | - ((n->params.max_ioqpairs - 1) << 16)); + req->cqe.dw0 = cpu_to_le32((n->params.max_ioqpairs - 1) | + ((n->params.max_ioqpairs - 1) << 16)); break; case NVME_ASYNCHRONOUS_EVENT_CONF: n->features.async_config = dw11; break; case NVME_TIMESTAMP: return nvme_set_feature_timestamp(n, req); + case NVME_COMMAND_SET_PROFILE: + if (NVME_CC_CSS(n->bar.cc) == NVME_CC_CSS_ALL) { + uint16_t iocsci = dw11 & 0x1ff; + uint64_t iocsc = n->iocscs[iocsci]; + + for (int i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + if (!ns) { + continue; + } + + if (!(iocsc & (1 << ns->iocs))) { + return NVME_IOCS_COMB_REJECTED | NVME_DNR; + } + } + + n->features.iocsci = iocsci; + } + + break; default: return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR; } @@ -2234,6 +2375,8 @@ static int nvme_start_ctrl(NvmeCtrl *n) uint32_t page_bits = NVME_CC_MPS(n->bar.cc) + 12; uint32_t page_size = 1 << page_bits; + NvmeIdCtrl *id_ctrl = &n->id_ctrl; + if (unlikely(n->cq[0])) { trace_pci_nvme_err_startfail_cq(); return -1; @@ -2273,28 +2416,28 @@ static int nvme_start_ctrl(NvmeCtrl *n) return -1; } if (unlikely(NVME_CC_IOCQES(n->bar.cc) < - NVME_CTRL_CQES_MIN(n->id_ctrl.cqes))) { + NVME_CTRL_CQES_MIN(id_ctrl->cqes))) { trace_pci_nvme_err_startfail_cqent_too_small( NVME_CC_IOCQES(n->bar.cc), NVME_CTRL_CQES_MIN(n->bar.cap)); return -1; } if (unlikely(NVME_CC_IOCQES(n->bar.cc) > - NVME_CTRL_CQES_MAX(n->id_ctrl.cqes))) { + NVME_CTRL_CQES_MAX(id_ctrl->cqes))) { trace_pci_nvme_err_startfail_cqent_too_large( NVME_CC_IOCQES(n->bar.cc), NVME_CTRL_CQES_MAX(n->bar.cap)); return -1; } if (unlikely(NVME_CC_IOSQES(n->bar.cc) < - NVME_CTRL_SQES_MIN(n->id_ctrl.sqes))) { + NVME_CTRL_SQES_MIN(id_ctrl->sqes))) { trace_pci_nvme_err_startfail_sqent_too_small( NVME_CC_IOSQES(n->bar.cc), NVME_CTRL_SQES_MIN(n->bar.cap)); return -1; } if (unlikely(NVME_CC_IOSQES(n->bar.cc) > - NVME_CTRL_SQES_MAX(n->id_ctrl.sqes))) { + NVME_CTRL_SQES_MAX(id_ctrl->sqes))) { trace_pci_nvme_err_startfail_sqent_too_large( NVME_CC_IOSQES(n->bar.cc), NVME_CTRL_SQES_MAX(n->bar.cap)); @@ -2755,6 +2898,8 @@ static void nvme_init_state(NvmeCtrl *n) n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING; n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL); n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1); + n->iocscs[0] = 1 << NVME_IOCS_NVM; + n->features.iocsci = 0; } int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) @@ -2959,7 +3104,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) NVME_CAP_SET_MQES(n->bar.cap, 0x7ff); NVME_CAP_SET_CQR(n->bar.cap, 1); NVME_CAP_SET_TO(n->bar.cap, 0xf); - NVME_CAP_SET_CSS(n->bar.cap, 1); + NVME_CAP_SET_CSS(n->bar.cap, (NVME_CAP_CSS_NVM | NVME_CAP_CSS_CSI)); NVME_CAP_SET_MPSMAX(n->bar.cap, 4); n->bar.vs = NVME_SPEC_VER; @@ -3019,6 +3164,11 @@ static void nvme_exit(PCIDevice *pci_dev) if (n->pmrdev) { host_memory_backend_set_mapped(n->pmrdev, false); } + + for (int i = 0; i < NVME_IOCS_MAX; i++) { + g_free(n->id_ctrl_iocss[i]); + } + msix_uninit_exclusive_bar(pci_dev); } diff --git a/hw/block/trace-events b/hw/block/trace-events index 774513469274..b002eb7c8a5c 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -48,10 +48,12 @@ pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d" pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16"" pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16"" +pci_nvme_identify(uint16_t cid, uint32_t nsid, uint16_t cntid, uint8_t cns, uint8_t csi, uint16_t nvmsetid) "cid %"PRIu16" nsid %"PRIu32" cntid 0x%"PRIx16" cns 0x%"PRIx8" csi 0x%"PRIx8" nvmsetid %"PRIu16"" pci_nvme_identify_ctrl(void) "identify controller" -pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32"" -pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32"" +pci_nvme_identify_ns(uint32_t ns, uint8_t csi) "nsid %"PRIu32" csi 0x%"PRIx8"" +pci_nvme_identify_nslist(uint32_t ns, uint8_t csi) "nsid %"PRIu32" csi 0x%"PRIx8"" pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32"" +pci_nvme_identify_io_cmd_set(uint16_t cid) "cid %"PRIu16"" pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64"" pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32"" pci_nvme_setfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32"" From patchwork Tue Sep 29 23:19:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 304099 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F16B1C4727C for ; Tue, 29 Sep 2020 23:29:20 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4761E20773 for ; Tue, 29 Sep 2020 23:29:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4761E20773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56158 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNP3f-0006ua-0p for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:29:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39100) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuI-0000jK-Td; Tue, 29 Sep 2020 19:19:38 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:41005) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuF-00009K-9L; Tue, 29 Sep 2020 19:19:38 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 518F5EAD; Tue, 29 Sep 2020 19:19:33 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=sAsp3uvhvuIl8 ag4Y4oHp+LMFTT42Z4taNVnJwmaTr8=; b=Wrbr9APqKqBlXDz+55KX2eWCzj8bf xoAsnoA95X5ArCABcuRA+y6UNuOEUp4Xd2HPs+4Dp20R6uWsvo0bNyJLUz1P+rMr XLOQzosP+Q6/5Max6pPe/Uc70rq4rBscmJ5SOi6Mokpws8DiGTYcMpjKguLjzhEE 7GfOrn7Qn8FOGgGmZJ/8uFrMH/lkNkkGWx/7PsENeEwqhMZT1W5KqZiykfUrfbEd W8lNyFyzSuM1fCjUhLEKIWPIex5aLkRuCt98y/2VjiDg9TT4SVDwtkxklyS5mruz DgFuxjkHiyFxK2HO5SefvaOBK1W8seuh7UQnLLw/STGm+ElRs4a0BKrFw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=sAsp3uvhvuIl8ag4Y4oHp+LMFTT42Z4taNVnJwmaTr8=; b=nJ2ETPO+ P+B/VcPDWp5XzHCPQ88IWt4yYyjxzfYrCJEOyUVzy6/0a38ojUbVimMWJMVjRwhw wMfsrLovyd+Xzi3IMO+IH26sNKrnDe7phTpgggdcZ5n2lA7191iDPj3iEmrhRTnT XZuYfFeTYdKjD5pn+EC/OpJpIJXnoJEh6vnxsjJ/X4mvG5S7bY1S4u16jxgyf5YY JJ11Z7Rs+xZCq4aACJXfTOpTGpX1pGj8GVCNKbv3NUNt+nw/dBbP58OIY5yYF3UQ BmokeqbsuzzwwsGAu7vbiGhQZUKcVKBdlPyXNzjFZ04tK1O0QhMH9O7RQhPmB2l4 YLJPtAMVWHo9Zg== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepkeduheeftefhvddvgeeiffehkeeugeehhfdugffhhffhveejjeffueehudeguddu necuffhomhgrihhnpehuthhilhhiiigrthhiohhnrdhmrghpnecukfhppeektddrudeije drleekrdduledtnecuvehluhhsthgvrhfuihiivgepudenucfrrghrrghmpehmrghilhhf rhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 274FA328005A; Tue, 29 Sep 2020 19:19:31 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 09/14] hw/block/nvme: add basic read/write for zoned namespaces Date: Wed, 30 Sep 2020 01:19:12 +0200 Message-Id: <20200929231917.433586-10-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen This adds basic read and write for zoned namespaces. A zoned namespace is created by setting the iocs namespace parameter to 0x2 and specifying the zns.zcap parameter (zone capacity) in number of logical blocks per zone. If a zone size (zns.zsze) is not specified, the namespace device will set the zone size to be the next power of two and fit in as many zones as possible on the underlying namespace blockdev. This behavior is not required by the specification, but ensures that the device can be initialized by the Linux kernel nvme driver, which requires a power of two zone size. If the namespace has an associated 'pstate' blockdev it will be used to store the zone states persistently. Only zone state changes are persisted, that is, zone write pointer updates are only persistent if the zone is explicitly closed. On boot up, any zones that were in an Opened state will be transitioned to Full. Signed-off-by: Klaus Jensen --- docs/specs/nvme.txt | 7 ++ hw/block/nvme-ns.h | 116 +++++++++++++++++++- include/block/nvme.h | 59 ++++++++++- hw/block/nvme-ns.c | 184 +++++++++++++++++++++++++++++++- hw/block/nvme.c | 239 +++++++++++++++++++++++++++++++++++++++++- hw/block/trace-events | 8 ++ 6 files changed, 604 insertions(+), 9 deletions(-) diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt index a13c7a5dbe86..b23e59dd3075 100644 --- a/docs/specs/nvme.txt +++ b/docs/specs/nvme.txt @@ -34,6 +34,13 @@ nvme-ns Options values (such as lbads) differs from those specified for the nvme-ns device an error is produced. + `zns.zcap`; If `iocs` is 0x2, this specifies the zone capacity. It is + specified in units of logical blocks. + + `zns.zsze`; If `iocs` is 0x2, this specifies the zone size. It is specified + in units of the logical blocks. If not specified, the value depends on + zns.zcap; if the zone capacity is a power of two, the zone size will be + set to that, otherwise it will default to the next power of two. Reference Specifications ------------------------ diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index d39fa79fa682..e16f50dc4bb8 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -31,15 +31,34 @@ typedef struct NvmePstateHeader { uint8_t lbads; uint8_t iocs; - uint8_t rsvd18[4078]; + uint8_t rsvd18[3054]; + + /* offset 0xc00 */ + struct { + uint64_t zcap; + uint64_t zsze; + } QEMU_PACKED zns; + + uint8_t rsvd3088[1008]; } QEMU_PACKED NvmePstateHeader; typedef struct NvmeNamespaceParams { uint32_t nsid; uint8_t iocs; uint8_t lbads; + + struct { + uint64_t zcap; + uint64_t zsze; + } zns; } NvmeNamespaceParams; +typedef struct NvmeZone { + NvmeZoneDescriptor *zd; + + uint64_t wp_staging; +} NvmeZone; + typedef struct NvmeNamespace { DeviceState parent_obj; BlockConf blkconf; @@ -55,6 +74,10 @@ typedef struct NvmeNamespace { unsigned long *map; int64_t offset; } utilization; + + struct { + int64_t offset; + } zns; } pstate; NvmeNamespaceParams params; @@ -62,8 +85,20 @@ typedef struct NvmeNamespace { struct { uint32_t err_rec; } features; + + struct { + int num_zones; + + NvmeZone *zones; + NvmeZoneDescriptor *zd; + } zns; } NvmeNamespace; +static inline bool nvme_ns_zoned(NvmeNamespace *ns) +{ + return ns->iocs == NVME_IOCS_ZONED; +} + static inline uint32_t nvme_nsid(NvmeNamespace *ns) { if (ns) { @@ -78,17 +113,39 @@ static inline NvmeIdNsNvm *nvme_ns_id_nvm(NvmeNamespace *ns) return ns->id_ns[NVME_IOCS_NVM]; } +static inline NvmeIdNsZns *nvme_ns_id_zoned(NvmeNamespace *ns) +{ + return ns->id_ns[NVME_IOCS_ZONED]; +} + static inline NvmeLBAF *nvme_ns_lbaf(NvmeNamespace *ns) { NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); return &id_ns->lbaf[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)]; } +static inline NvmeLBAFE *nvme_ns_lbafe(NvmeNamespace *ns) +{ + NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); + NvmeIdNsZns *id_ns_zns = nvme_ns_id_zoned(ns); + return &id_ns_zns->lbafe[NVME_ID_NS_FLBAS_INDEX(id_ns->flbas)]; +} + static inline uint8_t nvme_ns_lbads(NvmeNamespace *ns) { return nvme_ns_lbaf(ns)->ds; } +static inline uint64_t nvme_ns_zsze(NvmeNamespace *ns) +{ + return nvme_ns_lbafe(ns)->zsze; +} + +static inline uint64_t nvme_ns_zsze_bytes(NvmeNamespace *ns) +{ + return nvme_ns_zsze(ns) << nvme_ns_lbads(ns); +} + /* calculate the number of LBAs that the namespace can accomodate */ static inline uint64_t nvme_ns_nlbas(NvmeNamespace *ns) { @@ -101,12 +158,69 @@ static inline size_t nvme_l2b(NvmeNamespace *ns, uint64_t lba) return lba << nvme_ns_lbads(ns); } +static inline int nvme_ns_zone_idx(NvmeNamespace *ns, uint64_t lba) +{ + return lba / nvme_ns_zsze(ns); +} + +static inline NvmeZone *nvme_ns_get_zone(NvmeNamespace *ns, uint64_t lba) +{ + int idx = nvme_ns_zone_idx(ns, lba); + if (unlikely(idx >= ns->zns.num_zones)) { + return NULL; + } + + return &ns->zns.zones[idx]; +} + +static inline NvmeZoneState nvme_zs(NvmeZone *zone) +{ + return (zone->zd->zs >> 4) & 0xf; +} + +static inline void nvme_zs_set(NvmeZone *zone, NvmeZoneState zs) +{ + zone->zd->zs = zs << 4; +} + +static inline bool nvme_ns_zone_wp_valid(NvmeZone *zone) +{ + switch (nvme_zs(zone)) { + case NVME_ZS_ZSF: + case NVME_ZS_ZSRO: + case NVME_ZS_ZSO: + return false; + default: + return false; + } +} + +static inline uint64_t nvme_zslba(NvmeZone *zone) +{ + return le64_to_cpu(zone->zd->zslba); +} + +static inline uint64_t nvme_zcap(NvmeZone *zone) +{ + return le64_to_cpu(zone->zd->zcap); +} + +static inline uint64_t nvme_wp(NvmeZone *zone) +{ + return le64_to_cpu(zone->zd->wp); +} + typedef struct NvmeCtrl NvmeCtrl; +const char *nvme_zs_str(NvmeZone *zone); +const char *nvme_zs_to_str(NvmeZoneState zs); + int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp); void nvme_ns_drain(NvmeNamespace *ns); void nvme_ns_flush(NvmeNamespace *ns); +void nvme_ns_zns_init_zone_state(NvmeNamespace *ns); + static inline void _nvme_ns_check_size(void) { QEMU_BUILD_BUG_ON(sizeof(NvmePstateHeader) != 4096); diff --git a/include/block/nvme.h b/include/block/nvme.h index 792fccf8c81f..2e523c9d97b4 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -390,8 +390,9 @@ enum NvmePmrmscMask { (pmrmsc |= (uint64_t)(val & PMRMSC_CBA_MASK) << PMRMSC_CBA_SHIFT) enum NvmeCommandSet { - NVME_IOCS_NVM = 0x0, - NVME_IOCS_MAX = 0x1, + NVME_IOCS_NVM = 0x0, + NVME_IOCS_ZONED = 0x2, + NVME_IOCS_MAX = 0x3, }; enum NvmeSglDescriptorType { @@ -703,6 +704,11 @@ enum NvmeStatusCodes { NVME_CONFLICTING_ATTRS = 0x0180, NVME_INVALID_PROT_INFO = 0x0181, NVME_WRITE_TO_RO = 0x0182, + NVME_ZONE_BOUNDARY_ERROR = 0x01b8, + NVME_ZONE_IS_FULL = 0x01b9, + NVME_ZONE_IS_READ_ONLY = 0x01ba, + NVME_ZONE_IS_OFFLINE = 0x01bb, + NVME_ZONE_INVALID_WRITE = 0x01bc, NVME_WRITE_FAULT = 0x0280, NVME_UNRECOVERED_READ = 0x0281, NVME_E2E_GUARD_ERROR = 0x0282, @@ -781,6 +787,31 @@ enum { NVME_EFFECTS_UUID_SEL = 1 << 19, }; +typedef enum NvmeZoneType { + NVME_ZT_SEQ = 0x2, +} NvmeZoneType; + +typedef enum NvmeZoneState { + NVME_ZS_ZSE = 0x1, + NVME_ZS_ZSIO = 0x2, + NVME_ZS_ZSEO = 0x3, + NVME_ZS_ZSC = 0x4, + NVME_ZS_ZSRO = 0xd, + NVME_ZS_ZSF = 0xe, + NVME_ZS_ZSO = 0xf, +} NvmeZoneState; + +typedef struct QEMU_PACKED NvmeZoneDescriptor { + uint8_t zt; + uint8_t zs; + uint8_t za; + uint8_t rsvd3[5]; + uint64_t zcap; + uint64_t zslba; + uint64_t wp; + uint8_t rsvd32[32]; +} NvmeZoneDescriptor; + enum NvmeSmartWarn { NVME_SMART_SPARE = 1 << 0, NVME_SMART_TEMPERATURE = 1 << 1, @@ -794,6 +825,7 @@ enum NvmeLogIdentifier { NVME_LOG_SMART_INFO = 0x02, NVME_LOG_FW_SLOT_INFO = 0x03, NVME_LOG_EFFECTS = 0x05, + NVME_LOG_CHANGED_ZONE_LIST = 0xbf, }; typedef struct QEMU_PACKED NvmePSD { @@ -1099,9 +1131,27 @@ enum NvmeIdNsDps { DPS_FIRST_EIGHT = 8, }; +typedef struct QEMU_PACKED NvmeLBAFE { + uint64_t zsze; + uint8_t zdes; + uint8_t rsvd9[7]; +} NvmeLBAFE; + +typedef struct QEMU_PACKED NvmeIdNsZns { + uint16_t zoc; + uint16_t ozcs; + uint32_t mar; + uint32_t mor; + uint32_t rrl; + uint32_t frl; + uint8_t rsvd20[2796]; + NvmeLBAFE lbafe[16]; + uint8_t rsvd3072[768]; + uint8_t vs[256]; +} NvmeIdNsZns; + static inline void _nvme_check_size(void) { - QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 4); QEMU_BUILD_BUG_ON(sizeof(NvmeCqe) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeDsmRange) != 16); @@ -1118,8 +1168,11 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsNvm) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsZns) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4); QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeZoneDescriptor) != 64); + QEMU_BUILD_BUG_ON(sizeof(NvmeLBAFE) != 16); } #endif diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 9eabab29f32c..ba32ea6d1326 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -30,6 +30,26 @@ #include "nvme.h" #include "nvme-ns.h" +const char *nvme_zs_str(NvmeZone *zone) +{ + return nvme_zs_to_str(nvme_zs(zone)); +} + +const char *nvme_zs_to_str(NvmeZoneState zs) +{ + switch (zs) { + case NVME_ZS_ZSE: return "ZSE"; + case NVME_ZS_ZSIO: return "ZSIO"; + case NVME_ZS_ZSEO: return "ZSEO"; + case NVME_ZS_ZSC: return "ZSC"; + case NVME_ZS_ZSRO: return "ZSRO"; + case NVME_ZS_ZSF: return "ZSF"; + case NVME_ZS_ZSO: return "ZSO"; + } + + return NULL; +} + static int nvme_blk_truncate(BlockBackend *blk, size_t len, Error **errp) { int ret; @@ -55,6 +75,47 @@ static int nvme_blk_truncate(BlockBackend *blk, size_t len, Error **errp) return 0; } +static void nvme_ns_zns_init_zones(NvmeNamespace *ns) +{ + NvmeZone *zone; + NvmeZoneDescriptor *zd; + uint64_t zslba, zsze = nvme_ns_zsze(ns); + + for (int i = 0; i < ns->zns.num_zones; i++) { + zslba = i * zsze; + + zone = &ns->zns.zones[i]; + zone->zd = &ns->zns.zd[i]; + zone->wp_staging = zslba; + + zd = zone->zd; + zd->zt = NVME_ZT_SEQ; + zd->zcap = cpu_to_le64(ns->params.zns.zcap); + zd->wp = zd->zslba = cpu_to_le64(zslba); + + nvme_zs_set(zone, NVME_ZS_ZSE); + } +} + +static void nvme_ns_init_zoned(NvmeNamespace *ns) +{ + NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); + NvmeIdNsZns *id_ns_zns = nvme_ns_id_zoned(ns); + + for (int i = 0; i <= id_ns->nlbaf; i++) { + id_ns_zns->lbafe[i].zsze = ns->params.zns.zsze ? + cpu_to_le64(ns->params.zns.zsze) : + cpu_to_le64(pow2ceil(ns->params.zns.zcap)); + } + + ns->zns.num_zones = nvme_ns_nlbas(ns) / nvme_ns_zsze(ns); + ns->zns.zones = g_malloc0_n(ns->zns.num_zones, sizeof(NvmeZone)); + ns->zns.zd = g_malloc0_n(ns->zns.num_zones, sizeof(NvmeZoneDescriptor)); + + id_ns_zns->mar = 0xffffffff; + id_ns_zns->mor = 0xffffffff; +} + static void nvme_ns_init(NvmeNamespace *ns) { NvmeIdNsNvm *id_ns; @@ -72,6 +133,11 @@ static void nvme_ns_init(NvmeNamespace *ns) id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns)); + if (nvme_ns_zoned(ns)) { + ns->id_ns[NVME_IOCS_ZONED] = g_new0(NvmeIdNsZns, 1); + nvme_ns_init_zoned(ns); + } + /* no thin provisioning */ id_ns->ncap = id_ns->nsze; id_ns->nuse = id_ns->ncap; @@ -82,7 +148,7 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) BlockBackend *blk = ns->pstate.blk; NvmePstateHeader header; uint64_t nlbas = nvme_ns_nlbas(ns); - size_t bitmap_len, pstate_len; + size_t bitmap_len, pstate_len, zd_len = 0; int ret; ret = nvme_blk_truncate(blk, sizeof(NvmePstateHeader), errp); @@ -98,6 +164,14 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) .iocs = ns->params.iocs, }; + if (nvme_ns_zoned(ns)) { + /* zns specific; offset 0xc00 */ + header.zns.zcap = cpu_to_le64(ns->params.zns.zcap); + header.zns.zsze = ns->params.zns.zsze ? + cpu_to_le64(ns->params.zns.zsze) : + cpu_to_le64(pow2ceil(ns->params.zns.zcap)); + } + ret = blk_pwrite(blk, 0, &header, sizeof(header), 0); if (ret < 0) { error_setg_errno(errp, -ret, "could not write pstate header"); @@ -105,7 +179,10 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) } bitmap_len = DIV_ROUND_UP(nlbas, sizeof(unsigned long)); - pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len, + if (nvme_ns_zoned(ns)) { + zd_len = ns->zns.num_zones * sizeof(NvmeZoneDescriptor); + } + pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len + zd_len, BDRV_SECTOR_SIZE); ret = nvme_blk_truncate(blk, pstate_len, errp); @@ -115,15 +192,58 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) ns->pstate.utilization.map = bitmap_new(nlbas); + if (zd_len) { + ns->pstate.zns.offset = ns->pstate.utilization.offset + bitmap_len, + + nvme_ns_zns_init_zones(ns); + + ret = blk_pwrite(blk, ns->pstate.zns.offset, ns->zns.zd, zd_len, 0); + if (ret < 0) { + error_setg_errno(errp, -ret, + "could not write zone descriptors to pstate"); + return ret; + } + } + return 0; } +void nvme_ns_zns_init_zone_state(NvmeNamespace *ns) +{ + for (int i = 0; i < ns->zns.num_zones; i++) { + NvmeZone *zone = &ns->zns.zones[i]; + zone->zd = &ns->zns.zd[i]; + + zone->wp_staging = nvme_wp(zone); + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSE: + case NVME_ZS_ZSF: + case NVME_ZS_ZSRO: + case NVME_ZS_ZSO: + continue; + + case NVME_ZS_ZSC: + if (nvme_wp(zone) == nvme_zslba(zone)) { + nvme_zs_set(zone, NVME_ZS_ZSE); + } + + continue; + + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + zone->zd->wp = zone->zd->zslba; + nvme_zs_set(zone, NVME_ZS_ZSF); + } + } +} + static int nvme_ns_pstate_load(NvmeNamespace *ns, size_t len, Error **errp) { BlockBackend *blk = ns->pstate.blk; NvmePstateHeader header; uint64_t nlbas = nvme_ns_nlbas(ns); - size_t bitmap_len, pstate_len; + size_t bitmap_len, pstate_len, zd_len = 0; unsigned long *map; int ret; @@ -160,8 +280,25 @@ static int nvme_ns_pstate_load(NvmeNamespace *ns, size_t len, Error **errp) return -1; } + if (header.zns.zcap != ns->params.zns.zcap) { + error_setg(errp, "zns.zcap parameter inconsistent with pstate " + "(pstate %"PRIu64"; parameter %"PRIu64")", + header.zns.zcap, ns->params.zns.zcap); + return -1; + } + + if (ns->params.zns.zsze && header.zns.zsze != ns->params.zns.zsze) { + error_setg(errp, "zns.zsze parameter inconsistent with pstate " + "(pstate %"PRIu64"; parameter %"PRIu64")", + header.zns.zsze, ns->params.zns.zsze); + return -1; + } + bitmap_len = DIV_ROUND_UP(nlbas, sizeof(unsigned long)); - pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len, + if (nvme_ns_zoned(ns)) { + zd_len = ns->zns.num_zones * sizeof(NvmeZoneDescriptor); + } + pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len + zd_len, BDRV_SECTOR_SIZE); if (len != pstate_len) { @@ -188,6 +325,27 @@ static int nvme_ns_pstate_load(NvmeNamespace *ns, size_t len, Error **errp) ns->pstate.utilization.map = map; #endif + if (zd_len) { + ns->pstate.zns.offset = ns->pstate.utilization.offset + bitmap_len, + + ret = blk_pread(blk, ns->pstate.zns.offset, ns->zns.zd, zd_len); + if (ret < 0) { + error_setg_errno(errp, -ret, + "could not read zone descriptors from pstate"); + return ret; + } + + nvme_ns_zns_init_zone_state(ns); + + ret = blk_pwrite(blk, ns->pstate.utilization.offset + bitmap_len, + ns->zns.zd, zd_len, 0); + if (ret < 0) { + error_setg_errno(errp, -ret, + "could not write zone descriptors to pstate"); + return ret; + } + } + return 0; } @@ -262,6 +420,20 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp) switch (ns->params.iocs) { case NVME_IOCS_NVM: break; + + case NVME_IOCS_ZONED: + if (!ns->params.zns.zcap) { + error_setg(errp, "zns.zcap must be specified"); + return -1; + } + + if (ns->params.zns.zsze && ns->params.zns.zsze < ns->params.zns.zcap) { + error_setg(errp, "zns.zsze cannot be less than zns.zcap"); + return -1; + } + + break; + default: error_setg(errp, "unsupported iocs"); return -1; @@ -293,6 +465,8 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) */ NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); id_ns->nsfeat |= 0x4; + } else if (nvme_ns_zoned(ns)) { + nvme_ns_zns_init_zones(ns); } if (nvme_register_namespace(n, ns, errp)) { @@ -340,6 +514,8 @@ static Property nvme_ns_props[] = { DEFINE_PROP_UINT8("lbads", NvmeNamespace, params.lbads, BDRV_SECTOR_BITS), DEFINE_PROP_DRIVE("pstate", NvmeNamespace, pstate.blk), DEFINE_PROP_UINT8("iocs", NvmeNamespace, params.iocs, NVME_IOCS_NVM), + DEFINE_PROP_UINT64("zns.zcap", NvmeNamespace, params.zns.zcap, 0), + DEFINE_PROP_UINT64("zns.zsze", NvmeNamespace, params.zns.zsze, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 53cb8c2ed4a7..0c7d269c1b5a 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -155,6 +155,16 @@ static const NvmeEffectsLog nvme_effects[NVME_IOCS_MAX] = { NVME_EFFECTS_NVM_INITIALIZER, }, }, + + [NVME_IOCS_ZONED] = { + .acs = { + NVME_EFFECTS_ADMIN_INITIALIZER, + }, + + .iocs = { + NVME_EFFECTS_NVM_INITIALIZER, + }, + }, }; static void nvme_process_sq(void *opaque); @@ -911,6 +921,112 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type) } } +static uint16_t nvme_check_zone_readable(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeZoneState zs = nvme_zs(zone); + uint64_t zslba = nvme_zslba(zone); + + if (zs == NVME_ZS_ZSO) { + trace_pci_nvme_err_invalid_zone_condition(nvme_cid(req), zslba, + NVME_ZS_ZSO); + return NVME_ZONE_IS_OFFLINE | NVME_DNR; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_check_zone_read(NvmeCtrl *n, uint64_t slba, uint32_t nlb, + NvmeRequest *req, NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint64_t zslba = nvme_zslba(zone); + uint64_t zsze = nvme_ns_zsze(ns); + uint16_t status; + + status = nvme_check_zone_readable(n, req, zone); + if (status) { + return status; + } + + if ((slba + nlb) > (zslba + zsze)) { + trace_pci_nvme_err_zone_boundary(nvme_cid(req), slba, nlb, zsze); + return NVME_ZONE_BOUNDARY_ERROR | NVME_DNR; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_check_zone_writeable(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeZoneState zs = nvme_zs(zone); + uint64_t zslba = nvme_zslba(zone); + + if (zs == NVME_ZS_ZSO) { + trace_pci_nvme_err_invalid_zone_condition(nvme_cid(req), zslba, + NVME_ZS_ZSO); + return NVME_ZONE_IS_OFFLINE | NVME_DNR; + } + + switch (zs) { + case NVME_ZS_ZSE: + case NVME_ZS_ZSC: + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + return NVME_SUCCESS; + case NVME_ZS_ZSF: + trace_pci_nvme_err_zone_is_full(nvme_cid(req), zslba); + return NVME_ZONE_IS_FULL | NVME_DNR; + case NVME_ZS_ZSRO: + trace_pci_nvme_err_zone_is_read_only(nvme_cid(req), zslba); + return NVME_ZONE_IS_READ_ONLY | NVME_DNR; + default: + break; + } + + trace_pci_nvme_err_invalid_zone_condition(nvme_cid(req), zslba, zs); + return NVME_INTERNAL_DEV_ERROR | NVME_DNR; +} + +static uint16_t nvme_check_zone_write(NvmeCtrl *n, uint64_t slba, uint32_t nlb, + NvmeRequest *req, NvmeZone *zone) +{ + uint64_t zslba, wp, zcap; + uint16_t status; + + zslba = nvme_zslba(zone); + wp = zone->wp_staging; + zcap = nvme_zcap(zone); + + status = nvme_check_zone_writeable(n, req, zone); + if (status) { + return status; + } + + if ((wp - zslba) + nlb > zcap) { + trace_pci_nvme_err_zone_boundary(nvme_cid(req), slba, nlb, zcap); + return NVME_ZONE_BOUNDARY_ERROR | NVME_DNR; + } + + if (slba != wp) { + trace_pci_nvme_err_zone_invalid_write(nvme_cid(req), slba, wp); + return NVME_ZONE_INVALID_WRITE | NVME_DNR; + } + + return NVME_SUCCESS; +} + +static inline uint16_t nvme_check_zone(NvmeCtrl *n, uint64_t slba, + uint32_t nlb, NvmeRequest *req, + NvmeZone *zone) { + if (nvme_req_is_write(req)) { + return nvme_check_zone_write(n, slba, nlb, req, zone); + } + + return nvme_check_zone_read(n, slba, nlb, req, zone); +} + static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len) { uint8_t mdts = n->params.mdts; @@ -1006,6 +1122,42 @@ static int nvme_allocate(NvmeNamespace *ns, uint64_t slba, uint32_t nlb) return ret; } +static int nvme_zns_commit_zone(NvmeNamespace *ns, NvmeZone *zone) +{ + uint64_t zslba; + int64_t offset; + + if (!ns->pstate.blk) { + return 0; + } + + trace_pci_nvme_zns_commit_zone(nvme_nsid(ns), nvme_zslba(zone)); + + zslba = nvme_zslba(zone); + offset = ns->pstate.zns.offset + + nvme_ns_zone_idx(ns, zslba) * sizeof(NvmeZoneDescriptor); + + return blk_pwrite(ns->pstate.blk, offset, zone->zd, + sizeof(NvmeZoneDescriptor), 0); +} + +static void nvme_zns_advance_wp(NvmeRequest *req) +{ + NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; + uint64_t slba = le64_to_cpu(rw->slba); + uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; + NvmeZone *zone = nvme_ns_get_zone(req->ns, slba); + uint64_t wp = nvme_wp(zone); + + wp += nlb; + zone->zd->wp = cpu_to_le64(wp); + if (wp == nvme_zslba(zone) + nvme_zcap(zone)) { + nvme_zs_set(zone, NVME_ZS_ZSF); + if (nvme_zns_commit_zone(req->ns, zone) < 0) { + req->status = NVME_INTERNAL_DEV_ERROR; + } + } +} static void nvme_rw_cb(void *opaque, int ret) { @@ -1022,6 +1174,10 @@ static void nvme_rw_cb(void *opaque, int ret) if (!ret) { block_acct_done(stats, acct); + + if (nvme_ns_zoned(ns) && nvme_req_is_write(req)) { + nvme_zns_advance_wp(req); + } } else { uint16_t status; @@ -1047,6 +1203,26 @@ static void nvme_rw_cb(void *opaque, int ret) error_report_err(local_err); req->status = status; + + if (nvme_ns_zoned(ns)) { + NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; + uint64_t slba = le64_to_cpu(rw->slba); + + NvmeZone *zone = nvme_ns_get_zone(ns, slba); + + /* + * Transition the zone to read-only on write fault and offline + * on unrecovered read. + */ + NvmeZoneState zs = status == NVME_WRITE_FAULT ? + NVME_ZS_ZSRO : NVME_ZS_ZSO; + + nvme_zs_set(zone, zs); + + if (nvme_zns_commit_zone(ns, zone) < 0) { + req->status = NVME_INTERNAL_DEV_ERROR; + } + } } nvme_enqueue_req_completion(nvme_cq(req), req); @@ -1120,6 +1296,7 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; NvmeNamespace *ns = req->ns; + NvmeZone *zone = NULL; uint64_t slba = le64_to_cpu(rw->slba); uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; @@ -1138,6 +1315,20 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) goto invalid; } + if (nvme_ns_zoned(ns)) { + zone = nvme_ns_get_zone(ns, slba); + if (!zone) { + trace_pci_nvme_err_invalid_zone(nvme_cid(req), slba); + status = NVME_INVALID_FIELD | NVME_DNR; + goto invalid; + } + + status = nvme_check_zone(n, slba, nlb, req, zone); + if (status) { + goto invalid; + } + } + if (!is_write) { if (NVME_ERR_REC_DULBE(ns->features.err_rec)) { status = nvme_check_dulbe(ns, slba, nlb); @@ -1162,6 +1353,31 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) } if (is_write) { + if (zone) { + if (zone->wp_staging != nvme_wp(zone)) { + trace_pci_nvme_err_zone_pending_writes(nvme_cid(req), + nvme_zslba(zone), + nvme_wp(zone), + zone->wp_staging); + } + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSE: + case NVME_ZS_ZSC: + nvme_zs_set(zone, NVME_ZS_ZSIO); + + if (nvme_zns_commit_zone(req->ns, zone) < 0) { + status = NVME_INTERNAL_DEV_ERROR; + goto invalid; + } + + default: + break; + } + + zone->wp_staging += nlb; + } + if (nvme_allocate(ns, slba, nlb) < 0) { status = NVME_INTERNAL_DEV_ERROR; goto invalid; @@ -2365,6 +2581,24 @@ static void nvme_clear_ctrl(NvmeCtrl *n) } nvme_ns_flush(ns); + + if (nvme_ns_zoned(ns)) { + for (int i = 0; i < ns->zns.num_zones; i++) { + NvmeZone *zone = &ns->zns.zones[i]; + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + nvme_zs_set(zone, NVME_ZS_ZSC); + nvme_zns_commit_zone(ns, zone); + break; + default: + continue; + } + } + + nvme_ns_zns_init_zone_state(ns); + } } n->bar.cc = 0; @@ -2898,7 +3132,7 @@ static void nvme_init_state(NvmeCtrl *n) n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING; n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL); n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1); - n->iocscs[0] = 1 << NVME_IOCS_NVM; + n->iocscs[0] = (1 << NVME_IOCS_NVM) | (1 << NVME_IOCS_ZONED); n->features.iocsci = 0; } @@ -3049,6 +3283,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) uint8_t *pci_conf = pci_dev->config; char *subnqn; + n->id_ctrl_iocss[NVME_IOCS_NVM] = g_new0(NvmeIdCtrl, 1); + n->id_ctrl_iocss[NVME_IOCS_ZONED] = g_new0(NvmeIdCtrl, 1); + id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID)); id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID)); strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' '); diff --git a/hw/block/trace-events b/hw/block/trace-events index b002eb7c8a5c..d46a7a4942bb 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -72,6 +72,7 @@ pci_nvme_enqueue_event_noqueue(int queued) "queued %d" pci_nvme_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8"" pci_nvme_no_outstanding_aers(void) "ignoring event; no outstanding AERs" pci_nvme_enqueue_req_completion(uint16_t cid, uint16_t cqid, uint16_t status) "cid %"PRIu16" cqid %"PRIu16" status 0x%"PRIx16"" +pci_nvme_zns_commit_zone(uint32_t nsid, uint64_t zslba) "nsid 0x%"PRIx32" zslba 0x%"PRIx64"" pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64"" pci_nvme_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_head %"PRIu16"" @@ -97,6 +98,11 @@ pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_err_addr_write(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_err_cfs(void) "controller fatal status" pci_nvme_err_aio(uint16_t cid, const char *errname, uint16_t status) "cid %"PRIu16" err '%s' status 0x%"PRIx16"" +pci_nvme_err_zone_is_full(uint16_t cid, uint64_t slba) "cid %"PRIu16" lba 0x%"PRIx64"" +pci_nvme_err_zone_is_read_only(uint16_t cid, uint64_t slba) "cid %"PRIu16" lba 0x%"PRIx64"" +pci_nvme_err_zone_invalid_write(uint16_t cid, uint64_t slba, uint64_t wp) "cid %"PRIu16" lba 0x%"PRIx64" wp 0x%"PRIx64"" +pci_nvme_err_zone_boundary(uint16_t cid, uint64_t slba, uint32_t nlb, uint64_t zcap) "cid %"PRIu16" lba 0x%"PRIx64" nlb %"PRIu32" zcap 0x%"PRIx64"" +pci_nvme_err_zone_pending_writes(uint16_t cid, uint64_t zslba, uint64_t wp, uint64_t wp_staging) "cid %"PRIu16" zslba 0x%"PRIx64" wp 0x%"PRIx64" wp_staging 0x%"PRIx64"" pci_nvme_err_invalid_sgld(uint16_t cid, uint8_t typ) "cid %"PRIu16" type 0x%"PRIx8"" pci_nvme_err_invalid_num_sgld(uint16_t cid, uint8_t typ) "cid %"PRIu16" type 0x%"PRIx8"" pci_nvme_err_invalid_sgl_excess_length(uint16_t cid) "cid %"PRIu16"" @@ -125,6 +131,8 @@ pci_nvme_err_invalid_identify_cns(uint16_t cns) "identify, invalid cns=0x%"PRIx1 pci_nvme_err_invalid_getfeat(int dw10) "invalid get features, dw10=0x%"PRIx32"" pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx32"" pci_nvme_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 0x%"PRIx16"" +pci_nvme_err_invalid_zone(uint16_t cid, uint64_t lba) "cid %"PRIu16" lba 0x%"PRIx64"" +pci_nvme_err_invalid_zone_condition(uint16_t cid, uint64_t zslba, uint8_t condition) "cid %"PRIu16" zslba 0x%"PRIx64" condition 0x%"PRIx8"" pci_nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues" pci_nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues" pci_nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null" From patchwork Tue Sep 29 23:19:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 272423 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38A40C4727C for ; Tue, 29 Sep 2020 23:37:20 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 67F7E20773 for ; Tue, 29 Sep 2020 23:37:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 67F7E20773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:37754 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNPBO-00037w-03 for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:37:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39104) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuJ-0000k8-PT; Tue, 29 Sep 2020 19:19:39 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:47833) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuG-00009V-81; Tue, 29 Sep 2020 19:19:39 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 5387FEB6; Tue, 29 Sep 2020 19:19:34 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=alqmpXbnFEBK9 CpTS+wUokETkajaAEX9UfR5hF9sq50=; b=hp5z9JMoPmC4H1XpCr3xVpiDYhtN3 tZ5VXUmtH8oGkapl6sdkT6LxaFCXJJDQM80rerfBi9bzUEPqRAfLX3HIEGWaXwyX SRI8vVmm1Sb/9RQCLGNtvOf6eClN7Z97wRsmcbiXwDkHyo7240Sugjmi/hcKb5eb h4REFwrmc2x0h/xPjmZV07FPFr/VHYNMLFbZmc9avxzFz0wj8RqjpYM2GLborXbP gl+wkGLNDj99j2kkjdvdvPjTs9V+ooCuH+63M1UuQ5e+Ohw827kAa1Y2RCZTat28 8vMV99WaBy4n/PNraqgdcN8uck3/BxB63X1Ya7jUA2ZJgK2mfBFIwU3WQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=alqmpXbnFEBK9CpTS+wUokETkajaAEX9UfR5hF9sq50=; b=IcrKP+Y/ ap6PiB3DQWUHEO1hzjA5q1Eqf5xf5CiUzmErTRhuV/m40RqShKvsHgc6Tog5ZDB2 AiXuX4kjrerOwZmFhwnkqF2oUBo+dF8/hFuUKg7y+EDiccwLk+xm2MT+2N34ZYM7 GNncfTGEqEk9Bu/1BC7lQtUXOPm6HBpI+hQfk1SDPdvaYylORW0dSpzJ9/kpmnyP lAXx/7HL+p0t8a58qgk+qEtanTXVH3vdLFNMOQAmJ9/QXDZTjJwfatGVdsPqiYne zOt0wTjpPlGWZDb8h+hGV+iy8/ELNJ0t8WfDJ/p+1gXlRSweG4vK0/q5Y2gROCZF AgURSK9NC+dy/Q== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepueelteegieeuhffgkeefgfevjeeigfetkeeitdfgtdeifefhtdfhfeeuffevgfek necukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihiivgepheenuc frrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 6CAFF328005D; Tue, 29 Sep 2020 19:19:32 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 10/14] hw/block/nvme: add the zone management receive command Date: Wed, 30 Sep 2020 01:19:13 +0200 Message-Id: <20200929231917.433586-11-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Add the Zone Management Receive command. Signed-off-by: Klaus Jensen --- hw/block/nvme-ns.h | 11 +++- hw/block/nvme.h | 1 + include/block/nvme.h | 46 ++++++++++++++ hw/block/nvme-ns.c | 49 ++++++++++++--- hw/block/nvme.c | 135 ++++++++++++++++++++++++++++++++++++++++++ hw/block/trace-events | 1 + 6 files changed, 233 insertions(+), 10 deletions(-) diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index e16f50dc4bb8..82cb0b0bce82 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -37,9 +37,10 @@ typedef struct NvmePstateHeader { struct { uint64_t zcap; uint64_t zsze; + uint8_t zdes; } QEMU_PACKED zns; - uint8_t rsvd3088[1008]; + uint8_t rsvd3089[1007]; } QEMU_PACKED NvmePstateHeader; typedef struct NvmeNamespaceParams { @@ -50,11 +51,13 @@ typedef struct NvmeNamespaceParams { struct { uint64_t zcap; uint64_t zsze; + uint8_t zdes; } zns; } NvmeNamespaceParams; typedef struct NvmeZone { NvmeZoneDescriptor *zd; + uint8_t *zde; uint64_t wp_staging; } NvmeZone; @@ -91,6 +94,7 @@ typedef struct NvmeNamespace { NvmeZone *zones; NvmeZoneDescriptor *zd; + uint8_t *zde; } zns; } NvmeNamespace; @@ -183,6 +187,11 @@ static inline void nvme_zs_set(NvmeZone *zone, NvmeZoneState zs) zone->zd->zs = zs << 4; } +static inline size_t nvme_ns_zdes_bytes(NvmeNamespace *ns) +{ + return ns->params.zns.zdes << 6; +} + static inline bool nvme_ns_zone_wp_valid(NvmeZone *zone) { switch (nvme_zs(zone)) { diff --git a/hw/block/nvme.h b/hw/block/nvme.h index f66ed9ab7eff..523eef0bcad8 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -71,6 +71,7 @@ static inline const char *nvme_io_opc_str(uint8_t opc) case NVME_CMD_WRITE: return "NVME_NVM_CMD_WRITE"; case NVME_CMD_READ: return "NVME_NVM_CMD_READ"; case NVME_CMD_WRITE_ZEROES: return "NVME_NVM_CMD_WRITE_ZEROES"; + case NVME_CMD_ZONE_MGMT_RECV: return "NVME_ZONED_CMD_ZONE_MGMT_RECV"; default: return "NVME_NVM_CMD_UNKNOWN"; } } diff --git a/include/block/nvme.h b/include/block/nvme.h index 2e523c9d97b4..9bacf48ee9e9 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -481,6 +481,7 @@ enum NvmeIoCommands { NVME_CMD_COMPARE = 0x05, NVME_CMD_WRITE_ZEROES = 0x08, NVME_CMD_DSM = 0x09, + NVME_CMD_ZONE_MGMT_RECV = 0x7a, }; typedef struct QEMU_PACKED NvmeDeleteQ { @@ -593,6 +594,44 @@ enum { NVME_RW_PRINFO_PRCHK_REF = 1 << 10, }; +typedef struct QEMU_PACKED NvmeZoneManagementRecvCmd { + uint8_t opcode; + uint8_t flags; + uint16_t cid; + uint32_t nsid; + uint8_t rsvd8[16]; + NvmeCmdDptr dptr; + uint64_t slba; + uint32_t numdw; + uint8_t zra; + uint8_t zrasp; + uint8_t zrasf; + uint8_t rsvd55[9]; +} NvmeZoneManagementRecvCmd; + +typedef enum NvmeZoneManagementRecvAction { + NVME_CMD_ZONE_MGMT_RECV_REPORT_ZONES = 0x0, + NVME_CMD_ZONE_MGMT_RECV_EXTENDED_REPORT_ZONES = 0x1, +} NvmeZoneManagementRecvAction; + +typedef enum NvmeZoneManagementRecvActionSpecificField { + NVME_CMD_ZONE_MGMT_RECV_LIST_ALL = 0x0, + NVME_CMD_ZONE_MGMT_RECV_LIST_ZSE = 0x1, + NVME_CMD_ZONE_MGMT_RECV_LIST_ZSIO = 0x2, + NVME_CMD_ZONE_MGMT_RECV_LIST_ZSEO = 0x3, + NVME_CMD_ZONE_MGMT_RECV_LIST_ZSC = 0x4, + NVME_CMD_ZONE_MGMT_RECV_LIST_ZSF = 0x5, + NVME_CMD_ZONE_MGMT_RECV_LIST_ZSRO = 0x6, + NVME_CMD_ZONE_MGMT_RECV_LIST_ZSO = 0x7, +} NvmeZoneManagementRecvActionSpecificField; + +#define NVME_CMD_ZONE_MGMT_RECEIVE_PARTIAL 0x1 + +typedef struct QEMU_PACKED NvmeZoneReportHeader { + uint64_t num_zones; + uint8_t rsvd[56]; +} NvmeZoneReportHeader; + typedef struct QEMU_PACKED NvmeDsmCmd { uint8_t opcode; uint8_t flags; @@ -812,6 +851,12 @@ typedef struct QEMU_PACKED NvmeZoneDescriptor { uint8_t rsvd32[32]; } NvmeZoneDescriptor; +#define NVME_ZA_ZDEV (1 << 7) + +#define NVME_ZA_SET(za, attrs) ((za) |= (attrs)) +#define NVME_ZA_CLEAR(za, attrs) ((za) &= ~(attrs)) +#define NVME_ZA_CLEAR_ALL(za) ((za) = 0x0) + enum NvmeSmartWarn { NVME_SMART_SPARE = 1 << 0, NVME_SMART_TEMPERATURE = 1 << 1, @@ -1162,6 +1207,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeIdentify) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeRwCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeDsmCmd) != 64); + QEMU_BUILD_BUG_ON(sizeof(NvmeZoneManagementRecvCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeRangeType) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeErrorLog) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512); diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index ba32ea6d1326..bc49c7f2674f 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -86,6 +86,9 @@ static void nvme_ns_zns_init_zones(NvmeNamespace *ns) zone = &ns->zns.zones[i]; zone->zd = &ns->zns.zd[i]; + if (ns->params.zns.zdes) { + zone->zde = &ns->zns.zde[i]; + } zone->wp_staging = zslba; zd = zone->zd; @@ -106,11 +109,15 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns) id_ns_zns->lbafe[i].zsze = ns->params.zns.zsze ? cpu_to_le64(ns->params.zns.zsze) : cpu_to_le64(pow2ceil(ns->params.zns.zcap)); + id_ns_zns->lbafe[i].zdes = ns->params.zns.zdes; } ns->zns.num_zones = nvme_ns_nlbas(ns) / nvme_ns_zsze(ns); ns->zns.zones = g_malloc0_n(ns->zns.num_zones, sizeof(NvmeZone)); ns->zns.zd = g_malloc0_n(ns->zns.num_zones, sizeof(NvmeZoneDescriptor)); + if (ns->params.zns.zdes) { + ns->zns.zde = g_malloc0_n(ns->zns.num_zones, nvme_ns_zdes_bytes(ns)); + } id_ns_zns->mar = 0xffffffff; id_ns_zns->mor = 0xffffffff; @@ -148,7 +155,7 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) BlockBackend *blk = ns->pstate.blk; NvmePstateHeader header; uint64_t nlbas = nvme_ns_nlbas(ns); - size_t bitmap_len, pstate_len, zd_len = 0; + size_t bitmap_len, pstate_len, zd_len = 0, zde_len = 0; int ret; ret = nvme_blk_truncate(blk, sizeof(NvmePstateHeader), errp); @@ -170,6 +177,7 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) header.zns.zsze = ns->params.zns.zsze ? cpu_to_le64(ns->params.zns.zsze) : cpu_to_le64(pow2ceil(ns->params.zns.zcap)); + header.zns.zdes = ns->params.zns.zdes; } ret = blk_pwrite(blk, 0, &header, sizeof(header), 0); @@ -181,9 +189,11 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) bitmap_len = DIV_ROUND_UP(nlbas, sizeof(unsigned long)); if (nvme_ns_zoned(ns)) { zd_len = ns->zns.num_zones * sizeof(NvmeZoneDescriptor); + zde_len = nvme_ns_zoned(ns) ? + ns->zns.num_zones * nvme_ns_zdes_bytes(ns) : 0; } - pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len + zd_len, - BDRV_SECTOR_SIZE); + pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len + zd_len + + zde_len, BDRV_SECTOR_SIZE); ret = nvme_blk_truncate(blk, pstate_len, errp); if (ret < 0) { @@ -213,6 +223,7 @@ void nvme_ns_zns_init_zone_state(NvmeNamespace *ns) for (int i = 0; i < ns->zns.num_zones; i++) { NvmeZone *zone = &ns->zns.zones[i]; zone->zd = &ns->zns.zd[i]; + zone->zde = &ns->zns.zde[i]; zone->wp_staging = nvme_wp(zone); @@ -224,7 +235,8 @@ void nvme_ns_zns_init_zone_state(NvmeNamespace *ns) continue; case NVME_ZS_ZSC: - if (nvme_wp(zone) == nvme_zslba(zone)) { + if (nvme_wp(zone) == nvme_zslba(zone) && + !(zone->zd->za & NVME_ZA_ZDEV)) { nvme_zs_set(zone, NVME_ZS_ZSE); } @@ -243,7 +255,7 @@ static int nvme_ns_pstate_load(NvmeNamespace *ns, size_t len, Error **errp) BlockBackend *blk = ns->pstate.blk; NvmePstateHeader header; uint64_t nlbas = nvme_ns_nlbas(ns); - size_t bitmap_len, pstate_len, zd_len = 0; + size_t bitmap_len, pstate_len, zd_len = 0, zde_len = 0; unsigned long *map; int ret; @@ -294,12 +306,21 @@ static int nvme_ns_pstate_load(NvmeNamespace *ns, size_t len, Error **errp) return -1; } + if (header.zns.zdes != ns->params.zns.zdes) { + error_setg(errp, "zns.zdes parameter inconsistent with pstate " + "(pstate %u; parameter %u)", + header.zns.zdes, ns->params.zns.zdes); + return -1; + } + bitmap_len = DIV_ROUND_UP(nlbas, sizeof(unsigned long)); if (nvme_ns_zoned(ns)) { zd_len = ns->zns.num_zones * sizeof(NvmeZoneDescriptor); + zde_len = nvme_ns_zoned(ns) ? + ns->zns.num_zones * nvme_ns_zdes_bytes(ns) : 0; } - pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len + zd_len, - BDRV_SECTOR_SIZE); + pstate_len = ROUND_UP(sizeof(NvmePstateHeader) + bitmap_len + zd_len + + zde_len, BDRV_SECTOR_SIZE); if (len != pstate_len) { error_setg(errp, "pstate size mismatch " @@ -335,10 +356,19 @@ static int nvme_ns_pstate_load(NvmeNamespace *ns, size_t len, Error **errp) return ret; } + if (zde_len) { + ret = blk_pread(blk, ns->pstate.zns.offset + zd_len, ns->zns.zde, + zde_len); + if (ret < 0) { + error_setg_errno(errp, -ret, "could not read zone descriptor " + "extensions from pstate"); + return ret; + } + } + nvme_ns_zns_init_zone_state(ns); - ret = blk_pwrite(blk, ns->pstate.utilization.offset + bitmap_len, - ns->zns.zd, zd_len, 0); + ret = blk_pwrite(blk, ns->pstate.zns.offset, ns->zns.zd, zd_len, 0); if (ret < 0) { error_setg_errno(errp, -ret, "could not write zone descriptors to pstate"); @@ -516,6 +546,7 @@ static Property nvme_ns_props[] = { DEFINE_PROP_UINT8("iocs", NvmeNamespace, params.iocs, NVME_IOCS_NVM), DEFINE_PROP_UINT64("zns.zcap", NvmeNamespace, params.zns.zcap, 0), DEFINE_PROP_UINT64("zns.zsze", NvmeNamespace, params.zns.zsze, 0), + DEFINE_PROP_UINT8("zns.zdes", NvmeNamespace, params.zns.zdes, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 0c7d269c1b5a..1e6c57752769 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -163,6 +163,7 @@ static const NvmeEffectsLog nvme_effects[NVME_IOCS_MAX] = { .iocs = { NVME_EFFECTS_NVM_INITIALIZER, + [NVME_CMD_ZONE_MGMT_RECV] = NVME_EFFECTS_CSUPP, }, }, }; @@ -1218,6 +1219,9 @@ static void nvme_rw_cb(void *opaque, int ret) NVME_ZS_ZSRO : NVME_ZS_ZSO; nvme_zs_set(zone, zs); + if (zs == NVME_ZS_ZSO) { + NVME_ZA_CLEAR_ALL(zone->zd->za); + } if (nvme_zns_commit_zone(ns, zone) < 0) { req->status = NVME_INTERNAL_DEV_ERROR; @@ -1286,6 +1290,135 @@ static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len, return NVME_NO_COMPLETE; } +static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeZoneManagementRecvCmd *recv; + NvmeZoneManagementRecvAction zra; + NvmeZoneManagementRecvActionSpecificField zrasp; + NvmeNamespace *ns = req->ns; + NvmeZone *zone; + + uint8_t *buf, *bufp, zs_list; + uint64_t slba; + int num_zones = 0, zidx = 0, zidx_begin; + uint16_t zes, status; + size_t len; + + recv = (NvmeZoneManagementRecvCmd *) &req->cmd; + + zra = recv->zra; + zrasp = recv->zrasp; + slba = le64_to_cpu(recv->slba); + len = (le32_to_cpu(recv->numdw) + 1) << 2; + + if (!nvme_ns_zoned(ns)) { + return NVME_INVALID_OPCODE | NVME_DNR; + } + + trace_pci_nvme_zone_mgmt_recv(nvme_cid(req), nvme_nsid(ns), slba, len, + zra, zrasp, recv->zrasf); + + if (!len) { + return NVME_SUCCESS; + } + + switch (zrasp) { + case NVME_CMD_ZONE_MGMT_RECV_LIST_ALL: + zs_list = 0; + break; + + case NVME_CMD_ZONE_MGMT_RECV_LIST_ZSE: + zs_list = NVME_ZS_ZSE; + break; + + case NVME_CMD_ZONE_MGMT_RECV_LIST_ZSIO: + zs_list = NVME_ZS_ZSIO; + break; + + case NVME_CMD_ZONE_MGMT_RECV_LIST_ZSEO: + zs_list = NVME_ZS_ZSEO; + break; + + case NVME_CMD_ZONE_MGMT_RECV_LIST_ZSC: + zs_list = NVME_ZS_ZSC; + break; + + case NVME_CMD_ZONE_MGMT_RECV_LIST_ZSF: + zs_list = NVME_ZS_ZSF; + break; + + case NVME_CMD_ZONE_MGMT_RECV_LIST_ZSRO: + zs_list = NVME_ZS_ZSRO; + break; + + case NVME_CMD_ZONE_MGMT_RECV_LIST_ZSO: + zs_list = NVME_ZS_ZSO; + break; + default: + return NVME_INVALID_FIELD | NVME_DNR; + } + + status = nvme_check_mdts(n, len); + if (status) { + return status; + } + + if (!nvme_ns_get_zone(ns, slba)) { + trace_pci_nvme_err_invalid_zone(nvme_cid(req), slba); + return NVME_INVALID_FIELD | NVME_DNR; + } + + zidx_begin = zidx = nvme_ns_zone_idx(ns, slba); + zes = sizeof(NvmeZoneDescriptor); + if (zra == NVME_CMD_ZONE_MGMT_RECV_EXTENDED_REPORT_ZONES) { + zes += nvme_ns_zdes_bytes(ns); + } + + buf = bufp = g_malloc0(len); + bufp += sizeof(NvmeZoneReportHeader); + + while ((bufp + zes) - buf <= len && zidx < ns->zns.num_zones) { + zone = &ns->zns.zones[zidx++]; + + if (zs_list && zs_list != nvme_zs(zone)) { + continue; + } + + num_zones++; + + memcpy(bufp, zone->zd, sizeof(NvmeZoneDescriptor)); + + if (zra == NVME_CMD_ZONE_MGMT_RECV_EXTENDED_REPORT_ZONES) { + memcpy(bufp + sizeof(NvmeZoneDescriptor), zone->zde, + nvme_ns_zdes_bytes(ns)); + } + + bufp += zes; + } + + if (!(recv->zrasf & NVME_CMD_ZONE_MGMT_RECEIVE_PARTIAL)) { + if (!zs_list) { + num_zones = ns->zns.num_zones - zidx_begin; + } else { + num_zones = 0; + for (int i = zidx_begin; i < ns->zns.num_zones; i++) { + zone = &ns->zns.zones[i]; + + if (zs_list == nvme_zs(zone)) { + num_zones++; + } + } + } + } + + stq_le_p(buf, (uint64_t)num_zones); + + status = nvme_dma(n, buf, len, DMA_DIRECTION_FROM_DEVICE, req); + g_free(buf); + + return status; +} + static uint16_t nvme_flush(NvmeCtrl *n, NvmeRequest *req) { NvmeNamespace *ns = req->ns; @@ -1425,6 +1558,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_CMD_WRITE: case NVME_CMD_READ: return nvme_rwz(n, req); + case NVME_CMD_ZONE_MGMT_RECV: + return nvme_zone_mgmt_recv(n, req); default: trace_pci_nvme_err_invalid_opc(req->cmd.opcode); return NVME_INVALID_OPCODE | NVME_DNR; diff --git a/hw/block/trace-events b/hw/block/trace-events index d46a7a4942bb..a2671dadb1e8 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -42,6 +42,7 @@ pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode, cons pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'" pci_nvme_rwz(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t len, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" len %"PRIu64" lba 0x%"PRIx64"" pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'" +pci_nvme_zone_mgmt_recv(uint16_t cid, uint32_t nsid, uint64_t slba, uint64_t len, uint8_t zra, uint8_t zrasp, uint8_t zrasf) "cid %"PRIu16" nsid %"PRIu32" slba 0x%"PRIx64" len %"PRIu64" zra 0x%"PRIx8" zrasp 0x%"PRIx8" zrasf 0x%"PRIx8"" pci_nvme_allocate(uint32_t ns, uint64_t slba, uint32_t nlb) "nsid %"PRIu32" slba 0x%"PRIx64" nlb %"PRIu32"" pci_nvme_do_aio(uint16_t cid, uint8_t opc, const char *opname, const char *blkname, int64_t offset, size_t len) "cid %"PRIu16" opc 0x%"PRIx8" opname '%s' blk '%s' offset %"PRId64" len %zu" pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16"" From patchwork Tue Sep 29 23:19:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 272421 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DFAD2C4727C for ; Tue, 29 Sep 2020 23:45:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F3D6E206DC for ; Tue, 29 Sep 2020 23:45:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F3D6E206DC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:49836 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNPJW-0000bW-LT for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:45:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39166) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOug-0000x0-JS; Tue, 29 Sep 2020 19:20:04 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:50693) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOub-00009h-Np; Tue, 29 Sep 2020 19:20:01 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id AE1A9E99; Tue, 29 Sep 2020 19:19:35 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=WHznGDtLoUIg3 Sd6f8kc7D4ZA8LSLvGQTcmfwphZdeY=; b=iFP0/LhQ5Nx8Lisu43EEIvKWGynA4 jtLNHc3nWoKFnuO/6eGoe6uV7lxyjUgo3L35BpyT7v61CJ4AS/WGJ4NELTFKfUsp OYVfMfR1xO6929UYcdLqW1Wnp71ORxhNLzzyNT5BB384Co2FKPIMgi6lTuPWox2w Ye7amdDo9qCj//03yV9wYw9Mx6WI+8NfnzdCMv5uYL0C4xpRKAIOnSTIlX/OopoB XTlDAfteP0qU3getsY/rA1VmZP/pgckpouoPpSMXzOTqrRhL2wtu9EUsB1t3hsOY sAVjfQ2WfyMDbpG50NDJ4F+sWFLNDRVIxW5c2D0FA3HUPLomgkYuLZd/A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=WHznGDtLoUIg3Sd6f8kc7D4ZA8LSLvGQTcmfwphZdeY=; b=nn7QIZNo 81nCwURRgRwNf8sb7J28C9SHZ403CXdAa4PaNxbU6ixchVrDpWzCZmC3HgTgsyJu aZPHPb534t6J2RAYXkOm3dccwbFnKLdipihzBkeEae1I7KUEV2Q50LNj/D5nwIyu D8zJ5Z80r0N1S42/34/BonQhov/HpLxXmOVY3mPntzGwVzKPudMnYlP9cZUAHF2M RG091Zj94b5aumkkb0wZOt1VBuULxJEi6OC9Y2LHydOPSYnUO3TBFeYrcjgweArZ Tsz/OL5iZ6XrZ7cOxLS0tMU3+Gmg5HuWXQgCFDOfVNU2zNGCrytvT8RlZKwPkstV Q/jEdcnvBgjlgA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepkeduheeftefhvddvgeeiffehkeeugeehhfdugffhhffhveejjeffueehudeguddu necuffhomhgrihhnpehuthhilhhiiigrthhiohhnrdhmrghpnecukfhppeektddrudeije drleekrdduledtnecuvehluhhsthgvrhfuihiivgepudenucfrrghrrghmpehmrghilhhf rhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id B10A53280060; Tue, 29 Sep 2020 19:19:33 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 11/14] hw/block/nvme: add the zone management send command Date: Wed, 30 Sep 2020 01:19:14 +0200 Message-Id: <20200929231917.433586-12-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Add the Zone Management Send command. Signed-off-by: Klaus Jensen --- hw/block/nvme.h | 1 + include/block/nvme.h | 29 +++ hw/block/nvme.c | 552 ++++++++++++++++++++++++++++++++++++++++-- hw/block/trace-events | 11 + 4 files changed, 574 insertions(+), 19 deletions(-) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 523eef0bcad8..c704663e0a3e 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -71,6 +71,7 @@ static inline const char *nvme_io_opc_str(uint8_t opc) case NVME_CMD_WRITE: return "NVME_NVM_CMD_WRITE"; case NVME_CMD_READ: return "NVME_NVM_CMD_READ"; case NVME_CMD_WRITE_ZEROES: return "NVME_NVM_CMD_WRITE_ZEROES"; + case NVME_CMD_ZONE_MGMT_SEND: return "NVME_ZONED_CMD_ZONE_MGMT_SEND"; case NVME_CMD_ZONE_MGMT_RECV: return "NVME_ZONED_CMD_ZONE_MGMT_RECV"; default: return "NVME_NVM_CMD_UNKNOWN"; } diff --git a/include/block/nvme.h b/include/block/nvme.h index 9bacf48ee9e9..967b42eb5da7 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -481,6 +481,7 @@ enum NvmeIoCommands { NVME_CMD_COMPARE = 0x05, NVME_CMD_WRITE_ZEROES = 0x08, NVME_CMD_DSM = 0x09, + NVME_CMD_ZONE_MGMT_SEND = 0x79, NVME_CMD_ZONE_MGMT_RECV = 0x7a, }; @@ -594,6 +595,32 @@ enum { NVME_RW_PRINFO_PRCHK_REF = 1 << 10, }; +typedef struct QEMU_PACKED NvmeZoneManagementSendCmd { + uint8_t opcode; + uint8_t flags; + uint16_t cid; + uint32_t nsid; + uint32_t rsvd8[4]; + NvmeCmdDptr dptr; + uint64_t slba; + uint32_t rsvd48; + uint8_t zsa; + uint8_t zsflags; + uint16_t rsvd54; + uint32_t rsvd56[2]; +} NvmeZoneManagementSendCmd; + +#define NVME_CMD_ZONE_MGMT_SEND_SELECT_ALL(zsflags) ((zsflags) & 0x1) + +typedef enum NvmeZoneManagementSendAction { + NVME_CMD_ZONE_MGMT_SEND_CLOSE = 0x1, + NVME_CMD_ZONE_MGMT_SEND_FINISH = 0x2, + NVME_CMD_ZONE_MGMT_SEND_OPEN = 0x3, + NVME_CMD_ZONE_MGMT_SEND_RESET = 0x4, + NVME_CMD_ZONE_MGMT_SEND_OFFLINE = 0x5, + NVME_CMD_ZONE_MGMT_SEND_SET_ZDE = 0x10, +} NvmeZoneManagementSendAction; + typedef struct QEMU_PACKED NvmeZoneManagementRecvCmd { uint8_t opcode; uint8_t flags; @@ -748,6 +775,7 @@ enum NvmeStatusCodes { NVME_ZONE_IS_READ_ONLY = 0x01ba, NVME_ZONE_IS_OFFLINE = 0x01bb, NVME_ZONE_INVALID_WRITE = 0x01bc, + NVME_INVALID_ZONE_STATE_TRANSITION = 0x01bf, NVME_WRITE_FAULT = 0x0280, NVME_UNRECOVERED_READ = 0x0281, NVME_E2E_GUARD_ERROR = 0x0282, @@ -1207,6 +1235,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeIdentify) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeRwCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeDsmCmd) != 64); + QEMU_BUILD_BUG_ON(sizeof(NvmeZoneManagementSendCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeZoneManagementRecvCmd) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeRangeType) != 64); QEMU_BUILD_BUG_ON(sizeof(NvmeErrorLog) != 64); diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 1e6c57752769..5c109cab58e8 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -164,6 +164,8 @@ static const NvmeEffectsLog nvme_effects[NVME_IOCS_MAX] = { .iocs = { NVME_EFFECTS_NVM_INITIALIZER, [NVME_CMD_ZONE_MGMT_RECV] = NVME_EFFECTS_CSUPP, + [NVME_CMD_ZONE_MGMT_SEND] = NVME_EFFECTS_CSUPP | + NVME_EFFECTS_LBCC, }, }, }; @@ -1064,21 +1066,20 @@ static inline uint16_t nvme_check_dulbe(NvmeNamespace *ns, uint64_t slba, return NVME_SUCCESS; } -static int nvme_allocate(NvmeNamespace *ns, uint64_t slba, uint32_t nlb) +static int __nvme_allocate(NvmeNamespace *ns, uint64_t slba, uint32_t nlb, + bool deallocate) { int nlongs, idx; int64_t offset; unsigned long *map, *src; int ret; - if (!(ns->pstate.blk && nvme_check_dulbe(ns, slba, nlb))) { - return 0; + if (deallocate) { + bitmap_clear(ns->pstate.utilization.map, slba, nlb); + } else { + bitmap_set(ns->pstate.utilization.map, slba, nlb); } - trace_pci_nvme_allocate(nvme_nsid(ns), slba, nlb); - - bitmap_set(ns->pstate.utilization.map, slba, nlb); - /* * The bitmap is an array of unsigned longs, so calculate the index given * the size of a long. @@ -1123,6 +1124,28 @@ static int nvme_allocate(NvmeNamespace *ns, uint64_t slba, uint32_t nlb) return ret; } +static int nvme_allocate(NvmeNamespace *ns, uint64_t slba, uint32_t nlb) +{ + if (!(ns->pstate.blk && nvme_check_dulbe(ns, slba, nlb))) { + return 0; + } + + trace_pci_nvme_allocate(nvme_nsid(ns), slba, nlb); + + return __nvme_allocate(ns, slba, nlb, false /* deallocate */); +} + +static int nvme_deallocate(NvmeNamespace *ns, uint64_t slba, uint32_t nlb) +{ + if (!ns->pstate.blk) { + return 0; + } + + trace_pci_nvme_deallocate(nvme_nsid(ns), slba, nlb); + + return __nvme_allocate(ns, slba, nlb, true /* deallocate */); +} + static int nvme_zns_commit_zone(NvmeNamespace *ns, NvmeZone *zone) { uint64_t zslba; @@ -1142,6 +1165,139 @@ static int nvme_zns_commit_zone(NvmeNamespace *ns, NvmeZone *zone) sizeof(NvmeZoneDescriptor), 0); } +static int nvme_zns_commit_zde(NvmeNamespace *ns, NvmeZone *zone) +{ + uint64_t zslba; + int zidx; + size_t zdes_bytes; + int64_t offset; + + if (!ns->pstate.blk) { + return 0; + } + + zslba = nvme_zslba(zone); + zidx = nvme_ns_zone_idx(ns, zslba); + zdes_bytes = nvme_ns_zdes_bytes(ns); + offset = ns->pstate.zns.offset + + ns->zns.num_zones * sizeof(NvmeZoneDescriptor) + + zidx * zdes_bytes; + + return blk_pwrite(ns->pstate.blk, offset, zone->zde, zdes_bytes, 0); +} + +static inline void nvme_zone_reset_wp(NvmeZone *zone) +{ + zone->zd->wp = zone->zd->zslba; + zone->wp_staging = nvme_zslba(zone); +} + +static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, + NvmeZoneState to) +{ + NvmeZoneState from = nvme_zs(zone); + + /* fast path */ + if (from == to) { + return NVME_SUCCESS; + } + + switch (from) { + case NVME_ZS_ZSE: + break; + + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + switch (to) { + case NVME_ZS_ZSEO: + break; + + case NVME_ZS_ZSE: + nvme_zone_reset_wp(zone); + + /* fallthrough */ + + case NVME_ZS_ZSO: + NVME_ZA_CLEAR_ALL(zone->zd->za); + + /* fallthrough */ + + case NVME_ZS_ZSF: + case NVME_ZS_ZSRO: + case NVME_ZS_ZSC: + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + break; + + case NVME_ZS_ZSC: + switch (to) { + case NVME_ZS_ZSE: + nvme_zone_reset_wp(zone); + + /* fallthrough */ + + case NVME_ZS_ZSO: + NVME_ZA_CLEAR_ALL(zone->zd->za); + + /* fallthrough */ + + case NVME_ZS_ZSF: + case NVME_ZS_ZSRO: + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + break; + + case NVME_ZS_ZSRO: + switch (to) { + case NVME_ZS_ZSO: + NVME_ZA_CLEAR_ALL(zone->zd->za); + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + break; + + case NVME_ZS_ZSF: + switch (to) { + case NVME_ZS_ZSE: + nvme_zone_reset_wp(zone); + + /* fallthrough */ + + case NVME_ZS_ZSO: + NVME_ZA_CLEAR_ALL(zone->zd->za); + + /* fallthrough */ + + case NVME_ZS_ZSRO: + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + nvme_zs_set(zone, to); + return NVME_SUCCESS; +} + static void nvme_zns_advance_wp(NvmeRequest *req) { NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd; @@ -1153,7 +1309,7 @@ static void nvme_zns_advance_wp(NvmeRequest *req) wp += nlb; zone->zd->wp = cpu_to_le64(wp); if (wp == nvme_zslba(zone) + nvme_zcap(zone)) { - nvme_zs_set(zone, NVME_ZS_ZSF); + nvme_zrm_transition(req->ns, zone, NVME_ZS_ZSF); if (nvme_zns_commit_zone(req->ns, zone) < 0) { req->status = NVME_INTERNAL_DEV_ERROR; } @@ -1218,11 +1374,7 @@ static void nvme_rw_cb(void *opaque, int ret) NvmeZoneState zs = status == NVME_WRITE_FAULT ? NVME_ZS_ZSRO : NVME_ZS_ZSO; - nvme_zs_set(zone, zs); - if (zs == NVME_ZS_ZSO) { - NVME_ZA_CLEAR_ALL(zone->zd->za); - } - + nvme_zrm_transition(ns, zone, zs); if (nvme_zns_commit_zone(ns, zone) < 0) { req->status = NVME_INTERNAL_DEV_ERROR; } @@ -1290,6 +1442,364 @@ static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len, return NVME_NO_COMPLETE; } +static uint16_t nvme_zone_mgmt_send_close(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint16_t status; + + trace_pci_nvme_zone_mgmt_send_close(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSC: + return NVME_SUCCESS; + + case NVME_ZS_ZSE: + /* + * The state machine in nvme_zrm_transition allows zones to transition + * from ZSE to ZSC. That transition is only valid if done as part Set + * Zone Descriptor, so do an early check here. + */ + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + + default: + break; + } + + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSC); + if (status) { + return status; + } + + if (nvme_zns_commit_zone(ns, zone) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_zone_mgmt_send_finish(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint16_t status; + + trace_pci_nvme_zone_mgmt_send_finish(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + if (nvme_zs(zone) == NVME_ZS_ZSF) { + return NVME_SUCCESS; + } + + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSF); + if (status) { + return status; + } + + if (nvme_zns_commit_zone(ns, zone) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_zone_mgmt_send_open(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint16_t status; + + trace_pci_nvme_zone_mgmt_send_open(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + if (nvme_zs(zone) == NVME_ZS_ZSEO) { + return NVME_SUCCESS; + } + + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSEO); + if (status) { + return status; + } + + if (nvme_zns_commit_zone(ns, zone) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_zone_mgmt_send_reset(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint64_t zslba = nvme_zslba(zone); + uint64_t zcap = nvme_zcap(zone); + + trace_pci_nvme_zone_mgmt_send_reset(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + case NVME_ZS_ZSC: + case NVME_ZS_ZSF: + if (blk_pdiscard(ns->blkconf.blk, nvme_l2b(ns, zslba), + nvme_l2b(ns, zcap)) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + if (nvme_deallocate(ns, zslba, zcap) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + nvme_zrm_transition(ns, zone, NVME_ZS_ZSE); + if (nvme_zns_commit_zone(ns, zone) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + /* fallthrough */ + + case NVME_ZS_ZSE: + return NVME_SUCCESS; + + default: + break; + } + + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; +} + +static uint16_t nvme_zone_mgmt_send_offline(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + + trace_pci_nvme_zone_mgmt_send_offline(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSRO: + if (nvme_deallocate(ns, nvme_zslba(zone), nvme_zcap(zone)) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + nvme_zrm_transition(ns, zone, NVME_ZS_ZSO); + if (nvme_zns_commit_zone(ns, zone) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + /* fallthrough */ + + case NVME_ZS_ZSO: + return NVME_SUCCESS; + + default: + break; + } + + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; +} + +static uint16_t nvme_zone_mgmt_send_set_zde(NvmeCtrl *n, NvmeRequest *req, + NvmeZone *zone) +{ + NvmeNamespace *ns = req->ns; + uint16_t status; + + trace_pci_nvme_zone_mgmt_send_set_zde(nvme_cid(req), nvme_nsid(ns), + nvme_zslba(zone), nvme_zs_str(zone)); + + if (nvme_zs(zone) != NVME_ZS_ZSE) { + trace_pci_nvme_err_invalid_zone_condition(nvme_cid(req), + nvme_zslba(zone), + nvme_zs(zone)); + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + + status = nvme_check_mdts(n, nvme_ns_zdes_bytes(ns)); + if (status) { + return status; + } + + status = nvme_dma(n, zone->zde, nvme_ns_zdes_bytes(ns), + DMA_DIRECTION_TO_DEVICE, req); + if (status) { + return status; + } + + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSC); + if (status) { + return status; + } + + if (nvme_zns_commit_zde(ns, zone) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + NVME_ZA_SET(zone->zd->za, NVME_ZA_ZDEV); + + if (nvme_zns_commit_zone(ns, zone) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_zone_mgmt_send_all(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeZoneManagementSendCmd *send = (NvmeZoneManagementSendCmd *) &req->cmd; + NvmeNamespace *ns = req->ns; + NvmeZone *zone; + + uint16_t status = NVME_SUCCESS; + + trace_pci_nvme_zone_mgmt_send_all(nvme_cid(req), nvme_nsid(ns), send->zsa); + + switch (send->zsa) { + case NVME_CMD_ZONE_MGMT_SEND_SET_ZDE: + return NVME_INVALID_FIELD | NVME_DNR; + + case NVME_CMD_ZONE_MGMT_SEND_CLOSE: + for (int i = 0; i < ns->zns.num_zones; i++) { + zone = &ns->zns.zones[i]; + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + status = nvme_zone_mgmt_send_close(n, req, zone); + if (status) { + return status; + } + + default: + continue; + } + } + + break; + + case NVME_CMD_ZONE_MGMT_SEND_FINISH: + for (int i = 0; i < ns->zns.num_zones; i++) { + zone = &ns->zns.zones[i]; + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + case NVME_ZS_ZSC: + status = nvme_zone_mgmt_send_finish(n, req, zone); + if (status) { + return status; + } + + default: + continue; + } + } + + break; + + case NVME_CMD_ZONE_MGMT_SEND_OPEN: + for (int i = 0; i < ns->zns.num_zones; i++) { + zone = &ns->zns.zones[i]; + + if (nvme_zs(zone) == NVME_ZS_ZSC) { + status = nvme_zone_mgmt_send_open(n, req, zone); + if (status) { + return status; + } + } + } + + break; + + case NVME_CMD_ZONE_MGMT_SEND_RESET: + for (int i = 0; i < ns->zns.num_zones; i++) { + zone = &ns->zns.zones[i]; + + switch (nvme_zs(zone)) { + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + case NVME_ZS_ZSC: + case NVME_ZS_ZSF: + status = nvme_zone_mgmt_send_reset(n, req, zone); + if (status) { + return status; + } + + default: + continue; + } + } + + break; + + case NVME_CMD_ZONE_MGMT_SEND_OFFLINE: + for (int i = 0; i < ns->zns.num_zones; i++) { + zone = &ns->zns.zones[i]; + + if (nvme_zs(zone) == NVME_ZS_ZSRO) { + status = nvme_zone_mgmt_send_offline(n, req, zone); + if (status) { + return status; + } + } + } + + break; + } + + return NVME_SUCCESS; +} + +static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeZoneManagementSendCmd *send = (NvmeZoneManagementSendCmd *) &req->cmd; + NvmeZoneManagementSendAction zsa = send->zsa; + NvmeNamespace *ns = req->ns; + NvmeZone *zone; + uint64_t zslba = le64_to_cpu(send->slba); + + if (!nvme_ns_zoned(ns)) { + return NVME_INVALID_OPCODE | NVME_DNR; + } + + trace_pci_nvme_zone_mgmt_send(nvme_cid(req), ns->params.nsid, zslba, zsa, + send->zsflags); + + if (NVME_CMD_ZONE_MGMT_SEND_SELECT_ALL(send->zsflags)) { + return nvme_zone_mgmt_send_all(n, req); + } + + zone = nvme_ns_get_zone(ns, zslba); + if (!zone) { + trace_pci_nvme_err_invalid_zone(nvme_cid(req), zslba); + return NVME_INVALID_FIELD | NVME_DNR; + } + + if (zslba != nvme_zslba(zone)) { + trace_pci_nvme_err_invalid_zslba(nvme_cid(req), zslba); + return NVME_INVALID_FIELD | NVME_DNR; + } + + switch (zsa) { + case NVME_CMD_ZONE_MGMT_SEND_CLOSE: + return nvme_zone_mgmt_send_close(n, req, zone); + case NVME_CMD_ZONE_MGMT_SEND_FINISH: + return nvme_zone_mgmt_send_finish(n, req, zone); + case NVME_CMD_ZONE_MGMT_SEND_OPEN: + return nvme_zone_mgmt_send_open(n, req, zone); + case NVME_CMD_ZONE_MGMT_SEND_RESET: + return nvme_zone_mgmt_send_reset(n, req, zone); + case NVME_CMD_ZONE_MGMT_SEND_OFFLINE: + return nvme_zone_mgmt_send_offline(n, req, zone); + case NVME_CMD_ZONE_MGMT_SEND_SET_ZDE: + return nvme_zone_mgmt_send_set_zde(n, req, zone); + } + + return NVME_INVALID_FIELD | NVME_DNR; +} + static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) { NvmeZoneManagementRecvCmd *recv; @@ -1495,17 +2005,19 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) } switch (nvme_zs(zone)) { - case NVME_ZS_ZSE: - case NVME_ZS_ZSC: - nvme_zs_set(zone, NVME_ZS_ZSIO); + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + break; + default: + status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSIO); + if (status) { + goto invalid; + } if (nvme_zns_commit_zone(req->ns, zone) < 0) { status = NVME_INTERNAL_DEV_ERROR; goto invalid; } - - default: - break; } zone->wp_staging += nlb; @@ -1558,6 +2070,8 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_CMD_WRITE: case NVME_CMD_READ: return nvme_rwz(n, req); + case NVME_CMD_ZONE_MGMT_SEND: + return nvme_zone_mgmt_send(n, req); case NVME_CMD_ZONE_MGMT_RECV: return nvme_zone_mgmt_recv(n, req); default: diff --git a/hw/block/trace-events b/hw/block/trace-events index a2671dadb1e8..d6342f5c555d 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -42,8 +42,18 @@ pci_nvme_io_cmd(uint16_t cid, uint32_t nsid, uint16_t sqid, uint8_t opcode, cons pci_nvme_admin_cmd(uint16_t cid, uint16_t sqid, uint8_t opcode, const char *opname) "cid %"PRIu16" sqid %"PRIu16" opc 0x%"PRIx8" opname '%s'" pci_nvme_rwz(uint16_t cid, const char *verb, uint32_t nsid, uint32_t nlb, uint64_t len, uint64_t lba) "cid %"PRIu16" opname '%s' nsid %"PRIu32" nlb %"PRIu32" len %"PRIu64" lba 0x%"PRIx64"" pci_nvme_rw_cb(uint16_t cid, const char *blkname) "cid %"PRIu16" blk '%s'" +pci_nvme_zone_mgmt_send(uint16_t cid, uint32_t nsid, uint64_t zslba, uint8_t zsa, uint8_t zsflags) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zsa 0x%"PRIx8" zsflags 0x%"PRIx8"" +pci_nvme_zone_mgmt_send_all(uint16_t cid, uint32_t nsid, uint8_t za) "cid %"PRIu16" nsid %"PRIu32" za 0x%"PRIx8"" +pci_nvme_zone_mgmt_send_close(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\"" +pci_nvme_zone_mgmt_send_finish(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\"" +pci_nvme_zone_mgmt_send_open(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\"" +pci_nvme_zone_mgmt_send_reset(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\"" +pci_nvme_zone_mgmt_send_reset_cb(uint16_t cid, uint32_t nsid) "cid %"PRIu16" nsid %"PRIu32"" +pci_nvme_zone_mgmt_send_offline(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\"" +pci_nvme_zone_mgmt_send_set_zde(uint16_t cid, uint32_t nsid, uint64_t zslba, const char *zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" zc \"%s\"" pci_nvme_zone_mgmt_recv(uint16_t cid, uint32_t nsid, uint64_t slba, uint64_t len, uint8_t zra, uint8_t zrasp, uint8_t zrasf) "cid %"PRIu16" nsid %"PRIu32" slba 0x%"PRIx64" len %"PRIu64" zra 0x%"PRIx8" zrasp 0x%"PRIx8" zrasf 0x%"PRIx8"" pci_nvme_allocate(uint32_t ns, uint64_t slba, uint32_t nlb) "nsid %"PRIu32" slba 0x%"PRIx64" nlb %"PRIu32"" +pci_nvme_deallocate(uint32_t ns, uint64_t slba, uint32_t nlb) "nsid %"PRIu32" slba 0x%"PRIx64" nlb %"PRIu32"" pci_nvme_do_aio(uint16_t cid, uint8_t opc, const char *opname, const char *blkname, int64_t offset, size_t len) "cid %"PRIu16" opc 0x%"PRIx8" opname '%s' blk '%s' offset %"PRId64" len %zu" pci_nvme_create_sq(uint64_t addr, uint16_t sqid, uint16_t cqid, uint16_t qsize, uint16_t qflags) "create submission queue, addr=0x%"PRIx64", sqid=%"PRIu16", cqid=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16"" pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size, uint16_t qflags, int ien) "create completion queue, addr=0x%"PRIx64", cqid=%"PRIu16", vector=%"PRIu16", qsize=%"PRIu16", qflags=%"PRIu16", ien=%d" @@ -134,6 +144,7 @@ pci_nvme_err_invalid_setfeat(uint32_t dw10) "invalid set features, dw10=0x%"PRIx pci_nvme_err_invalid_log_page(uint16_t cid, uint16_t lid) "cid %"PRIu16" lid 0x%"PRIx16"" pci_nvme_err_invalid_zone(uint16_t cid, uint64_t lba) "cid %"PRIu16" lba 0x%"PRIx64"" pci_nvme_err_invalid_zone_condition(uint16_t cid, uint64_t zslba, uint8_t condition) "cid %"PRIu16" zslba 0x%"PRIx64" condition 0x%"PRIx8"" +pci_nvme_err_invalid_zslba(uint16_t cid, uint64_t zslba) "cid %"PRIu16" zslba 0x%"PRIx64"" pci_nvme_err_startfail_cq(void) "nvme_start_ctrl failed because there are non-admin completion queues" pci_nvme_err_startfail_sq(void) "nvme_start_ctrl failed because there are non-admin submission queues" pci_nvme_err_startfail_nbarasq(void) "nvme_start_ctrl failed because the admin submission queue address is null" From patchwork Tue Sep 29 23:19:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 304096 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9621FC4727C for ; Tue, 29 Sep 2020 23:45:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B226C20773 for ; Tue, 29 Sep 2020 23:45:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B226C20773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:48874 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNPIt-00009v-42 for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:45:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39184) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOui-0000x5-6G; Tue, 29 Sep 2020 19:20:04 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:45125) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuc-00009v-Rp; Tue, 29 Sep 2020 19:20:03 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id BE70CF6F; Tue, 29 Sep 2020 19:19:36 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=mYjORvH2RErs1 9Xh+so4liqSrYUzah+MX69/13x+3u8=; b=pr/l89TRGYuGtqW2U1xq4N5gNP2uP 5CUR9Y/WpH06b1kZu8KyIVptRGqIbolJm7bPmcacrZpce+TGJ/Ljt2we86cFaAfv QBWpD+PTx/H/MpPOUFui8DQ2wzpxRWfQ61K+DtOB5AtbFf6rg71aYtGz7DXlG2HB uCv883Ov+9ixWBXpuodwfdcCzSvI+DmogMGEpj/4sr1xEUZvIBw0Ycqfbb23F+SW qkJPDSzsPPWLQHFaVpMqucNzhyTio06UKfguL3ZgGAuMRnAVZIP4xYOShi1rgDja 3Io0UST68TiF6hQMn5VstP8a/3EwnUGCXufHek83Mnw3oGOhuXFvUo64g== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=mYjORvH2RErs19Xh+so4liqSrYUzah+MX69/13x+3u8=; b=fGNkoGzE bfZ1c+A5dczoxgUuCVYuyi9mTvAEHTdtC+4b1v9xIijflc9wUx4sBSGkVZdXmg0n cEHuUwCTjIfPCTFKhHem5h6I71BR4jHpfgslVOkHXAd1RCKULw1bbbfbfTH9Ls6c od5fJU8zmClm8GLKJbkETC/QUyheg0w4iHlL6ZbTeuMAa8xUzEjV6AzxvXpkPAld gvvrrDc2XWn5hn9z8UAe0EpmdyOgw0NaPaytgFYdCGGydr24xYvpq2E2BkxxgPgC 2YJMekOu5CTb87za+/OWpzEksEQsX09Pb4Dil7iz3oA+YYdj/Skaghl5Fk1szOT7 ISwdqCQrd+VH2w== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepueelteegieeuhffgkeefgfevjeeigfetkeeitdfgtdeifefhtdfhfeeuffevgfek necukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihiivgepjeenuc frrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrdgukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 023CD3280067; Tue, 29 Sep 2020 19:19:34 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 12/14] hw/block/nvme: add the zone append command Date: Wed, 30 Sep 2020 01:19:15 +0200 Message-Id: <20200929231917.433586-13-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Add the Zone Append command. Signed-off-by: Klaus Jensen --- hw/block/nvme.h | 6 ++++ include/block/nvme.h | 7 +++++ hw/block/nvme.c | 71 +++++++++++++++++++++++++++++++++++++++---- hw/block/trace-events | 1 + 4 files changed, 79 insertions(+), 6 deletions(-) diff --git a/hw/block/nvme.h b/hw/block/nvme.h index c704663e0a3e..8cd2d936548e 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -16,6 +16,10 @@ typedef struct NvmeParams { uint32_t aer_max_queued; uint8_t mdts; bool use_intel_id; + + struct { + uint8_t zasl; + } zns; } NvmeParams; typedef struct NvmeAsyncEvent { @@ -41,6 +45,7 @@ static inline bool nvme_req_is_write(NvmeRequest *req) switch (req->cmd.opcode) { case NVME_CMD_WRITE: case NVME_CMD_WRITE_ZEROES: + case NVME_CMD_ZONE_APPEND: return true; default: return false; @@ -73,6 +78,7 @@ static inline const char *nvme_io_opc_str(uint8_t opc) case NVME_CMD_WRITE_ZEROES: return "NVME_NVM_CMD_WRITE_ZEROES"; case NVME_CMD_ZONE_MGMT_SEND: return "NVME_ZONED_CMD_ZONE_MGMT_SEND"; case NVME_CMD_ZONE_MGMT_RECV: return "NVME_ZONED_CMD_ZONE_MGMT_RECV"; + case NVME_CMD_ZONE_APPEND: return "NVME_ZONED_CMD_ZONE_APPEND"; default: return "NVME_NVM_CMD_UNKNOWN"; } } diff --git a/include/block/nvme.h b/include/block/nvme.h index 967b42eb5da7..5f8914f594f4 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -483,6 +483,7 @@ enum NvmeIoCommands { NVME_CMD_DSM = 0x09, NVME_CMD_ZONE_MGMT_SEND = 0x79, NVME_CMD_ZONE_MGMT_RECV = 0x7a, + NVME_CMD_ZONE_APPEND = 0x7d, }; typedef struct QEMU_PACKED NvmeDeleteQ { @@ -1018,6 +1019,11 @@ enum NvmeIdCtrlLpa { NVME_LPA_EXTENDED = 1 << 2, }; +typedef struct QEMU_PACKED NvmeIdCtrlZns { + uint8_t zasl; + uint8_t rsvd1[4095]; +} NvmeIdCtrlZns; + #define NVME_CTRL_SQES_MIN(sqes) ((sqes) & 0xf) #define NVME_CTRL_SQES_MAX(sqes) (((sqes) >> 4) & 0xf) #define NVME_CTRL_CQES_MIN(cqes) ((cqes) & 0xf) @@ -1242,6 +1248,7 @@ static inline void _nvme_check_size(void) QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512); QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096); + QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlZns) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsNvm) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsZns) != 4096); QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16); diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 5c109cab58e8..a891ee284d1d 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -166,6 +166,8 @@ static const NvmeEffectsLog nvme_effects[NVME_IOCS_MAX] = { [NVME_CMD_ZONE_MGMT_RECV] = NVME_EFFECTS_CSUPP, [NVME_CMD_ZONE_MGMT_SEND] = NVME_EFFECTS_CSUPP | NVME_EFFECTS_LBCC, + [NVME_CMD_ZONE_APPEND] = NVME_EFFECTS_CSUPP | + NVME_EFFECTS_LBCC, }, }, }; @@ -1041,6 +1043,21 @@ static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len) return NVME_SUCCESS; } +static inline uint16_t nvme_check_zasl(NvmeCtrl *n, size_t len) +{ + uint8_t zasl = n->params.zns.zasl; + + if (!zasl) { + return nvme_check_mdts(n, len); + } + + if (len > n->page_size << zasl) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + return NVME_SUCCESS; +} + static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns, uint64_t slba, uint32_t nlb) { @@ -1410,6 +1427,7 @@ static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len, break; case NVME_CMD_WRITE: + case NVME_CMD_ZONE_APPEND: is_write = true; /* fallthrough */ @@ -1945,12 +1963,40 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1; size_t len = nvme_l2b(ns, nlb); - bool is_write = nvme_req_is_write(req); + bool is_append, is_write = nvme_req_is_write(req); uint16_t status; trace_pci_nvme_rwz(nvme_cid(req), nvme_io_opc_str(rw->opcode), nvme_nsid(ns), nlb, len, slba); + if (req->cmd.opcode == NVME_CMD_ZONE_APPEND) { + uint64_t wp; + is_append = true; + + zone = nvme_ns_get_zone(ns, slba); + if (!zone) { + trace_pci_nvme_err_invalid_zone(nvme_cid(req), slba); + status = NVME_INVALID_FIELD | NVME_DNR; + goto invalid; + } + + wp = zone->wp_staging; + + if (slba != nvme_zslba(zone)) { + trace_pci_nvme_err_invalid_zslba(nvme_cid(req), slba); + return NVME_INVALID_FIELD | NVME_DNR; + } + + status = nvme_check_zasl(n, len); + if (status) { + trace_pci_nvme_err_zasl(nvme_cid(req), len); + goto invalid; + } + + slba = wp; + rw->slba = req->cqe.qw0 = cpu_to_le64(wp); + } + status = nvme_check_bounds(n, ns, slba, nlb); if (status) { NvmeIdNsNvm *id_ns = nvme_ns_id_nvm(ns); @@ -1959,11 +2005,13 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) } if (nvme_ns_zoned(ns)) { - zone = nvme_ns_get_zone(ns, slba); if (!zone) { - trace_pci_nvme_err_invalid_zone(nvme_cid(req), slba); - status = NVME_INVALID_FIELD | NVME_DNR; - goto invalid; + zone = nvme_ns_get_zone(ns, slba); + if (!zone) { + trace_pci_nvme_err_invalid_zone(nvme_cid(req), slba); + status = NVME_INVALID_FIELD | NVME_DNR; + goto invalid; + } } status = nvme_check_zone(n, slba, nlb, req, zone); @@ -1997,7 +2045,7 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) if (is_write) { if (zone) { - if (zone->wp_staging != nvme_wp(zone)) { + if (!is_append && (zone->wp_staging != nvme_wp(zone))) { trace_pci_nvme_err_zone_pending_writes(nvme_cid(req), nvme_zslba(zone), nvme_wp(zone), @@ -2069,6 +2117,7 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) case NVME_CMD_WRITE_ZEROES: case NVME_CMD_WRITE: case NVME_CMD_READ: + case NVME_CMD_ZONE_APPEND: return nvme_rwz(n, req); case NVME_CMD_ZONE_MGMT_SEND: return nvme_zone_mgmt_send(n, req); @@ -3753,6 +3802,11 @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp) return; } + if (params->zns.zasl && params->zns.zasl > params->mdts) { + error_setg(errp, "zns.zasl must be less than or equal to mdts"); + return; + } + if (!n->params.cmb_size_mb && n->pmrdev) { if (host_memory_backend_is_mapped(n->pmrdev)) { error_setg(errp, "can't use already busy memdev: %s", @@ -3929,12 +3983,16 @@ static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) { NvmeIdCtrl *id = &n->id_ctrl; + NvmeIdCtrlZns *id_zns; uint8_t *pci_conf = pci_dev->config; char *subnqn; n->id_ctrl_iocss[NVME_IOCS_NVM] = g_new0(NvmeIdCtrl, 1); n->id_ctrl_iocss[NVME_IOCS_ZONED] = g_new0(NvmeIdCtrl, 1); + id_zns = n->id_ctrl_iocss[NVME_IOCS_ZONED]; + id_zns->zasl = n->params.zns.zasl; + id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID)); id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID)); strpadcpy((char *)id->mn, sizeof(id->mn), "QEMU NVMe Ctrl", ' '); @@ -4071,6 +4129,7 @@ static Property nvme_props[] = { DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64), DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7), DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false), + DEFINE_PROP_UINT8("zns.zasl", NvmeCtrl, params.zns.zasl, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/trace-events b/hw/block/trace-events index d6342f5c555d..929409b79b41 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -103,6 +103,7 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared" # nvme traces for error conditions pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" +pci_nvme_err_zasl(uint16_t cid, size_t len) "cid %"PRIu16" len %zu" pci_nvme_err_req_status(uint16_t cid, uint32_t nsid, uint16_t status, uint8_t opc) "cid %"PRIu16" nsid %"PRIu32" status 0x%"PRIx16" opc 0x%"PRIx8"" pci_nvme_err_dulbe(uint16_t cid, uint64_t slba, uint32_t nlb) "cid %"PRIu16" slba 0x%"PRIx64" nlb %"PRIu32"" pci_nvme_err_addr_read(uint64_t addr) "addr 0x%"PRIx64"" From patchwork Tue Sep 29 23:19:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 304097 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EDE98C4727C for ; Tue, 29 Sep 2020 23:38:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4910520773 for ; Tue, 29 Sep 2020 23:38:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4910520773 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39434 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNPCC-0003zs-3v for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:38:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39190) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuj-0000xc-Jh; Tue, 29 Sep 2020 19:20:07 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:50857) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOue-0000AA-Nj; Tue, 29 Sep 2020 19:20:04 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 09226EAB; Tue, 29 Sep 2020 19:19:37 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=F9KxneCu7sQ8Y q9t+Jyk/wjRwtWvywodl5Tkh60KQ44=; b=lY4gSmPij6ysow3rlo2N6zhwTa7+f rrURSnb7olBmIoR2QqDffZOizcCmo7GQA2xd5sETAvRNmxDhWK8vsoxB8sQBVH0m qNxrjW9Nl+VwpobvaDuREj6nhhpBtIpQBUv650jb7jHCo51Wy/aX9x4Y4kBDeWSB hDTOoZWITUpdJgyEOVvIK0H7gPvKqY71qSsAUS2XVXHfnhmw4neXTGG4Bworj7nm O4PFxfs979vCyy8INVN8Ytl0YQmbu6vyfMQILuUUtfu0+ieuws+1IM9j8exMUA+6 Dbw0HgvfYRXlABNwcJjE5G7ANRmicrYHYl54agpIkiq+ggchD6wwoV9aQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=F9KxneCu7sQ8Yq9t+Jyk/wjRwtWvywodl5Tkh60KQ44=; b=cl/S0mWp 3nBQHv3d7k0N2SqB63UE7ZJCb5HwMoRndyIfUMR1r0cB+MwusesRA1e69IrimJH3 sJQL45rjmiRTYbcsyjsuuxbrXtEVD/sBkmgWGmTCHA3h/+ZgQprt5h48BCJrLT5y X1wJ2X8OcsbC60h4UYZi6mx56A8AJcMseuVw5kmbdmbbPh4Aze2F7pSOILt20W2S eExgz/5uNFq8SHiFiPXinLKliSDCKJRnUiigvPzvG44Mt5YpQj5TRKwNdXXVCpre Z/kJ9hhOX8K8bxINFP77LIE5vAlQSV+0hIHcPTIFesoXOFZAAZsvceYS0+8JbKCq j40cbb8HRbFtIA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepjeefjeejjeffleefuefhffdtvdfggedujedvvdfftddvhfegjeekheeuhfejjeeu necuffhomhgrihhnpehrvghsohhurhgtvghsrdgrtghtihhvvgdprhgvshhouhhrtggvsh drohhpvghnnecukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihii vgeptdenucfrrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrd gukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 44A2A328005A; Tue, 29 Sep 2020 19:19:36 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 13/14] hw/block/nvme: track and enforce zone resources Date: Wed, 30 Sep 2020 01:19:16 +0200 Message-Id: <20200929231917.433586-14-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Track number of open/active resources. Signed-off-by: Klaus Jensen --- docs/specs/nvme.txt | 7 +++++ hw/block/nvme-ns.h | 7 +++++ include/block/nvme.h | 2 ++ hw/block/nvme-ns.c | 25 +++++++++++++++-- hw/block/nvme.c | 67 +++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 104 insertions(+), 4 deletions(-) diff --git a/docs/specs/nvme.txt b/docs/specs/nvme.txt index b23e59dd3075..e3810843cd2d 100644 --- a/docs/specs/nvme.txt +++ b/docs/specs/nvme.txt @@ -42,6 +42,13 @@ nvme-ns Options zns.zcap; if the zone capacity is a power of two, the zone size will be set to that, otherwise it will default to the next power of two. + `zns.mar`; Specifies the number of active resources available. This is a 0s + based value. + + `zns.mor`; Specifies the number of open resources available. This is a 0s + based value. + + Reference Specifications ------------------------ diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 82cb0b0bce82..ff34cd37af7d 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -52,6 +52,8 @@ typedef struct NvmeNamespaceParams { uint64_t zcap; uint64_t zsze; uint8_t zdes; + uint32_t mar; + uint32_t mor; } zns; } NvmeNamespaceParams; @@ -95,6 +97,11 @@ typedef struct NvmeNamespace { NvmeZone *zones; NvmeZoneDescriptor *zd; uint8_t *zde; + + struct { + uint32_t open; + uint32_t active; + } resources; } zns; } NvmeNamespace; diff --git a/include/block/nvme.h b/include/block/nvme.h index 5f8914f594f4..d51f397e7ff1 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -776,6 +776,8 @@ enum NvmeStatusCodes { NVME_ZONE_IS_READ_ONLY = 0x01ba, NVME_ZONE_IS_OFFLINE = 0x01bb, NVME_ZONE_INVALID_WRITE = 0x01bc, + NVME_TOO_MANY_ACTIVE_ZONES = 0x01bd, + NVME_TOO_MANY_OPEN_ZONES = 0x01be, NVME_INVALID_ZONE_STATE_TRANSITION = 0x01bf, NVME_WRITE_FAULT = 0x0280, NVME_UNRECOVERED_READ = 0x0281, diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index bc49c7f2674f..9584fbb3f62d 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -119,8 +119,8 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns) ns->zns.zde = g_malloc0_n(ns->zns.num_zones, nvme_ns_zdes_bytes(ns)); } - id_ns_zns->mar = 0xffffffff; - id_ns_zns->mor = 0xffffffff; + id_ns_zns->mar = cpu_to_le32(ns->params.zns.mar); + id_ns_zns->mor = cpu_to_le32(ns->params.zns.mor); } static void nvme_ns_init(NvmeNamespace *ns) @@ -220,6 +220,11 @@ static int nvme_ns_pstate_init(NvmeNamespace *ns, Error **errp) void nvme_ns_zns_init_zone_state(NvmeNamespace *ns) { + ns->zns.resources.active = ns->params.zns.mar != 0xffffffff ? + ns->params.zns.mar + 1 : ns->zns.num_zones; + ns->zns.resources.open = ns->params.zns.mor != 0xffffffff ? + ns->params.zns.mor + 1 : ns->zns.num_zones; + for (int i = 0; i < ns->zns.num_zones; i++) { NvmeZone *zone = &ns->zns.zones[i]; zone->zd = &ns->zns.zd[i]; @@ -238,9 +243,15 @@ void nvme_ns_zns_init_zone_state(NvmeNamespace *ns) if (nvme_wp(zone) == nvme_zslba(zone) && !(zone->zd->za & NVME_ZA_ZDEV)) { nvme_zs_set(zone, NVME_ZS_ZSE); + continue; } - continue; + if (ns->zns.resources.active) { + ns->zns.resources.active--; + continue; + } + + /* fallthrough */ case NVME_ZS_ZSIO: case NVME_ZS_ZSEO: @@ -462,6 +473,12 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp) return -1; } + if (ns->params.zns.mor > ns->params.zns.mar) { + error_setg(errp, "maximum open resources (zns.mor) must be less " + "than or equal to maximum active resources (zns.mar)"); + return -1; + } + break; default: @@ -547,6 +564,8 @@ static Property nvme_ns_props[] = { DEFINE_PROP_UINT64("zns.zcap", NvmeNamespace, params.zns.zcap, 0), DEFINE_PROP_UINT64("zns.zsze", NvmeNamespace, params.zns.zsze, 0), DEFINE_PROP_UINT8("zns.zdes", NvmeNamespace, params.zns.zdes, 0), + DEFINE_PROP_UINT32("zns.mar", NvmeNamespace, params.zns.mar, 0xffffffff), + DEFINE_PROP_UINT32("zns.mor", NvmeNamespace, params.zns.mor, 0xffffffff), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme.c b/hw/block/nvme.c index a891ee284d1d..fc5b119e3f35 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1221,6 +1221,40 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, switch (from) { case NVME_ZS_ZSE: + switch (to) { + case NVME_ZS_ZSF: + case NVME_ZS_ZSRO: + case NVME_ZS_ZSO: + break; + + case NVME_ZS_ZSC: + if (!ns->zns.resources.active) { + return NVME_TOO_MANY_ACTIVE_ZONES; + } + + ns->zns.resources.active--; + + break; + + case NVME_ZS_ZSIO: + case NVME_ZS_ZSEO: + if (!ns->zns.resources.active) { + return NVME_TOO_MANY_ACTIVE_ZONES; + } + + if (!ns->zns.resources.open) { + return NVME_TOO_MANY_OPEN_ZONES; + } + + ns->zns.resources.active--; + ns->zns.resources.open--; + + break; + + default: + return NVME_INVALID_ZONE_STATE_TRANSITION | NVME_DNR; + } + break; case NVME_ZS_ZSIO: @@ -1241,7 +1275,13 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSF: case NVME_ZS_ZSRO: + ns->zns.resources.active++; + + /* fallthrough */ + case NVME_ZS_ZSC: + ns->zns.resources.open++; + break; default: @@ -1264,8 +1304,18 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSF: case NVME_ZS_ZSRO: + ns->zns.resources.active++; + + break; + case NVME_ZS_ZSIO: case NVME_ZS_ZSEO: + if (!ns->zns.resources.open) { + return NVME_TOO_MANY_OPEN_ZONES; + } + + ns->zns.resources.open--; + break; default: @@ -1669,6 +1719,7 @@ static uint16_t nvme_zone_mgmt_send_all(NvmeCtrl *n, NvmeRequest *req) NvmeZoneManagementSendCmd *send = (NvmeZoneManagementSendCmd *) &req->cmd; NvmeNamespace *ns = req->ns; NvmeZone *zone; + int count; uint16_t status = NVME_SUCCESS; @@ -1718,6 +1769,20 @@ static uint16_t nvme_zone_mgmt_send_all(NvmeCtrl *n, NvmeRequest *req) break; case NVME_CMD_ZONE_MGMT_SEND_OPEN: + count = 0; + + for (int i = 0; i < ns->zns.num_zones; i++) { + zone = &ns->zns.zones[i]; + + if (nvme_zs(zone) == NVME_ZS_ZSC) { + count++; + } + } + + if (count > ns->zns.resources.open) { + return NVME_TOO_MANY_OPEN_ZONES; + } + for (int i = 0; i < ns->zns.num_zones; i++) { zone = &ns->zns.zones[i]; @@ -3287,7 +3352,7 @@ static void nvme_clear_ctrl(NvmeCtrl *n) switch (nvme_zs(zone)) { case NVME_ZS_ZSIO: case NVME_ZS_ZSEO: - nvme_zs_set(zone, NVME_ZS_ZSC); + nvme_zrm_transition(ns, zone, NVME_ZS_ZSC); nvme_zns_commit_zone(ns, zone); break; default: From patchwork Tue Sep 29 23:19:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Klaus Jensen X-Patchwork-Id: 272420 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8EB66C4727C for ; Tue, 29 Sep 2020 23:53:25 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CB0CF20691 for ; Tue, 29 Sep 2020 23:53:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CB0CF20691 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=irrelevant.dk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55054 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kNPQx-0003OU-6Q for qemu-devel@archiver.kernel.org; Tue, 29 Sep 2020 19:53:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39188) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOui-0000x8-Q7; Tue, 29 Sep 2020 19:20:04 -0400 Received: from wnew2-smtp.messagingengine.com ([64.147.123.27]:53315) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kNOuf-0000AM-Ad; Tue, 29 Sep 2020 19:20:04 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 5EF0FF73; Tue, 29 Sep 2020 19:19:39 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 29 Sep 2020 19:19:39 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=irrelevant.dk; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; s=fm1; bh=woncMQv06VujK Kpu9FxP0yRxOEPbqeAUy6ab5yIYLmM=; b=hDjMJsU9FTge/w2iQ73w7Ubx7rOU6 N7sNmviTq9pOhs5UtDgiLzCS9NkCHpuU5DdoAdPH0VxzOKjVCHRIYPhLPdEol6sH PSxLuHKsDYAE2vSb3pmSXIKNhu7nlhwqpwEVLeAMhhFQtY2DoHgUX2U/36Pl00ch 2kmORrc7zJ8az7gtULYaO/mjMjsq+QRqG89aNshM2WoiCmcNTZap+N3TTIyzo+pT dC61uNhQEOcSBLCgqVrwgpGlvH7axv/eRsfLVCdPPS/bewv2vbWCPkCiVTO3riNq XRi9CplsM5N3pPevE0xbQuGZ2bxiQrZOD4SaQHKlxMoW8+2P+MSeU81Yg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:date:from :in-reply-to:message-id:mime-version:references:subject:to :x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; bh=woncMQv06VujKKpu9FxP0yRxOEPbqeAUy6ab5yIYLmM=; b=bzCsRsM3 elo0Q0PPqcBt/fe2COyllC+5Bb/IciEhJovyqjPh2ViVSB3G+tY2Mb3TWcha7Na3 OgRS4oe0oq1mNmcYHxhpkqiQYxpSqpYjyJ6nP7oC3iOZ2hpnzdV4BRPsT0qTzzQU oZH3/fhwnu0Zs+tbwOuhbc7KUGZUnC4lSC6bOM/H2NTdd01ReJ67aBrXgKVIMqX/ ihAXzAps+Bo+WVxfCochXTxO5m2OoDjXfZQMrKtvU/yM+kyAJ6nMQJU2mDVT6lzr oQQ4o0klX+Y6ODFV3s8hixn7tVNyPG7PL510nEd1+73Rgm31r6P7D1kyFO1zd2cW C3+QtKlg0BqiPA== X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedujedrfedtgddujecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpefhvffufffkofgjfhgggfestdekredtredttdenucfhrhhomhepmfhlrghushcu lfgvnhhsvghnuceoihhtshesihhrrhgvlhgvvhgrnhhtrdgukheqnecuggftrfgrthhtvg hrnhepvdffffdugffhfeffvddvvddtvefhffffgfeuffekgffgkeejgfehudehteevgfeh necuffhomhgrihhnpehrvghsohhurhgtvghsrdhophgvnhdprhgvshhouhhrtggvshdrrg gtthhivhgvnecukfhppeektddrudeijedrleekrdduledtnecuvehluhhsthgvrhfuihii vgeptdenucfrrghrrghmpehmrghilhhfrhhomhepihhtshesihhrrhgvlhgvvhgrnhhtrd gukh X-ME-Proxy: Received: from apples.local (80-167-98-190-cable.dk.customer.tdc.net [80.167.98.190]) by mail.messagingengine.com (Postfix) with ESMTPA id 8A0213280060; Tue, 29 Sep 2020 19:19:37 -0400 (EDT) From: Klaus Jensen To: qemu-devel@nongnu.org Subject: [PATCH v2 14/14] hw/block/nvme: allow open to close transitions by controller Date: Wed, 30 Sep 2020 01:19:17 +0200 Message-Id: <20200929231917.433586-15-its@irrelevant.dk> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929231917.433586-1-its@irrelevant.dk> References: <20200929231917.433586-1-its@irrelevant.dk> MIME-Version: 1.0 Received-SPF: pass client-ip=64.147.123.27; envelope-from=its@irrelevant.dk; helo=wnew2-smtp.messagingengine.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/29 17:46:07 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Klaus Jensen , Max Reitz , Keith Busch , Klaus Jensen Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Klaus Jensen Allow the controller to release open resources by transitioning implicitly and explicitly opened zones to closed. This is done using a naive "least recently opened" strategy. Signed-off-by: Klaus Jensen --- hw/block/nvme-ns.h | 5 ++ hw/block/nvme-ns.c | 5 ++ hw/block/nvme.c | 105 +++++++++++++++++++++++++++++++++++------- hw/block/trace-events | 5 ++ 4 files changed, 103 insertions(+), 17 deletions(-) diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index ff34cd37af7d..491a77f3ae2f 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -62,6 +62,8 @@ typedef struct NvmeZone { uint8_t *zde; uint64_t wp_staging; + + QTAILQ_ENTRY(NvmeZone) lru_entry; } NvmeZone; typedef struct NvmeNamespace { @@ -101,6 +103,9 @@ typedef struct NvmeNamespace { struct { uint32_t open; uint32_t active; + + QTAILQ_HEAD(, NvmeZone) lru_open; + QTAILQ_HEAD(, NvmeZone) lru_active; } resources; } zns; } NvmeNamespace; diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 9584fbb3f62d..26c9f846417a 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -225,6 +225,9 @@ void nvme_ns_zns_init_zone_state(NvmeNamespace *ns) ns->zns.resources.open = ns->params.zns.mor != 0xffffffff ? ns->params.zns.mor + 1 : ns->zns.num_zones; + QTAILQ_INIT(&ns->zns.resources.lru_open); + QTAILQ_INIT(&ns->zns.resources.lru_active); + for (int i = 0; i < ns->zns.num_zones; i++) { NvmeZone *zone = &ns->zns.zones[i]; zone->zd = &ns->zns.zd[i]; @@ -248,6 +251,8 @@ void nvme_ns_zns_init_zone_state(NvmeNamespace *ns) if (ns->zns.resources.active) { ns->zns.resources.active--; + QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_active, zone, + lru_entry); continue; } diff --git a/hw/block/nvme.c b/hw/block/nvme.c index fc5b119e3f35..34093f33ad1a 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1209,12 +1209,61 @@ static inline void nvme_zone_reset_wp(NvmeZone *zone) zone->wp_staging = nvme_zslba(zone); } -static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, - NvmeZoneState to) +static uint16_t nvme_zrm_transition(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, NvmeZoneState to, + NvmeRequest *req); + +static uint16_t nvme_zrm_release_open(NvmeCtrl *n, NvmeNamespace *ns, + NvmeRequest *req) +{ + NvmeZone *candidate; + NvmeZoneState zs; + uint16_t status; + + trace_pci_nvme_zone_zrm_release_open(nvme_cid(req), ns->params.nsid); + + QTAILQ_FOREACH(candidate, &ns->zns.resources.lru_open, lru_entry) { + zs = nvme_zs(candidate); + + trace_pci_nvme_zone_zrm_candidate(nvme_cid(req), ns->params.nsid, + nvme_zslba(candidate), + nvme_wp(candidate), zs); + + /* skip explicitly opened zones */ + if (zs == NVME_ZS_ZSEO) { + continue; + } + + /* the zone cannot be closed if it is currently writing */ + if (candidate->wp_staging != nvme_wp(candidate)) { + continue; + } + + status = nvme_zrm_transition(n, ns, candidate, NVME_ZS_ZSC, req); + if (status) { + return status; + } + + if (nvme_zns_commit_zone(ns, candidate) < 0) { + return NVME_INTERNAL_DEV_ERROR; + } + + return NVME_SUCCESS; + } + + return NVME_TOO_MANY_OPEN_ZONES; +} + +static uint16_t nvme_zrm_transition(NvmeCtrl *n, NvmeNamespace *ns, + NvmeZone *zone, NvmeZoneState to, + NvmeRequest *req) { NvmeZoneState from = nvme_zs(zone); + uint16_t status; + + trace_pci_nvme_zone_zrm_transition(nvme_cid(req), ns->params.nsid, + nvme_zslba(zone), nvme_zs(zone), to); - /* fast path */ if (from == to) { return NVME_SUCCESS; } @@ -1229,25 +1278,32 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSC: if (!ns->zns.resources.active) { + trace_pci_nvme_err_too_many_active_zones(nvme_cid(req)); return NVME_TOO_MANY_ACTIVE_ZONES; } ns->zns.resources.active--; + QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_active, zone, lru_entry); break; case NVME_ZS_ZSIO: case NVME_ZS_ZSEO: if (!ns->zns.resources.active) { + trace_pci_nvme_err_too_many_active_zones(nvme_cid(req)); return NVME_TOO_MANY_ACTIVE_ZONES; } if (!ns->zns.resources.open) { - return NVME_TOO_MANY_OPEN_ZONES; + status = nvme_zrm_release_open(n, ns, req); + if (status) { + return status; + } } ns->zns.resources.active--; ns->zns.resources.open--; + QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_open, zone, lru_entry); break; @@ -1276,11 +1332,15 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSF: case NVME_ZS_ZSRO: ns->zns.resources.active++; + ns->zns.resources.open++; + QTAILQ_REMOVE(&ns->zns.resources.lru_open, zone, lru_entry); - /* fallthrough */ + break; case NVME_ZS_ZSC: ns->zns.resources.open++; + QTAILQ_REMOVE(&ns->zns.resources.lru_open, zone, lru_entry); + QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_active, zone, lru_entry); break; @@ -1305,16 +1365,22 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSF: case NVME_ZS_ZSRO: ns->zns.resources.active++; + QTAILQ_REMOVE(&ns->zns.resources.lru_active, zone, lru_entry); break; case NVME_ZS_ZSIO: case NVME_ZS_ZSEO: if (!ns->zns.resources.open) { - return NVME_TOO_MANY_OPEN_ZONES; + status = nvme_zrm_release_open(n, ns, req); + if (status) { + return status; + } } ns->zns.resources.open--; + QTAILQ_REMOVE(&ns->zns.resources.lru_active, zone, lru_entry); + QTAILQ_INSERT_TAIL(&ns->zns.resources.lru_open, zone, lru_entry); break; @@ -1338,6 +1404,9 @@ static uint16_t nvme_zrm_transition(NvmeNamespace *ns, NvmeZone *zone, case NVME_ZS_ZSF: switch (to) { + case NVME_ZS_ZSF: + return NVME_SUCCESS; + case NVME_ZS_ZSE: nvme_zone_reset_wp(zone); @@ -1376,7 +1445,9 @@ static void nvme_zns_advance_wp(NvmeRequest *req) wp += nlb; zone->zd->wp = cpu_to_le64(wp); if (wp == nvme_zslba(zone) + nvme_zcap(zone)) { - nvme_zrm_transition(req->ns, zone, NVME_ZS_ZSF); + NvmeCtrl *n = nvme_ctrl(req); + + nvme_zrm_transition(n, req->ns, zone, NVME_ZS_ZSF, req); if (nvme_zns_commit_zone(req->ns, zone) < 0) { req->status = NVME_INTERNAL_DEV_ERROR; } @@ -1433,6 +1504,7 @@ static void nvme_rw_cb(void *opaque, int ret) uint64_t slba = le64_to_cpu(rw->slba); NvmeZone *zone = nvme_ns_get_zone(ns, slba); + NvmeCtrl *n = nvme_ctrl(req); /* * Transition the zone to read-only on write fault and offline @@ -1441,7 +1513,7 @@ static void nvme_rw_cb(void *opaque, int ret) NvmeZoneState zs = status == NVME_WRITE_FAULT ? NVME_ZS_ZSRO : NVME_ZS_ZSO; - nvme_zrm_transition(ns, zone, zs); + nvme_zrm_transition(n, ns, zone, zs, req); if (nvme_zns_commit_zone(ns, zone) < 0) { req->status = NVME_INTERNAL_DEV_ERROR; } @@ -1535,7 +1607,7 @@ static uint16_t nvme_zone_mgmt_send_close(NvmeCtrl *n, NvmeRequest *req, break; } - status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSC); + status = nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSC, req); if (status) { return status; } @@ -1560,7 +1632,7 @@ static uint16_t nvme_zone_mgmt_send_finish(NvmeCtrl *n, NvmeRequest *req, return NVME_SUCCESS; } - status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSF); + status = nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSF, req); if (status) { return status; } @@ -1585,7 +1657,7 @@ static uint16_t nvme_zone_mgmt_send_open(NvmeCtrl *n, NvmeRequest *req, return NVME_SUCCESS; } - status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSEO); + status = nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSEO, req); if (status) { return status; } @@ -1621,7 +1693,7 @@ static uint16_t nvme_zone_mgmt_send_reset(NvmeCtrl *n, NvmeRequest *req, return NVME_INTERNAL_DEV_ERROR; } - nvme_zrm_transition(ns, zone, NVME_ZS_ZSE); + nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSE, req); if (nvme_zns_commit_zone(ns, zone) < 0) { return NVME_INTERNAL_DEV_ERROR; } @@ -1652,7 +1724,7 @@ static uint16_t nvme_zone_mgmt_send_offline(NvmeCtrl *n, NvmeRequest *req, return NVME_INTERNAL_DEV_ERROR; } - nvme_zrm_transition(ns, zone, NVME_ZS_ZSO); + nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSO, req); if (nvme_zns_commit_zone(ns, zone) < 0) { return NVME_INTERNAL_DEV_ERROR; } @@ -1696,7 +1768,7 @@ static uint16_t nvme_zone_mgmt_send_set_zde(NvmeCtrl *n, NvmeRequest *req, return status; } - status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSC); + status = nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSC, req); if (status) { return status; } @@ -2122,7 +2194,7 @@ static uint16_t nvme_rwz(NvmeCtrl *n, NvmeRequest *req) case NVME_ZS_ZSEO: break; default: - status = nvme_zrm_transition(ns, zone, NVME_ZS_ZSIO); + status = nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSIO, req); if (status) { goto invalid; } @@ -3348,11 +3420,10 @@ static void nvme_clear_ctrl(NvmeCtrl *n) if (nvme_ns_zoned(ns)) { for (int i = 0; i < ns->zns.num_zones; i++) { NvmeZone *zone = &ns->zns.zones[i]; - switch (nvme_zs(zone)) { case NVME_ZS_ZSIO: case NVME_ZS_ZSEO: - nvme_zrm_transition(ns, zone, NVME_ZS_ZSC); + nvme_zrm_transition(n, ns, zone, NVME_ZS_ZSC, NULL); nvme_zns_commit_zone(ns, zone); break; default: diff --git a/hw/block/trace-events b/hw/block/trace-events index 929409b79b41..18f7b24ef5e9 100644 --- a/hw/block/trace-events +++ b/hw/block/trace-events @@ -88,6 +88,9 @@ pci_nvme_mmio_read(uint64_t addr) "addr 0x%"PRIx64"" pci_nvme_mmio_write(uint64_t addr, uint64_t data) "addr 0x%"PRIx64" data 0x%"PRIx64"" pci_nvme_mmio_doorbell_cq(uint16_t cqid, uint16_t new_head) "cqid %"PRIu16" new_head %"PRIu16"" pci_nvme_mmio_doorbell_sq(uint16_t sqid, uint16_t new_tail) "sqid %"PRIu16" new_tail %"PRIu16"" +pci_nvme_zone_zrm_transition(uint16_t cid, uint32_t nsid, uint64_t zslba, uint8_t from, uint8_t to) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" from 0x%"PRIx8" to 0x%"PRIx8"" +pci_nvme_zone_zrm_candidate(uint16_t cid, uint32_t nsid, uint64_t zslba, uint64_t wp, uint8_t zc) "cid %"PRIu16" nsid %"PRIu32" zslba 0x%"PRIx64" wp 0x%"PRIx64" zc 0x%"PRIx8"" +pci_nvme_zone_zrm_release_open(uint16_t cid, uint32_t nsid) "cid %"PRIu16" nsid %"PRIu32"" pci_nvme_mmio_intm_set(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask set, data=0x%"PRIx64", new_mask=0x%"PRIx64"" pci_nvme_mmio_intm_clr(uint64_t data, uint64_t new_mask) "wrote MMIO, interrupt mask clr, data=0x%"PRIx64", new_mask=0x%"PRIx64"" pci_nvme_mmio_cfg(uint64_t data) "wrote MMIO, config controller config=0x%"PRIx64"" @@ -115,6 +118,8 @@ pci_nvme_err_zone_is_read_only(uint16_t cid, uint64_t slba) "cid %"PRIu16" lba 0 pci_nvme_err_zone_invalid_write(uint16_t cid, uint64_t slba, uint64_t wp) "cid %"PRIu16" lba 0x%"PRIx64" wp 0x%"PRIx64"" pci_nvme_err_zone_boundary(uint16_t cid, uint64_t slba, uint32_t nlb, uint64_t zcap) "cid %"PRIu16" lba 0x%"PRIx64" nlb %"PRIu32" zcap 0x%"PRIx64"" pci_nvme_err_zone_pending_writes(uint16_t cid, uint64_t zslba, uint64_t wp, uint64_t wp_staging) "cid %"PRIu16" zslba 0x%"PRIx64" wp 0x%"PRIx64" wp_staging 0x%"PRIx64"" +pci_nvme_err_too_many_active_zones(uint16_t cid) "cid %"PRIu16"" +pci_nvme_err_too_many_open_zones(uint16_t cid) "cid %"PRIu16"" pci_nvme_err_invalid_sgld(uint16_t cid, uint8_t typ) "cid %"PRIu16" type 0x%"PRIx8"" pci_nvme_err_invalid_num_sgld(uint16_t cid, uint8_t typ) "cid %"PRIu16" type 0x%"PRIx8"" pci_nvme_err_invalid_sgl_excess_length(uint16_t cid) "cid %"PRIu16""