From patchwork Mon Sep 28 02:35:15 2020
X-Patchwork-Submitter: Dmitry Fomichev
X-Patchwork-Id: 272649
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, Dmitry Fomichev, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 01/14] hw/block/nvme: Report actual LBA data shift in LBAF
Date: Mon, 28 Sep 2020 11:35:15 +0900
Message-Id: <20200928023528.15260-2-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

Calculate the data shift value to report based on the configured value
of the logical_block_size device property. In the process, use a local
variable to calculate the LBA format index instead of the hardcoded
value 0. This makes the code more readable and will make it easier to
add support for multiple LBA formats in the future.
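Note (illustration, not part of the patch): for a power-of-two
logical_block_size, 31 - clz32(size) is just log2(size), so a 512-byte
block yields ds = 9 and a 4096-byte block yields ds = 12. A minimal
standalone sketch, with clz32() standing in for QEMU's helper from
qemu/host-utils.h:

    #include <stdint.h>
    #include <stdio.h>

    /* stand-in for clz32() from qemu/host-utils.h */
    static unsigned clz32(uint32_t val)
    {
        return val ? __builtin_clz(val) : 32;
    }

    int main(void)
    {
        uint32_t sizes[] = { 512, 4096 };

        for (int i = 0; i < 2; i++) {
            /* ds is the LBA data shift reported in lbaf[].ds */
            printf("logical_block_size %u -> ds %u\n",
                   sizes[i], 31 - clz32(sizes[i]));
        }
        return 0;
    }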
Signed-off-by: Dmitry Fomichev
Reviewed-by: Klaus Jensen
---
 hw/block/nvme-ns.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 2ba0263dda..bbd7879492 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -47,6 +47,8 @@ static void nvme_ns_init(NvmeNamespace *ns)
 
 static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
 {
+    int lba_index;
+
     if (!blkconf_blocksizes(&ns->blkconf, errp)) {
         return -1;
     }
@@ -67,6 +69,9 @@ static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
         n->features.vwc = 0x1;
     }
 
+    lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
+    ns->id_ns.lbaf[lba_index].ds = 31 - clz32(n->conf.logical_block_size);
+
     return 0;
 }
From patchwork Mon Sep 28 02:35:16 2020
X-Patchwork-Submitter: Dmitry Fomichev
X-Patchwork-Id: 272648
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, Dmitry Fomichev, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 02/14] hw/block/nvme: Add Commands Supported and Effects log
Date: Mon, 28 Sep 2020 11:35:16 +0900
Message-Id: <20200928023528.15260-3-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

Implementing this log page is necessary to allow checking for Zone
Append command support in the Zoned Namespace Command Set. This commit
adds the code to report this log page for the NVM Command Set only.
The parts that are specific to zoned operation will be added later in
the series.
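Note (illustration, not part of the patch): the Commands Supported and
Effects log page is a fixed 4 KiB structure indexed by opcode: 256
32-bit Admin Commands Supported entries, then 256 32-bit I/O Commands
Supported entries, then 2 KiB reserved. A sketch of how a reader of the
page might test the bits defined by this patch (struct and flag names
mirror the NvmeEffectsLog definitions added below):

    #include <stdbool.h>
    #include <stdint.h>

    /* mirrors the NvmeEffectsLog layout below: 1024 + 1024 + 2048 bytes */
    struct effects_log {
        uint32_t acs[256];   /* Admin Commands Supported, indexed by opcode */
        uint32_t iocs[256];  /* I/O Commands Supported, indexed by opcode */
        uint8_t  resv[2048];
    };

    #define EFFECTS_CSUPP (1u << 0)  /* command supported */
    #define EFFECTS_LBCC  (1u << 1)  /* command may change LB content */

    static bool io_cmd_supported(const struct effects_log *log, uint8_t opc)
    {
        return log->iocs[opc] & EFFECTS_CSUPP;
    }

    static bool io_cmd_modifies_data(const struct effects_log *log, uint8_t opc)
    {
        return log->iocs[opc] & EFFECTS_LBCC;
    }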
Signed-off-by: Dmitry Fomichev
---
 hw/block/nvme.c       | 41 ++++++++++++++++++++++++++++++++++++++++-
 hw/block/trace-events |  2 ++
 include/block/nvme.h  | 19 +++++++++++++++++++
 3 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index da8344f196..1ddc7e52cc 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1301,6 +1301,43 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len,
+                                 uint64_t off, NvmeRequest *req)
+{
+    NvmeEffectsLog cmd_eff_log = {};
+    uint32_t *iocs = cmd_eff_log.iocs;
+    uint32_t *acs = cmd_eff_log.acs;
+    uint32_t trans_len;
+
+    trace_pci_nvme_cmd_supp_and_effects_log_read();
+
+    if (off >= sizeof(cmd_eff_log)) {
+        trace_pci_nvme_err_invalid_effects_log_offset(off);
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    acs[NVME_ADM_CMD_DELETE_SQ] = NVME_CMD_EFFECTS_CSUPP;
+    acs[NVME_ADM_CMD_CREATE_SQ] = NVME_CMD_EFFECTS_CSUPP;
+    acs[NVME_ADM_CMD_DELETE_CQ] = NVME_CMD_EFFECTS_CSUPP;
+    acs[NVME_ADM_CMD_CREATE_CQ] = NVME_CMD_EFFECTS_CSUPP;
+    acs[NVME_ADM_CMD_IDENTIFY] = NVME_CMD_EFFECTS_CSUPP;
+    acs[NVME_ADM_CMD_SET_FEATURES] = NVME_CMD_EFFECTS_CSUPP;
+    acs[NVME_ADM_CMD_GET_FEATURES] = NVME_CMD_EFFECTS_CSUPP;
+    acs[NVME_ADM_CMD_GET_LOG_PAGE] = NVME_CMD_EFFECTS_CSUPP;
+    acs[NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFFECTS_CSUPP;
+
+    iocs[NVME_CMD_FLUSH] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC;
+    iocs[NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFFECTS_CSUPP |
+                                  NVME_CMD_EFFECTS_LBCC;
+    iocs[NVME_CMD_WRITE] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC;
+    iocs[NVME_CMD_READ] = NVME_CMD_EFFECTS_CSUPP;
+
+    trans_len = MIN(sizeof(cmd_eff_log) - off, buf_len);
+
+    return nvme_dma(n, ((uint8_t *)&cmd_eff_log) + off, trans_len,
+                    DMA_DIRECTION_FROM_DEVICE, req);
+}
+
 static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeCmd *cmd = &req->cmd;
@@ -1344,6 +1381,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
         return nvme_smart_info(n, rae, len, off, req);
     case NVME_LOG_FW_SLOT_INFO:
         return nvme_fw_log_info(n, len, off, req);
+    case NVME_LOG_CMD_EFFECTS:
+        return nvme_cmd_effects(n, len, off, req);
     default:
         trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -2743,7 +2782,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     id->acl = 3;
     id->aerl = n->params.aerl;
     id->frmw = (NVME_NUM_FW_SLOTS << 1) | NVME_FRMW_SLOT1_RO;
-    id->lpa = NVME_LPA_EXTENDED;
+    id->lpa = NVME_LPA_CSE | NVME_LPA_EXTENDED;
 
     /* recommended default value (~70 C) */
     id->wctemp = cpu_to_le16(NVME_TEMPERATURE_WARNING);

diff --git a/hw/block/trace-events b/hw/block/trace-events
index bbe6f27367..2929a8df11 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -86,6 +86,7 @@ pci_nvme_mmio_start_success(void) "setting controller enable bit succeeded"
 pci_nvme_mmio_stopped(void) "cleared controller enable bit"
 pci_nvme_mmio_shutdown_set(void) "shutdown bit set"
 pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
+pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects log read"
 
 # nvme traces for error conditions
 pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu"
@@ -104,6 +105,7 @@ pci_nvme_err_invalid_prp(void) "invalid PRP"
 pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
 pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
 pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
+pci_nvme_err_invalid_effects_log_offset(uint64_t ofs) "commands supported and effects log offset must be 0, got %"PRIu64""
 pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16""
 pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16""
 pci_nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16""

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 58647bcdad..a738c8f9ba 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -734,10 +734,27 @@ enum NvmeSmartWarn {
     NVME_SMART_FAILED_VOLATILE_MEDIA  = 1 << 4,
 };
 
+typedef struct NvmeEffectsLog {
+    uint32_t acs[256];
+    uint32_t iocs[256];
+    uint8_t  resv[2048];
+} NvmeEffectsLog;
+
+enum {
+    NVME_CMD_EFFECTS_CSUPP    = 1 << 0,
+    NVME_CMD_EFFECTS_LBCC     = 1 << 1,
+    NVME_CMD_EFFECTS_NCC      = 1 << 2,
+    NVME_CMD_EFFECTS_NIC      = 1 << 3,
+    NVME_CMD_EFFECTS_CCC      = 1 << 4,
+    NVME_CMD_EFFECTS_CSE_MASK = 3 << 16,
+    NVME_CMD_EFFECTS_UUID_SEL = 1 << 19,
+};
+
 enum NvmeLogIdentifier {
     NVME_LOG_ERROR_INFO     = 0x01,
     NVME_LOG_SMART_INFO     = 0x02,
     NVME_LOG_FW_SLOT_INFO   = 0x03,
+    NVME_LOG_CMD_EFFECTS    = 0x05,
 };
 
 typedef struct QEMU_PACKED NvmePSD {
@@ -849,6 +866,7 @@ enum NvmeIdCtrlFrmw {
 };
 
 enum NvmeIdCtrlLpa {
+    NVME_LPA_CSE      = 1 << 1,
     NVME_LPA_EXTENDED = 1 << 2,
 };
 
@@ -1048,6 +1066,7 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmeErrorLog) != 64);
     QEMU_BUILD_BUG_ON(sizeof(NvmeFwSlotInfoLog) != 512);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16);
From patchwork Mon Sep 28 02:35:17 2020
X-Patchwork-Submitter: Dmitry Fomichev
X-Patchwork-Id: 304324
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, Dmitry Fomichev, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 03/14] hw/block/nvme: Introduce the Namespace Types definitions
Date: Mon, 28 Sep 2020 11:35:17 +0900
Message-Id: <20200928023528.15260-4-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

From: Niklas Cassel

Define the structures and constants required to implement Namespace
Types support.
Signed-off-by: Niklas Cassel
Signed-off-by: Dmitry Fomichev
---
 hw/block/nvme-ns.h   |  2 ++
 hw/block/nvme.c      |  2 +-
 include/block/nvme.h | 74 +++++++++++++++++++++++++++++++++++---------
 3 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 83734f4606..cca23bc0b3 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -21,6 +21,8 @@
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
+    uint8_t  csi;
+    QemuUUID uuid;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1ddc7e52cc..29fa005fa2 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1598,7 +1598,7 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
      * here.
      */
     ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
-    ns_descrs->uuid.hdr.nidl = NVME_NIDT_UUID_LEN;
+    ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;
     stl_be_p(&ns_descrs->uuid.v, nsid);
 
     return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
diff --git a/include/block/nvme.h b/include/block/nvme.h
index a738c8f9ba..4587311783 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -51,6 +51,11 @@ enum NvmeCapMask {
     CAP_PMR_MASK       = 0x1,
 };
 
+enum NvmeCapCssBits {
+    CAP_CSS_NVM      = 0x01,
+    CAP_CSS_CSI_SUPP = 0x40,
+};
+
 #define NVME_CAP_MQES(cap)  (((cap) >> CAP_MQES_SHIFT)   & CAP_MQES_MASK)
 #define NVME_CAP_CQR(cap)   (((cap) >> CAP_CQR_SHIFT)    & CAP_CQR_MASK)
 #define NVME_CAP_AMS(cap)   (((cap) >> CAP_AMS_SHIFT)    & CAP_AMS_MASK)
@@ -102,6 +107,12 @@ enum NvmeCcMask {
     CC_IOCQES_MASK  = 0xf,
 };
 
+enum NvmeCcCss {
+    CSS_NVM_ONLY   = 0,
+    CSS_CSI        = 6,
+    CSS_ADMIN_ONLY = 7,
+};
+
 #define NVME_CC_EN(cc)     ((cc >> CC_EN_SHIFT)     & CC_EN_MASK)
 #define NVME_CC_CSS(cc)    ((cc >> CC_CSS_SHIFT)    & CC_CSS_MASK)
 #define NVME_CC_MPS(cc)    ((cc >> CC_MPS_SHIFT)    & CC_MPS_MASK)
@@ -110,6 +121,21 @@ enum NvmeCcMask {
 #define NVME_CC_IOSQES(cc) ((cc >> CC_IOSQES_SHIFT) & CC_IOSQES_MASK)
 #define NVME_CC_IOCQES(cc) ((cc >> CC_IOCQES_SHIFT) & CC_IOCQES_MASK)
 
+#define NVME_SET_CC_EN(cc, val)     \
+    (cc |= (uint32_t)((val) & CC_EN_MASK) << CC_EN_SHIFT)
+#define NVME_SET_CC_CSS(cc, val)    \
+    (cc |= (uint32_t)((val) & CC_CSS_MASK) << CC_CSS_SHIFT)
+#define NVME_SET_CC_MPS(cc, val)    \
+    (cc |= (uint32_t)((val) & CC_MPS_MASK) << CC_MPS_SHIFT)
+#define NVME_SET_CC_AMS(cc, val)    \
+    (cc |= (uint32_t)((val) & CC_AMS_MASK) << CC_AMS_SHIFT)
+#define NVME_SET_CC_SHN(cc, val)    \
+    (cc |= (uint32_t)((val) & CC_SHN_MASK) << CC_SHN_SHIFT)
+#define NVME_SET_CC_IOSQES(cc, val) \
+    (cc |= (uint32_t)((val) & CC_IOSQES_MASK) << CC_IOSQES_SHIFT)
+#define NVME_SET_CC_IOCQES(cc, val) \
+    (cc |= (uint32_t)((val) & CC_IOCQES_MASK) << CC_IOCQES_SHIFT)
+
 enum NvmeCstsShift {
     CSTS_RDY_SHIFT  = 0,
     CSTS_CFS_SHIFT  = 1,
@@ -524,8 +550,13 @@ typedef struct QEMU_PACKED NvmeIdentify {
     uint64_t    rsvd2[2];
     uint64_t    prp1;
     uint64_t    prp2;
-    uint32_t    cns;
-    uint32_t    rsvd11[5];
+    uint8_t     cns;
+    uint8_t     rsvd10;
+    uint16_t    ctrlid;
+    uint16_t    nvmsetid;
+    uint8_t     rsvd11;
+    uint8_t     csi;
+    uint32_t    rsvd12[4];
 } NvmeIdentify;
 
 typedef struct QEMU_PACKED NvmeRwCmd {
@@ -645,6 +676,7 @@ enum NvmeStatusCodes {
     NVME_MD_SGL_LEN_INVALID     = 0x0010,
     NVME_SGL_DESCR_TYPE_INVALID = 0x0011,
     NVME_INVALID_USE_OF_CMB     = 0x0012,
+    NVME_CMD_SET_CMB_REJECTED   = 0x002b,
     NVME_LBA_RANGE              = 0x0080,
     NVME_CAP_EXCEEDED           = 0x0081,
     NVME_NS_NOT_READY           = 0x0082,
@@ -771,11 +803,15 @@ typedef struct QEMU_PACKED NvmePSD {
 
 #define NVME_IDENTIFY_DATA_SIZE 4096
 
-enum {
-    NVME_ID_CNS_NS             = 0x0,
-    NVME_ID_CNS_CTRL           = 0x1,
-    NVME_ID_CNS_NS_ACTIVE_LIST = 0x2,
-    NVME_ID_CNS_NS_DESCR_LIST  = 0x3,
+enum NvmeIdCns {
+    NVME_ID_CNS_NS                = 0x00,
+    NVME_ID_CNS_CTRL              = 0x01,
+    NVME_ID_CNS_NS_ACTIVE_LIST    = 0x02,
+    NVME_ID_CNS_NS_DESCR_LIST     = 0x03,
+    NVME_ID_CNS_CS_NS             = 0x05,
+    NVME_ID_CNS_CS_CTRL           = 0x06,
+    NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07,
+    NVME_ID_CNS_IO_COMMAND_SET    = 0x1c,
 };
 
 typedef struct QEMU_PACKED NvmeIdCtrl {
@@ -922,6 +958,7 @@ enum NvmeFeatureIds {
     NVME_WRITE_ATOMICITY            = 0xa,
     NVME_ASYNCHRONOUS_EVENT_CONF    = 0xb,
     NVME_TIMESTAMP                  = 0xe,
+    NVME_COMMAND_SET_PROFILE        = 0x19,
     NVME_SOFTWARE_PROGRESS_MARKER   = 0x80,
     NVME_FID_MAX                    = 0x100,
 };
@@ -1006,18 +1043,26 @@ typedef struct QEMU_PACKED NvmeIdNsDescr {
     uint8_t rsvd2[2];
 } NvmeIdNsDescr;
 
-enum {
-    NVME_NIDT_EUI64_LEN =  8,
-    NVME_NIDT_NGUID_LEN = 16,
-    NVME_NIDT_UUID_LEN  = 16,
+enum NvmeNsIdentifierLength {
+    NVME_NIDL_EUI64 =  8,
+    NVME_NIDL_NGUID = 16,
+    NVME_NIDL_UUID  = 16,
+    NVME_NIDL_CSI   =  1,
 };
 
 enum NvmeNsIdentifierType {
-    NVME_NIDT_EUI64 = 0x1,
-    NVME_NIDT_NGUID = 0x2,
-    NVME_NIDT_UUID  = 0x3,
+    NVME_NIDT_EUI64 = 0x01,
+    NVME_NIDT_NGUID = 0x02,
+    NVME_NIDT_UUID  = 0x03,
+    NVME_NIDT_CSI   = 0x04,
 };
 
+enum NvmeCsi {
+    NVME_CSI_NVM = 0x00,
+};
+
+#define NVME_SET_CSI(vec, csi) (vec |= (uint8_t)(1 << (csi)))
+
 /*Deallocate Logical Block Features*/
 #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat)    ((dlfeat) & 0x10)
 #define NVME_ID_NS_DLFEAT_WRITE_ZEROES(dlfeat) ((dlfeat) & 0x08)
@@ -1068,6 +1113,7 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512);
     QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4);
From patchwork Mon Sep 28 02:35:18 2020
X-Patchwork-Submitter: Dmitry Fomichev
X-Patchwork-Id: 304322
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, Dmitry Fomichev, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 04/14] hw/block/nvme: Define trace events related to NS Types
Date: Mon, 28 Sep 2020 11:35:18 +0900
Message-Id: <20200928023528.15260-5-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

Define a few trace events that are relevant to the implementation of
Namespace Types (NVMe TP 4056).
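Note (illustration, not part of the patch): each line in trace-events
declares a trace point, and QEMU's tracetool generates a trace_<name>()
function with the declared signature for every entry. The events added
below get their call sites in the following patches of this series,
for example:

    /* hw/block/nvme.c, in the CSI-aware identify handlers (added later) */
    trace_pci_nvme_identify_ctrl_csi(c->csi);
    trace_pci_nvme_identify_ns_csi(nsid, c->csi);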
Signed-off-by: Dmitry Fomichev
Reviewed-by: Klaus Jensen
---
 hw/block/trace-events | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/hw/block/trace-events b/hw/block/trace-events
index 2929a8df11..b93429b04c 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -49,8 +49,12 @@ pci_nvme_create_cq(uint64_t addr, uint16_t cqid, uint16_t vector, uint16_t size,
 pci_nvme_del_sq(uint16_t qid) "deleting submission queue sqid=%"PRIu16""
 pci_nvme_del_cq(uint16_t cqid) "deleted completion queue, cqid=%"PRIu16""
 pci_nvme_identify_ctrl(void) "identify controller"
+pci_nvme_identify_ctrl_csi(uint8_t csi) "identify controller, csi=0x%"PRIx8""
 pci_nvme_identify_ns(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_identify_ns_csi(uint32_t ns, uint8_t csi) "nsid=%"PRIu32", csi=0x%"PRIx8""
 pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=%"PRIu16", csi=0x%"PRIx8""
+pci_nvme_identify_cmd_set(void) "identify i/o command set"
 pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
 pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
 pci_nvme_getfeat(uint16_t cid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
@@ -87,6 +91,8 @@ pci_nvme_mmio_stopped(void) "cleared controller enable bit"
 pci_nvme_mmio_shutdown_set(void) "shutdown bit set"
 pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
 pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects log read"
+pci_nvme_css_nvm_cset_selected_by_host(uint32_t cc) "NVM command set selected by host, bar.cc=0x%"PRIx32""
+pci_nvme_css_all_csets_sel_by_host(uint32_t cc) "all supported command sets selected by host, bar.cc=0x%"PRIx32""
 
 # nvme traces for error conditions
 pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu"
@@ -106,6 +112,9 @@ pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
 pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
 pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
 pci_nvme_err_invalid_effects_log_offset(uint64_t ofs) "commands supported and effects log offset must be 0, got %"PRIu64""
+pci_nvme_err_change_css_when_enabled(void) "changing CC.CSS while controller is enabled"
+pci_nvme_err_only_nvm_cmd_set_avail(void) "setting 110b CC.CSS, but only NVM command set is enabled"
+pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32""
 pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16""
 pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16""
 pci_nvme_err_invalid_create_sq_sqid(uint16_t sqid) "failed creating submission queue, invalid sqid=%"PRIu16""
@@ -161,6 +170,7 @@ pci_nvme_ub_db_wr_invalid_cq(uint32_t qid) "completion queue doorbell write for
 pci_nvme_ub_db_wr_invalid_cqhead(uint32_t qid, uint16_t new_head) "completion queue doorbell write value beyond queue size, cqid=%"PRIu32", new_head=%"PRIu16", ignoring"
 pci_nvme_ub_db_wr_invalid_sq(uint32_t qid) "submission queue doorbell write for nonexistent queue, sqid=%"PRIu32", ignoring"
 pci_nvme_ub_db_wr_invalid_sqtail(uint32_t qid, uint16_t new_tail) "submission queue doorbell write value beyond queue size, sqid=%"PRIu32", new_head=%"PRIu16", ignoring"
+pci_nvme_ub_unknown_css_value(void) "unknown value in cc.css field"
 
 # xen-block.c
 xen_block_realize(const char *type, uint32_t disk, uint32_t partition) "%s d%up%u"
From patchwork Mon Sep 28 02:35:19 2020
X-Patchwork-Submitter: Dmitry Fomichev
X-Patchwork-Id: 304325
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, Dmitry Fomichev, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace Types
Date: Mon, 28 Sep 2020 11:35:19 +0900
Message-Id: <20200928023528.15260-6-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

From: Niklas Cassel

Namespace Types introduce a new command set, "I/O Command Sets", that
allows the host to retrieve the command sets associated with a
namespace. Introduce support for the command set and enable detection
for the NVM Command Set.

The new workflows for identify commands rely heavily on zero-filled
identify structs. E.g., certain CNS commands are defined to return a
zero-filled identify struct when an inactive namespace NSID is
supplied.

Add a helper function in order to avoid code duplication when
reporting zero-filled identify structures.
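Note (annotated copy of the helper from the diff below, for reference):
the zero-filled reply amounts to DMA-ing a zeroed buffer back to the
host; since every Identify data structure is NVME_IDENTIFY_DATA_SIZE
(4096) bytes, one helper covers all of the zero-filled cases:

    static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req)
    {
        /* 4 KiB of zeroes, the size of every Identify data structure */
        uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {};

        /* return the zero-filled struct to the host as the command reply */
        return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req);
    }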
Signed-off-by: Niklas Cassel
Signed-off-by: Dmitry Fomichev
---
 hw/block/nvme-ns.c |   3 +
 hw/block/nvme.c    | 210 +++++++++++++++++++++++++++++++++++++--------
 2 files changed, 175 insertions(+), 38 deletions(-)

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index bbd7879492..31b7f986c3 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -40,6 +40,9 @@ static void nvme_ns_init(NvmeNamespace *ns)
 
     id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
 
+    ns->params.csi = NVME_CSI_NVM;
+    qemu_uuid_generate(&ns->params.uuid); /* TODO make UUIDs persistent */
+
     /* no thin provisioning */
     id_ns->ncap = id_ns->nsze;
     id_ns->nuse = id_ns->ncap;
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 29fa005fa2..4ec1ddc90a 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1495,6 +1495,13 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
+static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req)
+{
+    uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {};
+
+    return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req);
+}
+
 static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_pci_nvme_identify_ctrl();
@@ -1503,11 +1510,23 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+
+    trace_pci_nvme_identify_ctrl_csi(c->csi);
+
+    if (c->csi == NVME_CSI_NVM) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
+    return NVME_INVALID_FIELD | NVME_DNR;
+}
+
 static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
-    NvmeIdNs *id_ns, inactive = { 0 };
     uint32_t nsid = le32_to_cpu(c->nsid);
 
     trace_pci_nvme_identify_ns(nsid);
@@ -1518,23 +1537,46 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 
     ns = nvme_ns(n, nsid);
     if (unlikely(!ns)) {
-        id_ns = &inactive;
-    } else {
-        id_ns = &ns->id_ns;
+        return nvme_rpt_empty_id_struct(n, req);
     }
 
-    return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs),
+    return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs),
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeNamespace *ns;
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+    uint32_t nsid = le32_to_cpu(c->nsid);
+
+    trace_pci_nvme_identify_ns_csi(nsid, c->csi);
+
+    if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+
+    ns = nvme_ns(n, nsid);
+    if (unlikely(!ns)) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
+    if (c->csi == NVME_CSI_NVM) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
+    return NVME_INVALID_FIELD | NVME_DNR;
+}
+
 static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
-    static const int data_len = NVME_IDENTIFY_DATA_SIZE;
     uint32_t min_nsid = le32_to_cpu(c->nsid);
-    uint32_t *list;
-    uint16_t ret;
-    int j = 0;
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
+    static const int data_len = sizeof(list);
+    uint32_t *list_ptr = (uint32_t *)list;
+    int i, j = 0;
 
     trace_pci_nvme_identify_nslist(min_nsid);
 
@@ -1548,48 +1590,76 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
-    list = g_malloc0(data_len);
-    for (int i = 1; i <= n->num_namespaces; i++) {
-        if (i <= min_nsid || !nvme_ns(n, i)) {
+    for (i = 1; i <= n->num_namespaces; i++) {
+        ns = nvme_ns(n, i);
+        if (!ns) {
             continue;
         }
-        list[j++] = cpu_to_le32(i);
+        if (ns->params.nsid < min_nsid) {
+            continue;
+        }
+        list_ptr[j++] = cpu_to_le32(ns->params.nsid);
         if (j == data_len / sizeof(uint32_t)) {
             break;
         }
     }
-    ret = nvme_dma(n, (uint8_t *)list, data_len, DMA_DIRECTION_FROM_DEVICE,
-                   req);
-    g_free(list);
-    return ret;
+
+    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
+}
+
+static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeNamespace *ns;
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+    uint32_t min_nsid = le32_to_cpu(c->nsid);
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
+    static const int data_len = sizeof(list);
+    uint32_t *list_ptr = (uint32_t *)list;
+    int i, j = 0;
+
+    trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi);
+
+    if (c->csi != NVME_CSI_NVM) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    for (i = 1; i <= n->num_namespaces; i++) {
+        ns = nvme_ns(n, i);
+        if (!ns) {
+            continue;
+        }
+        if (ns->params.nsid < min_nsid) {
+            continue;
+        }
+        list_ptr[j++] = cpu_to_le32(ns->params.nsid);
+        if (j == data_len / sizeof(uint32_t)) {
+            break;
+        }
+    }
+
+    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+    NvmeNamespace *ns;
     uint32_t nsid = le32_to_cpu(c->nsid);
-    uint8_t list[NVME_IDENTIFY_DATA_SIZE];
-
-    struct data {
-        struct {
-            NvmeIdNsDescr hdr;
-            uint8_t v[16];
-        } uuid;
-    };
-
-    struct data *ns_descrs = (struct data *)list;
+    NvmeIdNsDescr *desc;
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
+    static const int data_len = sizeof(list);
+    void *list_ptr = list;
 
     trace_pci_nvme_identify_ns_descr_list(nsid);
 
-    if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
-        return NVME_INVALID_NSID | NVME_DNR;
-    }
-
     if (unlikely(!nvme_ns(n, nsid))) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    memset(list, 0x0, sizeof(list));
+    ns = nvme_ns(n, nsid);
+    if (unlikely(!ns)) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
 
     /*
      * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
@@ -1597,12 +1667,31 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
      * Namespace Identification Descriptor. Add a very basic Namespace UUID
     * here.
      */
-    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
-    ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;
-    stl_be_p(&ns_descrs->uuid.v, nsid);
+    desc = list_ptr;
+    desc->nidt = NVME_NIDT_UUID;
+    desc->nidl = NVME_NIDL_UUID;
+    list_ptr += sizeof(*desc);
+    memcpy(list_ptr, ns->params.uuid.data, NVME_NIDL_UUID);
+    list_ptr += NVME_NIDL_UUID;
 
-    return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    desc = list_ptr;
+    desc->nidt = NVME_NIDT_CSI;
+    desc->nidl = NVME_NIDL_CSI;
+    list_ptr += sizeof(*desc);
+    *(uint8_t *)list_ptr = NVME_CSI_NVM;
+
+    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
+}
+
+static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req)
+{
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
+    static const int data_len = sizeof(list);
+
+    trace_pci_nvme_identify_cmd_set();
+
+    NVME_SET_CSI(*list, NVME_CSI_NVM);
+
+    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
@@ -1612,12 +1701,20 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
     switch (le32_to_cpu(c->cns)) {
     case NVME_ID_CNS_NS:
         return nvme_identify_ns(n, req);
+    case NVME_ID_CNS_CS_NS:
+        return nvme_identify_ns_csi(n, req);
     case NVME_ID_CNS_CTRL:
         return nvme_identify_ctrl(n, req);
+    case NVME_ID_CNS_CS_CTRL:
+        return nvme_identify_ctrl_csi(n, req);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
         return nvme_identify_nslist(n, req);
+    case NVME_ID_CNS_CS_NS_ACTIVE_LIST:
+        return nvme_identify_nslist_csi(n, req);
     case NVME_ID_CNS_NS_DESCR_LIST:
         return nvme_identify_ns_descr_list(n, req);
+    case NVME_ID_CNS_IO_COMMAND_SET:
+        return nvme_identify_cmd_set(n, req);
     default:
         trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1799,6 +1896,9 @@ defaults:
             result |= NVME_INTVC_NOCOALESCING;
         }
 
+        break;
+    case NVME_COMMAND_SET_PROFILE:
+        result = 0;
         break;
     default:
         result = nvme_feature_default[fid];
@@ -1939,6 +2039,12 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
         break;
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, req);
+    case NVME_COMMAND_SET_PROFILE:
+        if (dw11 & 0x1ff) {
+            trace_pci_nvme_err_invalid_iocsci(dw11 & 0x1ff);
+            return NVME_CMD_SET_CMB_REJECTED | NVME_DNR;
+        }
+        break;
     default:
         return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
     }
@@ -2222,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
         break;
     case 0x14:  /* CC */
         trace_pci_nvme_mmio_cfg(data & 0xffffffff);
+
+        if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
+            if (NVME_CC_EN(n->bar.cc)) {
+                NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,
+                               "changing selected command set when enabled");
+            } else {
+                switch (NVME_CC_CSS(data)) {
+                case CSS_NVM_ONLY:
+                    trace_pci_nvme_css_nvm_cset_selected_by_host(data &
+                                                                 0xffffffff);
+                    break;
+                case CSS_CSI:
+                    NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);
+                    trace_pci_nvme_css_all_csets_sel_by_host(data & 0xffffffff);
+                    break;
+                case CSS_ADMIN_ONLY:
+                    break;
+                default:
+                    NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,
+                                   "unknown value in CC.CSS field");
+                }
+            }
+        }
+
         /* Windows first sends data, then sends enable bit */
         if (!NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc) &&
             !NVME_CC_SHN(data) && !NVME_CC_SHN(n->bar.cc))
@@ -2810,7 +2940,11 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);
     NVME_CAP_SET_CQR(n->bar.cap, 1);
     NVME_CAP_SET_TO(n->bar.cap, 0xf);
-    NVME_CAP_SET_CSS(n->bar.cap, 1);
+    /*
+     * The device now always supports NS Types, but all commands
+     * that support CSI field will only handle NVM Command Set.
+     */
+    NVME_CAP_SET_CSS(n->bar.cap, (CAP_CSS_NVM | CAP_CSS_CSI_SUPP));
     NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
 
     n->bar.vs = NVME_SPEC_VER;
From patchwork Mon Sep 28 02:35:20 2020
X-Patchwork-Submitter: Dmitry Fomichev
X-Patchwork-Id: 272646
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, Dmitry Fomichev, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 06/14] hw/block/nvme: Add support for active/inactive namespaces
Date: Mon, 28 Sep 2020 11:35:20 +0900
Message-Id: <20200928023528.15260-7-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

From: Niklas Cassel

In NVMe, a namespace is active if it exists and is attached to the
controller.

CAP.CSS (together with the I/O Command Set data structure) defines
what command sets are supported by the controller. CC.CSS (together
with Set Profile) can be set to enable a subset of the available
command sets. The namespaces belonging to a disabled command set will
not be able to attach to the controller, and will thus be inactive.
E.g., if the user sets CC.CSS to Admin Only, NVM namespaces should be
marked as inactive.

The identify namespace, the identify namespace CSI specific, and the
namespace list commands have two different versions: one that only
shows active namespaces, and another that shows existing namespaces
regardless of whether the namespace is attached or not.

Add an attached member to struct NvmeNamespace, and implement the
missing CNS commands.

The added functionality will also simplify the implementation of
namespace management in the future, since namespace management can
also attach and detach namespaces.
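Note (condensed illustration, not part of the patch): with only the
NVM command set implemented so far, the attach rule boils down to a
per-namespace check at controller start; a namespace is active unless
the host selected Admin Only in CC.CSS. Names as in the
nvme_start_ctrl() hunk below:

    ns->params.attached = false;
    switch (ns->params.csi) {
    case NVME_CSI_NVM:
        /* NVM namespaces attach for CC.CSS = NVM Only or All Supported */
        if (NVME_CC_CSS(n->bar.cc) == CSS_NVM_ONLY ||
            NVME_CC_CSS(n->bar.cc) == CSS_CSI) {
            ns->params.attached = true;
        }
        break;
    }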
Signed-off-by: Niklas Cassel
Signed-off-by: Dmitry Fomichev
---
 hw/block/nvme-ns.h   |  1 +
 hw/block/nvme.c      | 60 ++++++++++++++++++++++++++++++++++++++------
 include/block/nvme.h | 20 +++++++++------
 3 files changed, 65 insertions(+), 16 deletions(-)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index cca23bc0b3..acdb76f058 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -22,6 +22,7 @@ typedef struct NvmeNamespaceParams {
     uint32_t nsid;
     uint8_t  csi;
+    bool     attached;
     QemuUUID uuid;
 } NvmeNamespaceParams;

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 4ec1ddc90a..63ad03d6d6 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1523,7 +1523,8 @@ static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req)
     return NVME_INVALID_FIELD | NVME_DNR;
 }

-static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req,
+                                 bool only_active)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1540,11 +1541,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
         return nvme_rpt_empty_id_struct(n, req);
     }

+    if (only_active && !ns->params.attached) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
     return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs),
                     DMA_DIRECTION_FROM_DEVICE, req);
 }

-static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req,
+                                     bool only_active)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1561,6 +1567,10 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
         return nvme_rpt_empty_id_struct(n, req);
     }

+    if (only_active && !ns->params.attached) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
     if (c->csi == NVME_CSI_NVM) {
         return nvme_rpt_empty_id_struct(n, req);
     }
@@ -1568,7 +1578,8 @@
     return NVME_INVALID_FIELD | NVME_DNR;
 }

-static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req,
+                                     bool only_active)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1598,6 +1609,9 @@
         if (ns->params.nsid < min_nsid) {
             continue;
         }
+        if (only_active && !ns->params.attached) {
+            continue;
+        }
         list_ptr[j++] = cpu_to_le32(ns->params.nsid);
         if (j == data_len / sizeof(uint32_t)) {
             break;
@@ -1607,7 +1621,8 @@
     return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
 }

-static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req,
+                                         bool only_active)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
@@ -1631,6 +1646,9 @@
         if (ns->params.nsid < min_nsid) {
             continue;
         }
+        if (only_active && !ns->params.attached) {
+            continue;
+        }
         list_ptr[j++] = cpu_to_le32(ns->params.nsid);
         if (j == data_len / sizeof(uint32_t)) {
             break;
@@ -1700,17 +1718,25 @@
     switch (le32_to_cpu(c->cns)) {
     case NVME_ID_CNS_NS:
-        return nvme_identify_ns(n, req);
+        return nvme_identify_ns(n, req, true);
     case NVME_ID_CNS_CS_NS:
-        return nvme_identify_ns_csi(n, req);
+        return nvme_identify_ns_csi(n, req, true);
+    case NVME_ID_CNS_NS_PRESENT:
+        return nvme_identify_ns(n, req, false);
+    case NVME_ID_CNS_CS_NS_PRESENT:
+        return nvme_identify_ns_csi(n, req, false);
     case NVME_ID_CNS_CTRL:
         return nvme_identify_ctrl(n, req);
     case NVME_ID_CNS_CS_CTRL:
         return nvme_identify_ctrl_csi(n, req);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
-        return nvme_identify_nslist(n, req);
+        return nvme_identify_nslist(n, req, true);
     case NVME_ID_CNS_CS_NS_ACTIVE_LIST:
-        return nvme_identify_nslist_csi(n, req);
+        return nvme_identify_nslist_csi(n, req, true);
+    case NVME_ID_CNS_NS_PRESENT_LIST:
+        return nvme_identify_nslist(n, req, false);
+    case NVME_ID_CNS_CS_NS_PRESENT_LIST:
+        return nvme_identify_nslist_csi(n, req, false);
     case NVME_ID_CNS_NS_DESCR_LIST:
         return nvme_identify_ns_descr_list(n, req);
     case NVME_ID_CNS_IO_COMMAND_SET:
@@ -2188,8 +2214,10 @@ static void nvme_clear_ctrl(NvmeCtrl *n)

 static int nvme_start_ctrl(NvmeCtrl *n)
 {
+    NvmeNamespace *ns;
     uint32_t page_bits = NVME_CC_MPS(n->bar.cc) + 12;
     uint32_t page_size = 1 << page_bits;
+    int i;

     if (unlikely(n->cq[0])) {
         trace_pci_nvme_err_startfail_cq();
@@ -2276,6 +2304,22 @@ static int nvme_start_ctrl(NvmeCtrl *n)
     nvme_init_sq(&n->admin_sq, n, n->bar.asq, 0, 0,
                  NVME_AQA_ASQS(n->bar.aqa) + 1);

+    for (i = 1; i <= n->num_namespaces; i++) {
+        ns = nvme_ns(n, i);
+        if (!ns) {
+            continue;
+        }
+        ns->params.attached = false;
+        switch (ns->params.csi) {
+        case NVME_CSI_NVM:
+            if (NVME_CC_CSS(n->bar.cc) == CSS_NVM_ONLY ||
+                NVME_CC_CSS(n->bar.cc) == CSS_CSI) {
+                ns->params.attached = true;
+            }
+            break;
+        }
+    }
+
     nvme_set_timestamp(n, 0ULL);

     QTAILQ_INIT(&n->aer_queue);

diff --git a/include/block/nvme.h b/include/block/nvme.h
index 4587311783..b182fe40b2 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -804,14 +804,18 @@ typedef struct QEMU_PACKED NvmePSD {
 #define NVME_IDENTIFY_DATA_SIZE 4096

 enum NvmeIdCns {
-    NVME_ID_CNS_NS                = 0x00,
-    NVME_ID_CNS_CTRL              = 0x01,
-    NVME_ID_CNS_NS_ACTIVE_LIST    = 0x02,
-    NVME_ID_CNS_NS_DESCR_LIST     = 0x03,
-    NVME_ID_CNS_CS_NS             = 0x05,
-    NVME_ID_CNS_CS_CTRL           = 0x06,
-    NVME_ID_CNS_CS_NS_ACTIVE_LIST = 0x07,
-    NVME_ID_CNS_IO_COMMAND_SET    = 0x1c,
+    NVME_ID_CNS_NS                    = 0x00,
+    NVME_ID_CNS_CTRL                  = 0x01,
+    NVME_ID_CNS_NS_ACTIVE_LIST        = 0x02,
+    NVME_ID_CNS_NS_DESCR_LIST         = 0x03,
+    NVME_ID_CNS_CS_NS                 = 0x05,
+    NVME_ID_CNS_CS_CTRL               = 0x06,
+    NVME_ID_CNS_CS_NS_ACTIVE_LIST     = 0x07,
+    NVME_ID_CNS_NS_PRESENT_LIST       = 0x10,
+    NVME_ID_CNS_NS_PRESENT            = 0x11,
+    NVME_ID_CNS_CS_NS_PRESENT_LIST    = 0x1a,
+    NVME_ID_CNS_CS_NS_PRESENT         = 0x1b,
+    NVME_ID_CNS_IO_COMMAND_SET        = 0x1c,
 };

 typedef struct QEMU_PACKED NvmeIdCtrl {
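The attach loop in nvme_start_ctrl() above boils down to a small predicate; a
condensed restatement as a sketch (the CSS_* values come from earlier patches
in this series and are assumptions here): an NVM namespace only becomes
active when the host-selected CC.CSS enables the NVM command set, so an
Admin-Only selection leaves every namespace inactive.

```c
/* Sketch only: mirrors the CC.CSS gating applied per NVM namespace. */
static bool nvm_ns_attaches(uint32_t cc_css)
{
    return cc_css == CSS_NVM_ONLY || cc_css == CSS_CSI;
}
```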
From patchwork Mon Sep 28 02:35:21 2020
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé,
    Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org,
    qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 07/14] hw/block/nvme: Make Zoned NS Command Set definitions
Date: Mon, 28 Sep 2020 11:35:21 +0900
Message-Id: <20200928023528.15260-8-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
Define the values and structures that are needed to support the Zoned
Namespace Command Set (NVMe TP 4053) in the PCI NVMe controller emulator.
All new protocol definitions are located in include/block/nvme.h and
everything added that is specific to this implementation is kept in
hw/block/nvme.h.

In order to improve scalability, all open, closed and full zones are
organized in separate linked lists. Consequently, almost no zone
operation requires scanning the entire zone array (which can potentially
be quite large); it is only necessary to enumerate one or more zone
lists. Zone lists are designed to be position-independent, as they can
be persisted to the backing file as a part of zone metadata. The
NvmeZoneList struct defined in this patch serves as the head of every
zone list.

The NvmeZone structure encapsulates the Zone Descriptor defined in the
Zoned Command Set specification and adds a few more fields that are
internal to this implementation.

Signed-off-by: Niklas Cassel
Signed-off-by: Hans Holmberg
Signed-off-by: Ajay Joshi
Signed-off-by: Matias Bjorling
Signed-off-by: Shin'ichiro Kawasaki
Signed-off-by: Alexey Bogoslavsky
Signed-off-by: Dmitry Fomichev
---
 hw/block/nvme-ns.h   | 114 +++++++++++++++++++++++++++++++++++++++++++
 hw/block/nvme.h      |  10 ++++
 include/block/nvme.h | 107 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 231 insertions(+)

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index acdb76f058..04172f083e 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -19,11 +19,33 @@
 #define NVME_NS(obj) \
     OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)

+typedef struct NvmeZone {
+    NvmeZoneDescr   d;
+    uint64_t        w_ptr;
+    uint32_t        next;
+    uint32_t        prev;
+    uint8_t         rsvd80[8];
+} NvmeZone;
+
+#define NVME_ZONE_LIST_NIL    UINT_MAX
+
+typedef struct NvmeZoneList {
+    uint32_t        head;
+    uint32_t        tail;
+    uint32_t        size;
+    uint8_t         rsvd12[4];
+} NvmeZoneList;
+
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
     uint8_t  csi;
     bool     attached;
     QemuUUID uuid;
+
+    bool     zoned;
+    bool     cross_zone_read;
+    uint64_t zone_size_mb;
+    uint64_t zone_capacity_mb;
 } NvmeNamespaceParams;

 typedef struct NvmeNamespace {
@@ -33,6 +55,18 @@ typedef struct NvmeNamespace {
     int64_t      size;
     NvmeIdNs     id_ns;

+    NvmeIdNsZoned   *id_ns_zoned;
+    NvmeZone        *zone_array;
+    NvmeZoneList    *exp_open_zones;
+    NvmeZoneList    *imp_open_zones;
+    NvmeZoneList    *closed_zones;
+    NvmeZoneList    *full_zones;
+    uint32_t        num_zones;
+    uint64_t        zone_size;
+    uint64_t        zone_capacity;
+    uint64_t        zone_array_size;
+    uint32_t        zone_size_log2;
+
     NvmeNamespaceParams params;
 } NvmeNamespace;

@@ -74,4 +108,84 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
 void nvme_ns_drain(NvmeNamespace *ns);
 void nvme_ns_flush(NvmeNamespace *ns);

+static inline uint8_t nvme_get_zone_state(NvmeZone *zone)
+{
+    return zone->d.zs >> 4;
+}
+
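A quick illustration of the encoding this helper (and its setter counterpart,
which follows just below) relies on: the zone state lives in the upper nibble
of the descriptor's zs byte, so a set/get round trip is one shift in each
direction. A minimal sketch, assuming assert() is available:

```c
#include <assert.h>

/* NVME_ZONE_STATE_CLOSED is 0x04, so zs becomes 0x40 after the set. */
static void zone_state_encoding_example(NvmeZone *zone)
{
    nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED);
    assert(nvme_get_zone_state(zone) == NVME_ZONE_STATE_CLOSED);
}
```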
+static inline void nvme_set_zone_state(NvmeZone *zone, enum NvmeZoneState state)
+{
+    zone->d.zs = state << 4;
+}
+
+static inline uint64_t nvme_zone_rd_boundary(NvmeNamespace *ns, NvmeZone *zone)
+{
+    return zone->d.zslba + ns->zone_size;
+}
+
+static inline uint64_t nvme_zone_wr_boundary(NvmeZone *zone)
+{
+    return zone->d.zslba + zone->d.zcap;
+}
+
+static inline bool nvme_wp_is_valid(NvmeZone *zone)
+{
+    uint8_t st = nvme_get_zone_state(zone);
+
+    return st != NVME_ZONE_STATE_FULL &&
+           st != NVME_ZONE_STATE_READ_ONLY &&
+           st != NVME_ZONE_STATE_OFFLINE;
+}
+
+/*
+ * Initialize a zone list head.
+ */
+static inline void nvme_init_zone_list(NvmeZoneList *zl)
+{
+    zl->head = NVME_ZONE_LIST_NIL;
+    zl->tail = NVME_ZONE_LIST_NIL;
+    zl->size = 0;
+}
+
+/*
+ * Return the number of entries contained in a zone list.
+ */
+static inline uint32_t nvme_zone_list_size(NvmeZoneList *zl)
+{
+    return zl->size;
+}
+
+/*
+ * Check if the zone is not currently included in any zone list.
+ */
+static inline bool nvme_zone_not_in_list(NvmeZone *zone)
+{
+    return (bool)(zone->prev == 0 && zone->next == 0);
+}
+
+/*
+ * Return the zone at the head of the zone list, or NULL if the list is empty.
+ */
+static inline NvmeZone *nvme_peek_zone_head(NvmeNamespace *ns, NvmeZoneList *zl)
+{
+    if (zl->head == NVME_ZONE_LIST_NIL) {
+        return NULL;
+    }
+    return &ns->zone_array[zl->head];
+}
+
+/*
+ * Return the next zone in the list.
+ */
+static inline NvmeZone *nvme_next_zone_in_list(NvmeNamespace *ns, NvmeZone *z,
+                                               NvmeZoneList *zl)
+{
+    assert(!nvme_zone_not_in_list(z));
+
+    if (z->next == NVME_ZONE_LIST_NIL) {
+        return NULL;
+    }
+    return &ns->zone_array[z->next];
+}
+
 #endif /* NVME_NS_H */
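A usage sketch for the list helpers above (this function is hypothetical and
not part of the patch): walking a list touches only the zones on that list,
which is what lets zone operations avoid scanning the whole zone array, and
because next/prev are array indices rather than pointers, the walk still
works after the zone array has been persisted and remapped elsewhere.

```c
/* Count the zones on one list and cross-check against the stored size. */
static uint32_t nvme_count_zones_in_list(NvmeNamespace *ns, NvmeZoneList *zl)
{
    uint32_t n = 0;
    NvmeZone *z;

    for (z = nvme_peek_zone_head(ns, zl); z;
         z = nvme_next_zone_in_list(ns, z, zl)) {
        n++;
    }
    assert(n == nvme_zone_list_size(zl));
    return n;
}
```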
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index e080a2318a..f09e741d9a 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -6,6 +6,9 @@

 #define NVME_MAX_NAMESPACES 256

+#define NVME_DEFAULT_ZONE_SIZE    128 /* MiB */
+#define NVME_DEFAULT_MAX_ZA_SIZE  128 /* KiB */
+
 typedef struct NvmeParams {
     char     *serial;
     uint32_t num_queues; /* deprecated since 5.1 */
@@ -16,6 +19,8 @@ typedef struct NvmeParams {
     uint32_t aer_max_queued;
     uint8_t  mdts;
     bool     use_intel_id;
+    uint8_t  fill_pattern;
+    uint32_t zasl_kb;
 } NvmeParams;

 typedef struct NvmeAsyncEvent {
@@ -28,6 +33,8 @@ typedef struct NvmeRequest {
     struct NvmeNamespace    *ns;
     BlockAIOCB              *aiocb;
     uint16_t                status;
+    int64_t                 fill_ofs;
+    uint32_t                fill_len;
     NvmeCqe                 cqe;
     NvmeCmd                 cmd;
     BlockAcctCookie         acct;
@@ -147,6 +154,9 @@ typedef struct NvmeCtrl {
     QTAILQ_HEAD(, NvmeAsyncEvent) aer_queue;
     int         aer_queued;

+    uint32_t    zasl_bs;
+    uint8_t     zasl;
+
     NvmeNamespace   namespace;
     NvmeNamespace   *namespaces[NVME_MAX_NAMESPACES];
     NvmeSQueue      **sq;

diff --git a/include/block/nvme.h b/include/block/nvme.h
index b182fe40b2..a7126e123f 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -488,6 +488,9 @@ enum NvmeIoCommands {
     NVME_CMD_COMPARE            = 0x05,
     NVME_CMD_WRITE_ZEROES       = 0x08,
     NVME_CMD_DSM                = 0x09,
+    NVME_CMD_ZONE_MGMT_SEND     = 0x79,
+    NVME_CMD_ZONE_MGMT_RECV     = 0x7a,
+    NVME_CMD_ZONE_APPEND        = 0x7d,
 };

 typedef struct QEMU_PACKED NvmeDeleteQ {
@@ -677,6 +680,7 @@ enum NvmeStatusCodes {
     NVME_SGL_DESCR_TYPE_INVALID = 0x0011,
     NVME_INVALID_USE_OF_CMB     = 0x0012,
     NVME_CMD_SET_CMB_REJECTED   = 0x002b,
+    NVME_INVALID_CMD_SET        = 0x002c,
     NVME_LBA_RANGE              = 0x0080,
     NVME_CAP_EXCEEDED           = 0x0081,
     NVME_NS_NOT_READY           = 0x0082,
@@ -701,6 +705,14 @@ enum NvmeStatusCodes {
     NVME_CONFLICTING_ATTRS      = 0x0180,
     NVME_INVALID_PROT_INFO      = 0x0181,
     NVME_WRITE_TO_RO            = 0x0182,
+    NVME_ZONE_BOUNDARY_ERROR    = 0x01b8,
+    NVME_ZONE_FULL              = 0x01b9,
+    NVME_ZONE_READ_ONLY         = 0x01ba,
+    NVME_ZONE_OFFLINE           = 0x01bb,
+    NVME_ZONE_INVALID_WRITE     = 0x01bc,
+    NVME_ZONE_TOO_MANY_ACTIVE   = 0x01bd,
+    NVME_ZONE_TOO_MANY_OPEN     = 0x01be,
+    NVME_ZONE_INVAL_TRANSITION  = 0x01bf,
     NVME_WRITE_FAULT            = 0x0280,
     NVME_UNRECOVERED_READ       = 0x0281,
     NVME_E2E_GUARD_ERROR        = 0x0282,
@@ -885,6 +897,11 @@ typedef struct QEMU_PACKED NvmeIdCtrl {
     uint8_t     vs[1024];
 } NvmeIdCtrl;

+typedef struct NvmeIdCtrlZoned {
+    uint8_t     zasl;
+    uint8_t     rsvd1[4095];
+} NvmeIdCtrlZoned;
+
 enum NvmeIdCtrlOacs {
     NVME_OACS_SECURITY  = 1 << 0,
     NVME_OACS_FORMAT    = 1 << 1,
@@ -1009,6 +1026,12 @@ typedef struct QEMU_PACKED NvmeLBAF {
     uint8_t     rp;
 } NvmeLBAF;

+typedef struct QEMU_PACKED NvmeLBAFE {
+    uint64_t    zsze;
+    uint8_t     zdes;
+    uint8_t     rsvd9[7];
+} NvmeLBAFE;
+
 #define NVME_NSID_BROADCAST 0xffffffff

 typedef struct QEMU_PACKED NvmeIdNs {
@@ -1063,10 +1086,24 @@ enum NvmeNsIdentifierType {

 enum NvmeCsi {
     NVME_CSI_NVM    = 0x00,
+    NVME_CSI_ZONED  = 0x02,
 };

 #define NVME_SET_CSI(vec, csi) (vec |= (uint8_t)(1 << (csi)))

+typedef struct QEMU_PACKED NvmeIdNsZoned {
+    uint16_t    zoc;
+    uint16_t    ozcs;
+    uint32_t    mar;
+    uint32_t    mor;
+    uint32_t    rrl;
+    uint32_t    frl;
+    uint8_t     rsvd20[2796];
+    NvmeLBAFE   lbafe[16];
+    uint8_t     rsvd3072[768];
+    uint8_t     vs[256];
+} NvmeIdNsZoned;
+
 /*Deallocate Logical Block Features*/
 #define NVME_ID_NS_DLFEAT_GUARD_CRC(dlfeat)       ((dlfeat) & 0x10)
 #define NVME_ID_NS_DLFEAT_WRITE_ZEROES(dlfeat)    ((dlfeat) & 0x08)
@@ -1098,6 +1135,71 @@ enum NvmeIdNsDps {
     DPS_FIRST_EIGHT = 8,
 };

+enum NvmeZoneAttr {
+    NVME_ZA_FINISHED_BY_CTLR         = 1 << 0,
+    NVME_ZA_FINISH_RECOMMENDED       = 1 << 1,
+    NVME_ZA_RESET_RECOMMENDED        = 1 << 2,
+    NVME_ZA_ZD_EXT_VALID             = 1 << 7,
+};
+
+typedef struct QEMU_PACKED NvmeZoneReportHeader {
+    uint64_t    nr_zones;
+    uint8_t     rsvd[56];
+} NvmeZoneReportHeader;
+
+enum NvmeZoneReceiveAction {
+    NVME_ZONE_REPORT                 = 0,
+    NVME_ZONE_REPORT_EXTENDED        = 1,
+};
+
+enum NvmeZoneReportType {
+    NVME_ZONE_REPORT_ALL             = 0,
+    NVME_ZONE_REPORT_EMPTY           = 1,
+    NVME_ZONE_REPORT_IMPLICITLY_OPEN = 2,
+    NVME_ZONE_REPORT_EXPLICITLY_OPEN = 3,
+    NVME_ZONE_REPORT_CLOSED          = 4,
+    NVME_ZONE_REPORT_FULL            = 5,
+    NVME_ZONE_REPORT_READ_ONLY       = 6,
+    NVME_ZONE_REPORT_OFFLINE         = 7,
+};
+
+enum NvmeZoneType {
+    NVME_ZONE_TYPE_RESERVED          = 0x00,
+    NVME_ZONE_TYPE_SEQ_WRITE         = 0x02,
+};
+
+enum NvmeZoneSendAction {
+    NVME_ZONE_ACTION_RSD             = 0x00,
+    NVME_ZONE_ACTION_CLOSE           = 0x01,
+    NVME_ZONE_ACTION_FINISH          = 0x02,
+    NVME_ZONE_ACTION_OPEN            = 0x03,
+    NVME_ZONE_ACTION_RESET           = 0x04,
+    NVME_ZONE_ACTION_OFFLINE         = 0x05,
+    NVME_ZONE_ACTION_SET_ZD_EXT      = 0x10,
+};
+
+typedef struct QEMU_PACKED NvmeZoneDescr {
+    uint8_t     zt;
+    uint8_t     zs;
+    uint8_t     za;
+    uint8_t     rsvd3[5];
+    uint64_t    zcap;
+    uint64_t    zslba;
+    uint64_t    wp;
+    uint8_t     rsvd32[32];
+} NvmeZoneDescr;
+
+enum NvmeZoneState {
+    NVME_ZONE_STATE_RESERVED         = 0x00,
+    NVME_ZONE_STATE_EMPTY            = 0x01,
+    NVME_ZONE_STATE_IMPLICITLY_OPEN  = 0x02,
+    NVME_ZONE_STATE_EXPLICITLY_OPEN  = 0x03,
+    NVME_ZONE_STATE_CLOSED           = 0x04,
+    NVME_ZONE_STATE_READ_ONLY        = 0x0D,
+    NVME_ZONE_STATE_FULL             = 0x0E,
+    NVME_ZONE_STATE_OFFLINE          = 0x0F,
+};
+
 static inline void _nvme_check_size(void)
 {
     QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
@@ -1117,9 +1219,14 @@ static inline void _nvme_check_size(void)
     QEMU_BUILD_BUG_ON(sizeof(NvmeSmartLog) != 512);
     QEMU_BUILD_BUG_ON(sizeof(NvmeEffectsLog) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrl) != 4096);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeIdCtrlZoned) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeLBAF) != 4);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeLBAFE) != 16);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNs) != 4096);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsZoned) != 4096);
     QEMU_BUILD_BUG_ON(sizeof(NvmeSglDescriptor) != 16);
     QEMU_BUILD_BUG_ON(sizeof(NvmeIdNsDescr) != 4);
+    QEMU_BUILD_BUG_ON(sizeof(NvmeZoneDescr) != 64);
 }

 #endif

From patchwork Mon Sep 28 02:35:22 2020
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé,
    Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org,
    qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 08/14] hw/block/nvme: Define Zoned NS Command Set trace events
Date: Mon, 28 Sep 2020 11:35:22 +0900
Message-Id: <20200928023528.15260-9-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

The Zoned Namespace Command Set / Namespace Types implementation that is
being introduced in this series adds a good number of trace events.
Combine all tracepoint definitions into a separate patch to make
reviewing more convenient.
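For context on how these definitions get used: each line in trace-events
generates a trace_<name>() helper that the device code calls directly. For
example, the zone management send handler added later in this series emits
the open-zone event as:

```c
/* From the zone management send handler in a later patch of this series. */
trace_pci_nvme_open_zone(slba, zone_idx, all);
```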
Signed-off-by: Dmitry Fomichev
---
 hw/block/trace-events | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/hw/block/trace-events b/hw/block/trace-events
index b93429b04c..386f28e457 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -93,6 +93,17 @@ pci_nvme_mmio_shutdown_cleared(void) "shutdown bit cleared"
 pci_nvme_cmd_supp_and_effects_log_read(void) "commands supported and effects log read"
 pci_nvme_css_nvm_cset_selected_by_host(uint32_t cc) "NVM command set selected by host, bar.cc=0x%"PRIx32""
 pci_nvme_css_all_csets_sel_by_host(uint32_t cc) "all supported command sets selected by host, bar.cc=0x%"PRIx32""
+pci_nvme_open_zone(uint64_t slba, uint32_t zone_idx, int all) "open zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
+pci_nvme_close_zone(uint64_t slba, uint32_t zone_idx, int all) "close zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
+pci_nvme_finish_zone(uint64_t slba, uint32_t zone_idx, int all) "finish zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
+pci_nvme_reset_zone(uint64_t slba, uint32_t zone_idx, int all) "reset zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
+pci_nvme_offline_zone(uint64_t slba, uint32_t zone_idx, int all) "offline zone, slba=%"PRIu64", idx=%"PRIu32", all=%"PRIi32""
+pci_nvme_set_descriptor_extension(uint64_t slba, uint32_t zone_idx) "set zone descriptor extension, slba=%"PRIu64", idx=%"PRIu32""
+pci_nvme_zd_extension_set(uint32_t zone_idx) "set descriptor extension for zone_idx=%"PRIu32""
+pci_nvme_power_on_close(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Closed state"
+pci_nvme_power_on_reset(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Empty state"
+pci_nvme_power_on_full(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Full state"
+pci_nvme_mapped_zone_file(char *zfile_name, int ret) "mapped zone file %s, error %d"

 # nvme traces for error conditions
 pci_nvme_err_mdts(uint16_t cid, size_t len) "cid %"PRIu16" len %zu"
@@ -111,9 +122,23 @@ pci_nvme_err_invalid_prp(void) "invalid PRP"
 pci_nvme_err_invalid_opc(uint8_t opc) "invalid opcode 0x%"PRIx8""
 pci_nvme_err_invalid_admin_opc(uint8_t opc) "invalid admin opcode 0x%"PRIx8""
 pci_nvme_err_invalid_lba_range(uint64_t start, uint64_t len, uint64_t limit) "Invalid LBA start=%"PRIu64" len=%"PRIu64" limit=%"PRIu64""
+pci_nvme_err_unaligned_zone_cmd(uint8_t action, uint64_t slba, uint64_t zslba) "unaligned zone op 0x%"PRIx8", got slba=%"PRIu64", zslba=%"PRIu64""
+pci_nvme_err_invalid_zone_state_transition(uint8_t state, uint8_t action, uint64_t slba, uint8_t attrs) "0x%"PRIx8"->0x%"PRIx8", slba=%"PRIu64", attrs=0x%"PRIx8""
+pci_nvme_err_write_not_at_wp(uint64_t slba, uint64_t zone, uint64_t wp) "writing at slba=%"PRIu64", zone=%"PRIu64", but wp=%"PRIu64""
+pci_nvme_err_append_not_at_start(uint64_t slba, uint64_t zone) "appending at slba=%"PRIu64", but zone=%"PRIu64""
+pci_nvme_err_zone_write_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx32""
+pci_nvme_err_zone_read_not_ok(uint64_t slba, uint32_t nlb, uint32_t status) "slba=%"PRIu64", nlb=%"PRIu32", status=0x%"PRIx32""
+pci_nvme_err_append_too_large(uint64_t slba, uint32_t nlb, uint8_t zasl) "slba=%"PRIu64", nlb=%"PRIu32", zasl=%"PRIu8""
+pci_nvme_err_insuff_active_res(uint32_t max_active) "max_active=%"PRIu32" zone limit exceeded"
+pci_nvme_err_insuff_open_res(uint32_t max_open) "max_open=%"PRIu32" zone limit exceeded"
+pci_nvme_err_zone_file_invalid(int error) "validation error=%d"
+pci_nvme_err_zd_extension_map_error(uint32_t zone_idx) "can't map descriptor extension for zone_idx=%"PRIu32""
+pci_nvme_err_invalid_changed_zone_list_offset(uint64_t ofs) "changed zone list log offset must be 0, got %"PRIu64""
+pci_nvme_err_invalid_changed_zone_list_len(uint32_t len) "changed zone list log size is 4096, got %"PRIu32""
 pci_nvme_err_invalid_effects_log_offset(uint64_t ofs) "commands supported and effects log offset must be 0, got %"PRIu64""
 pci_nvme_err_change_css_when_enabled(void) "changing CC.CSS while controller is enabled"
 pci_nvme_err_only_nvm_cmd_set_avail(void) "setting 110b CC.CSS, but only NVM command set is enabled"
+pci_nvme_err_only_zoned_cmd_set_avail(void) "setting 001b CC.CSS, but only ZONED+NVM command set is enabled"
 pci_nvme_err_invalid_iocsci(uint32_t idx) "unsupported command set combination index %"PRIu32""
 pci_nvme_err_invalid_del_sq(uint16_t qid) "invalid submission queue deletion, sid=%"PRIu16""
 pci_nvme_err_invalid_create_sq_cqid(uint16_t cqid) "failed creating submission queue, invalid cqid=%"PRIu16""
@@ -147,6 +172,7 @@ pci_nvme_err_startfail_sqent_too_large(uint8_t log2ps, uint8_t maxlog2ps) "nvme_
 pci_nvme_err_startfail_asqent_sz_zero(void) "nvme_start_ctrl failed because the admin submission queue size is zero"
 pci_nvme_err_startfail_acqent_sz_zero(void) "nvme_start_ctrl failed because the admin completion queue size is zero"
 pci_nvme_err_startfail(void) "setting controller enable bit failed"
+pci_nvme_err_invalid_mgmt_action(int action) "action=0x%x"

 # Traces for undefined behavior
 pci_nvme_ub_mmiowr_misaligned32(uint64_t offset) "MMIO write not 32-bit aligned, offset=0x%"PRIx64""

From patchwork Mon Sep 28 02:35:23 2020
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé,
    Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org,
    qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 09/14] hw/block/nvme: Support Zoned Namespace Command Set
Date: Mon, 28 Sep 2020 11:35:23 +0900
Message-Id: <20200928023528.15260-10-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

The emulation code has been changed to advertise the NVM Command Set
when the "zoned" device property is not set (the default), and the Zoned
Namespace Command Set otherwise.

Handlers for the three new NVMe commands introduced in the Zoned
Namespace Command Set specification are added, namely for Zone
Management Receive, Zone Management Send and Zone Append.

Device initialization code has been extended to create a proper
configuration for zoned operation using device properties.

The Read/Write command handler is modified to only allow writes at the
write pointer if the namespace is zoned. For the Zone Append command,
writes implicitly happen at the write pointer and the starting write
pointer value is returned as the result of the command. The Write Zeroes
handler is modified to add zoned checks that are identical to those done
as a part of the Write flow.

The code to support Zone Descriptor Extensions is not included in this
commit and ZDES 0 is always reported. A later commit in this series will
add ZDE support.

This commit doesn't yet include checks for active and open zone limits.
It is assumed that there are no limits on either active or open zones.
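The Zone Append contract described above can be sketched in a few lines,
using names from this series (this condensed function is illustrative, not
the patch's implementation): the data lands at the zone's current write
pointer, the pointer advances by the block count, and the pre-advance value
is what the host receives in the completion entry.

```c
/* Sketch: what the host gets back in cqe.result64 for a Zone Append. */
static uint64_t zone_append_result(NvmeZone *zone, uint32_t nlb)
{
    uint64_t returned_slba = zone->w_ptr;  /* reported to the host */

    zone->w_ptr += nlb;  /* the real nvme_advance_zone_wp() also moves
                          * Empty/Closed zones to Implicitly Open */
    return returned_slba;
}
```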
Signed-off-by: Niklas Cassel
Signed-off-by: Hans Holmberg
Signed-off-by: Ajay Joshi
Signed-off-by: Chaitanya Kulkarni
Signed-off-by: Matias Bjorling
Signed-off-by: Aravind Ramesh
Signed-off-by: Shin'ichiro Kawasaki
Signed-off-by: Adam Manzanares
Signed-off-by: Dmitry Fomichev
---
 block/nvme.c         |   2 +-
 hw/block/nvme-ns.c   | 185 ++++++++-
 hw/block/nvme-ns.h   |   6 +-
 hw/block/nvme.c      | 872 +++++++++++++++++++++++++++++++++++++++++--
 include/block/nvme.h |   6 +-
 5 files changed, 1033 insertions(+), 38 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index 05485fdd11..7a513c9a17 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -333,7 +333,7 @@ static inline int nvme_translate_error(const NvmeCqe *c)
     uint16_t status = (le16_to_cpu(c->status) >> 1) & 0xFF;
     if (status) {
-        trace_nvme_error(le32_to_cpu(c->result),
+        trace_nvme_error(le32_to_cpu(c->result32),
                          le16_to_cpu(c->sq_head),
                          le16_to_cpu(c->sq_id),
                          le16_to_cpu(c->cid),

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 31b7f986c3..6d9dc9205b 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -33,14 +33,14 @@ static void nvme_ns_init(NvmeNamespace *ns)
     NvmeIdNs *id_ns = &ns->id_ns;

     if (blk_get_flags(ns->blkconf.blk) & BDRV_O_UNMAP) {
-        ns->id_ns.dlfeat = 0x9;
+        ns->id_ns.dlfeat = 0x8;
     }

     id_ns->lbaf[0].ds = BDRV_SECTOR_BITS;

     id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));

-    ns->params.csi = NVME_CSI_NVM;
+    ns->csi = NVME_CSI_NVM;
     qemu_uuid_generate(&ns->params.uuid); /* TODO make UUIDs persistent */

     /* no thin provisioning */
@@ -73,7 +73,162 @@ static int nvme_ns_init_blk(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
     }

     lba_index = NVME_ID_NS_FLBAS_INDEX(ns->id_ns.flbas);
-    ns->id_ns.lbaf[lba_index].ds = 31 - clz32(n->conf.logical_block_size);
+    ns->id_ns.lbaf[lba_index].ds = 31 - clz32(ns->blkconf.logical_block_size);

     return 0;
 }

+/*
+ * Add a zone to the tail of a zone list.
+ */
+void nvme_add_zone_tail(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone)
+{
+    uint32_t idx = (uint32_t)(zone - ns->zone_array);
+
+    assert(nvme_zone_not_in_list(zone));
+
+    if (!zl->size) {
+        zl->head = zl->tail = idx;
+        zone->next = zone->prev = NVME_ZONE_LIST_NIL;
+    } else {
+        ns->zone_array[zl->tail].next = idx;
+        zone->prev = zl->tail;
+        zone->next = NVME_ZONE_LIST_NIL;
+        zl->tail = idx;
+    }
+    zl->size++;
+}
+
+/*
+ * Remove a zone from a zone list. The zone must be linked in the list.
+ */
+void nvme_remove_zone(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone)
+{
+    uint32_t idx = (uint32_t)(zone - ns->zone_array);
+
+    assert(!nvme_zone_not_in_list(zone));
+
+    --zl->size;
+    if (zl->size == 0) {
+        zl->head = NVME_ZONE_LIST_NIL;
+        zl->tail = NVME_ZONE_LIST_NIL;
+    } else if (idx == zl->head) {
+        zl->head = zone->next;
+        ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL;
+    } else if (idx == zl->tail) {
+        zl->tail = zone->prev;
+        ns->zone_array[zl->tail].next = NVME_ZONE_LIST_NIL;
+    } else {
+        ns->zone_array[zone->next].prev = zone->prev;
+        ns->zone_array[zone->prev].next = zone->next;
+    }
+
+    zone->prev = zone->next = 0;
+}
+
+static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp)
+{
+    uint64_t zone_size, zone_cap;
+    uint32_t nz, lbasz = ns->blkconf.logical_block_size;
+
+    if (ns->params.zone_size_mb) {
+        zone_size = ns->params.zone_size_mb;
+    } else {
+        zone_size = NVME_DEFAULT_ZONE_SIZE;
+    }
+    if (ns->params.zone_capacity_mb) {
+        zone_cap = ns->params.zone_capacity_mb;
+    } else {
+        zone_cap = zone_size;
+    }
+    ns->zone_size = zone_size * MiB / lbasz;
+    ns->zone_capacity = zone_cap * MiB / lbasz;
+    if (ns->zone_capacity > ns->zone_size) {
+        error_setg(errp, "zone capacity exceeds zone size");
+        return -1;
+    }
+
+    nz = DIV_ROUND_UP(ns->size / lbasz, ns->zone_size);
+    ns->num_zones = nz;
+    ns->zone_array_size = sizeof(NvmeZone) * nz;
+    ns->zone_size_log2 = 0;
+    if (is_power_of_2(ns->zone_size)) {
+        ns->zone_size_log2 = 63 - clz64(ns->zone_size);
+    }
+
+    return 0;
+}
+
+static void nvme_init_zone_meta(NvmeNamespace *ns)
+{
+    uint64_t start = 0, zone_size = ns->zone_size;
+    uint64_t capacity = ns->num_zones * zone_size;
+    NvmeZone *zone;
+    int i;
+
+    ns->zone_array = g_malloc0(ns->zone_array_size);
+    ns->exp_open_zones = g_malloc0(sizeof(NvmeZoneList));
+    ns->imp_open_zones = g_malloc0(sizeof(NvmeZoneList));
+    ns->closed_zones = g_malloc0(sizeof(NvmeZoneList));
+    ns->full_zones = g_malloc0(sizeof(NvmeZoneList));
+
+    nvme_init_zone_list(ns->exp_open_zones);
+    nvme_init_zone_list(ns->imp_open_zones);
+    nvme_init_zone_list(ns->closed_zones);
+    nvme_init_zone_list(ns->full_zones);
+
+    zone = ns->zone_array;
+    for (i = 0; i < ns->num_zones; i++, zone++) {
+        if (start + zone_size > capacity) {
+            zone_size = capacity - start;
+        }
+        zone->d.zt = NVME_ZONE_TYPE_SEQ_WRITE;
+        nvme_set_zone_state(zone, NVME_ZONE_STATE_EMPTY);
+        zone->d.za = 0;
+        zone->d.zcap = ns->zone_capacity;
+        zone->d.zslba = start;
+        zone->d.wp = start;
+        zone->w_ptr = start;
+        zone->prev = 0;
+        zone->next = 0;
+        start += zone_size;
+    }
+}
+
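A worked example of the arithmetic in nvme_calc_zone_geometry() above,
assuming the default 512-byte logical_block_size and the default 128 MiB
zone size: zone_size = 128 MiB / 512 B = 262144 LBAs, zone_capacity equals
zone_size when no override is given, num_zones = DIV_ROUND_UP(nlbas, 262144),
and since 262144 == 1 << 18, zone_size_log2 becomes 18, which enables the
shift-based fast path in nvme_zone_idx() later in this patch. A compile-time
sanity check of that arithmetic, as a sketch:

```c
#include "qemu/units.h"

/* 128 MiB zones on 512-byte LBAs are exactly 2^18 blocks long. */
QEMU_BUILD_BUG_ON(128 * MiB / 512 != 1 << 18);
```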
+static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index,
+                              Error **errp)
+{
+    NvmeIdNsZoned *id_ns_z;
+
+    if (n->params.fill_pattern == 0) {
+        ns->id_ns.dlfeat |= 0x01;
+    } else if (n->params.fill_pattern == 0xff) {
+        ns->id_ns.dlfeat |= 0x02;
+    }
+
+    if (nvme_calc_zone_geometry(ns, errp) != 0) {
+        return -1;
+    }
+
+    nvme_init_zone_meta(ns);
+
+    id_ns_z = g_malloc0(sizeof(NvmeIdNsZoned));
+
+    /* MAR/MOR are zeroes-based, 0xffffffff means no limit */
+    id_ns_z->mar = 0xffffffff;
+    id_ns_z->mor = 0xffffffff;
+    id_ns_z->zoc = 0;
+    id_ns_z->ozcs = ns->params.cross_zone_read ? 0x01 : 0x00;
+
+    id_ns_z->lbafe[lba_index].zsze = cpu_to_le64(ns->zone_size);
+    id_ns_z->lbafe[lba_index].zdes = 0; /* FIXME make helper */
+
+    ns->csi = NVME_CSI_ZONED;
+    ns->id_ns.ncap = cpu_to_le64(ns->zone_capacity * ns->num_zones);
+    ns->id_ns.nuse = ns->id_ns.ncap;
+    ns->id_ns.nsze = ns->id_ns.ncap;
+
+    ns->id_ns_zoned = id_ns_z;

     return 0;
 }
@@ -103,6 +258,12 @@ int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp)
         return -1;
     }

+    if (ns->params.zoned) {
+        if (nvme_zoned_init_ns(n, ns, 0, errp) != 0) {
+            return -1;
+        }
+    }
+
     return 0;
 }
@@ -116,6 +277,16 @@ void nvme_ns_flush(NvmeNamespace *ns)
     blk_flush(ns->blkconf.blk);
 }

+void nvme_ns_cleanup(NvmeNamespace *ns)
+{
+    g_free(ns->id_ns_zoned);
+    g_free(ns->zone_array);
+    g_free(ns->exp_open_zones);
+    g_free(ns->imp_open_zones);
+    g_free(ns->closed_zones);
+    g_free(ns->full_zones);
+}
+
 static void nvme_ns_realize(DeviceState *dev, Error **errp)
 {
     NvmeNamespace *ns = NVME_NS(dev);
@@ -133,6 +304,14 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp)
 static Property nvme_ns_props[] = {
     DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf),
     DEFINE_PROP_UINT32("nsid", NvmeNamespace, params.nsid, 0),
+
+    DEFINE_PROP_BOOL("zoned", NvmeNamespace, params.zoned, false),
+    DEFINE_PROP_UINT64("zone_size", NvmeNamespace, params.zone_size_mb,
+                       NVME_DEFAULT_ZONE_SIZE),
+    DEFINE_PROP_UINT64("zone_capacity", NvmeNamespace,
+                       params.zone_capacity_mb, 0),
+    DEFINE_PROP_BOOL("cross_zone_read", NvmeNamespace,
+                     params.cross_zone_read, false),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index 04172f083e..daa13546c4 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -38,7 +38,6 @@ typedef struct NvmeZoneList {

 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
-    uint8_t  csi;
     bool     attached;
     QemuUUID uuid;
@@ -52,6 +51,7 @@ typedef struct NvmeNamespace {
     DeviceState  parent_obj;
     BlockConf    blkconf;
     int32_t      bootindex;
+    uint8_t      csi;
     int64_t      size;
     NvmeIdNs     id_ns;
@@ -107,6 +107,7 @@ typedef struct NvmeCtrl NvmeCtrl;
 int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
 void nvme_ns_drain(NvmeNamespace *ns);
 void nvme_ns_flush(NvmeNamespace *ns);
+void nvme_ns_cleanup(NvmeNamespace *ns);

 static inline uint8_t nvme_get_zone_state(NvmeZone *zone)
 {
@@ -188,4 +189,7 @@ static inline NvmeZone *nvme_next_zone_in_list(NvmeNamespace *ns, NvmeZone *z,
     return &ns->zone_array[z->next];
 }

+void nvme_add_zone_tail(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone);
+void nvme_remove_zone(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone);
+
 #endif /* NVME_NS_H */

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 63ad03d6d6..38e25a4d1f 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -54,6 +54,7 @@
 #include "qemu/osdep.h"
 #include "qemu/units.h"
 #include "qemu/error-report.h"
+#include "crypto/random.h"
 #include "hw/block/block.h"
 #include "hw/pci/msix.h"
 #include "hw/pci/pci.h"
@@ -127,6 +128,46 @@ static uint16_t nvme_sqid(NvmeRequest *req)
     return le16_to_cpu(req->sq->sqid);
 }
+static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone,
+                                   uint8_t state)
+{
+    if (!nvme_zone_not_in_list(zone)) {
+        switch (nvme_get_zone_state(zone)) {
+        case NVME_ZONE_STATE_EXPLICITLY_OPEN:
+            nvme_remove_zone(ns, ns->exp_open_zones, zone);
+            break;
+        case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+            nvme_remove_zone(ns, ns->imp_open_zones, zone);
+            break;
+        case NVME_ZONE_STATE_CLOSED:
+            nvme_remove_zone(ns, ns->closed_zones, zone);
+            break;
+        case NVME_ZONE_STATE_FULL:
+            nvme_remove_zone(ns, ns->full_zones, zone);
+        }
+    }
+
+    nvme_set_zone_state(zone, state);
+
+    switch (state) {
+    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
+        nvme_add_zone_tail(ns, ns->exp_open_zones, zone);
+        break;
+    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+        nvme_add_zone_tail(ns, ns->imp_open_zones, zone);
+        break;
+    case NVME_ZONE_STATE_CLOSED:
+        nvme_add_zone_tail(ns, ns->closed_zones, zone);
+        break;
+    case NVME_ZONE_STATE_FULL:
+        nvme_add_zone_tail(ns, ns->full_zones, zone);
+        break;
+    case NVME_ZONE_STATE_READ_ONLY:
+        break;
+    default:
+        zone->d.za = 0;
+    }
+}
+
 static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr)
 {
     hwaddr low = n->ctrl_mem.addr;
@@ -813,7 +854,7 @@ static void nvme_process_aers(void *opaque)

         req = n->aer_reqs[n->outstanding_aers];

-        result = (NvmeAerResult *) &req->cqe.result;
+        result = (NvmeAerResult *) &req->cqe.result32;
         result->event_type = event->result.event_type;
         result->event_info = event->result.event_info;
         result->log_page = event->result.log_page;
@@ -882,6 +923,200 @@ static inline uint16_t nvme_check_bounds(NvmeCtrl *n, NvmeNamespace *ns,
     return NVME_SUCCESS;
 }

+static void nvme_fill_data(QEMUSGList *qsg, QEMUIOVector *iov, uint64_t offset,
+                           uint32_t max_len, uint8_t pattern)
+{
+    ScatterGatherEntry *entry;
+    uint32_t len, ent_len;
+
+    if (qsg->nsg > 0) {
+        entry = qsg->sg;
+        len = qsg->size;
+        if (max_len) {
+            len = MIN(len, max_len);
+        }
+        for (; len > 0; len -= ent_len) {
+            ent_len = MIN(len, entry->len);
+            if (offset > ent_len) {
+                offset -= ent_len;
+            } else if (offset != 0) {
+                dma_memory_set(qsg->as, entry->base + offset,
+                               pattern, ent_len - offset);
+                offset = 0;
+            } else {
+                dma_memory_set(qsg->as, entry->base, pattern, ent_len);
+            }
+            entry++;
+        }
+    } else if (iov->iov) {
+        len = iov_size(iov->iov, iov->niov);
+        if (max_len) {
+            len = MIN(len, max_len);
+        }
+        qemu_iovec_memset(iov, offset, pattern, len - offset);
+    }
+}
+
+static uint16_t nvme_check_zone_write(NvmeZone *zone, uint64_t slba,
+                                      uint32_t nlb)
+{
+    uint16_t status;
+
+    if (unlikely((slba + nlb) > nvme_zone_wr_boundary(zone))) {
+        return NVME_ZONE_BOUNDARY_ERROR;
+    }
+
+    switch (nvme_get_zone_state(zone)) {
+    case NVME_ZONE_STATE_EMPTY:
+    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
+    case NVME_ZONE_STATE_CLOSED:
+        status = NVME_SUCCESS;
+        break;
+    case NVME_ZONE_STATE_FULL:
+        status = NVME_ZONE_FULL;
+        break;
+    case NVME_ZONE_STATE_OFFLINE:
+        status = NVME_ZONE_OFFLINE;
+        break;
+    case NVME_ZONE_STATE_READ_ONLY:
+        status = NVME_ZONE_READ_ONLY;
+        break;
+    default:
+        assert(false);
+    }
+    return status;
+}
+
+static uint16_t nvme_check_zone_read(NvmeNamespace *ns, NvmeZone *zone,
+                                     uint64_t slba, uint32_t nlb)
+{
+    uint64_t lba = slba, count;
+    uint16_t status;
+    uint8_t zs;
+
+    do {
+        if (!ns->params.cross_zone_read &&
+            (lba + nlb > nvme_zone_rd_boundary(ns, zone))) {
+            return NVME_ZONE_BOUNDARY_ERROR | NVME_DNR;
+        }
+
+        zs = nvme_get_zone_state(zone);
+        switch (zs) {
+        case NVME_ZONE_STATE_EMPTY:
+        case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+        case NVME_ZONE_STATE_EXPLICITLY_OPEN:
+        case NVME_ZONE_STATE_FULL:
+        case NVME_ZONE_STATE_CLOSED:
+        case NVME_ZONE_STATE_READ_ONLY:
+            status = NVME_SUCCESS;
+            break;
+        case NVME_ZONE_STATE_OFFLINE:
+            status = NVME_ZONE_OFFLINE | NVME_DNR;
+            break;
+        default:
+            assert(false);
+        }
+        if (status != NVME_SUCCESS) {
+            break;
+        }
+
+        if (lba + nlb > nvme_zone_rd_boundary(ns, zone)) {
+            count = nvme_zone_rd_boundary(ns, zone) - lba;
+        } else {
+            count = nlb;
+        }
+
+        lba += count;
+        nlb -= count;
+        zone++;
+    } while (nlb);
+
+    return status;
+}
+
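The boundary rule enforced by nvme_check_zone_read() above can be stated as
one predicate (this helper is hypothetical, for illustration only): a read
that extends past the zone's read boundary is only legal when the namespace
was created with cross_zone_read=true, and even then the loop requires that
every zone the read touches is not Offline.

```c
/* Does this read extend past the end of the zone it starts in? */
static bool nvme_read_crosses_zone(NvmeNamespace *ns, NvmeZone *zone,
                                   uint64_t slba, uint32_t nlb)
{
    return slba + nlb > nvme_zone_rd_boundary(ns, zone);
}
```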
+static inline uint32_t nvme_zone_idx(NvmeNamespace *ns, uint64_t slba)
+{
+    return ns->zone_size_log2 > 0 ? slba >> ns->zone_size_log2 :
+                                    slba / ns->zone_size;
+}
+
+static bool nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req,
+                                      bool failed)
+{
+    NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
+    NvmeZone *zone;
+    uint64_t slba, start_wp = req->cqe.result64;
+    uint32_t nlb, zone_idx;
+    uint8_t zs;
+
+    if (rw->opcode != NVME_CMD_WRITE &&
+        rw->opcode != NVME_CMD_ZONE_APPEND &&
+        rw->opcode != NVME_CMD_WRITE_ZEROES) {
+        return false;
+    }
+
+    slba = le64_to_cpu(rw->slba);
+    nlb = le16_to_cpu(rw->nlb) + 1;
+    zone_idx = nvme_zone_idx(ns, slba);
+    assert(zone_idx < ns->num_zones);
+    zone = &ns->zone_array[zone_idx];
+
+    if (!failed && zone->w_ptr < start_wp + nlb) {
+        /*
+         * A preceding queued write to the zone has failed,
+         * now this write is not at the WP, fail it too.
+         */
+        failed = true;
+    }
+
+    if (failed) {
+        if (zone->w_ptr > start_wp) {
+            zone->w_ptr = start_wp;
+        }
+        req->cqe.result64 = 0;
+    } else if (zone->w_ptr == nvme_zone_wr_boundary(zone)) {
+        zs = nvme_get_zone_state(zone);
+        switch (zs) {
+        case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+        case NVME_ZONE_STATE_EXPLICITLY_OPEN:
+        case NVME_ZONE_STATE_CLOSED:
+        case NVME_ZONE_STATE_EMPTY:
+            nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL);
+            /* fall through */
+        case NVME_ZONE_STATE_FULL:
+            break;
+        default:
+            assert(false);
+        }
+        zone->d.wp = zone->w_ptr;
+    } else {
+        zone->d.wp += nlb;
+    }
+
+    return failed;
+}
+
+static uint64_t nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone,
+                                     uint32_t nlb)
+{
+    uint64_t result = zone->w_ptr;
+    uint8_t zs;
+
+    zone->w_ptr += nlb;
+
+    if (zone->w_ptr < nvme_zone_wr_boundary(zone)) {
+        zs = nvme_get_zone_state(zone);
+        switch (zs) {
+        case NVME_ZONE_STATE_EMPTY:
+        case NVME_ZONE_STATE_CLOSED:
+            nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN);
+        }
+    }
+
+    return result;
+}
+
 static void nvme_rw_cb(void *opaque, int ret)
 {
     NvmeRequest *req = opaque;
@@ -896,10 +1131,27 @@ static void nvme_rw_cb(void *opaque, int ret)
     trace_pci_nvme_rw_cb(nvme_cid(req), blk_name(blk));

     if (!ret) {
-        block_acct_done(stats, acct);
+        if (ns->params.zoned) {
+            if (nvme_finalize_zoned_write(ns, req, false)) {
+                ret = EIO;
+                block_acct_failed(stats, acct);
+                req->status = NVME_ZONE_INVALID_WRITE;
+            } else if (req->fill_ofs >= 0) {
+                nvme_fill_data(&req->qsg, &req->iov, req->fill_ofs,
+                               req->fill_len,
+                               nvme_ctrl(req)->params.fill_pattern);
+            }
+        }
+        if (!ret) {
+            block_acct_done(stats, acct);
+        }
     } else {
         uint16_t status;

+        if (ns->params.zoned) {
+            nvme_finalize_zoned_write(ns, req, true);
+        }
+
         block_acct_failed(stats, acct);

         switch (req->cmd.opcode) {
@@ -953,6 +1205,7 @@ static uint16_t nvme_do_aio(BlockBackend *blk, int64_t offset, size_t len,
         break;

     case NVME_CMD_WRITE:
+    case NVME_CMD_ZONE_APPEND:
         is_write = true;

         /* fallthrough */
@@ -997,8 +1250,10 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
     NvmeNamespace *ns = req->ns;
     uint64_t slba = le64_to_cpu(rw->slba);
     uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
+    NvmeZone *zone = NULL;
     uint64_t offset = nvme_l2b(ns, slba);
     uint32_t count = nvme_l2b(ns, nlb);
+    uint32_t zone_idx;
     uint16_t status;

     trace_pci_nvme_write_zeroes(nvme_cid(req), nvme_nsid(ns), slba, nlb);
@@ -1009,20 +1264,43 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
         return status;
     }

+    if (ns->params.zoned) {
+        zone_idx = nvme_zone_idx(ns, slba);
+        assert(zone_idx < ns->num_zones);
+        zone = &ns->zone_array[zone_idx];
+
+        status = nvme_check_zone_write(zone, slba, nlb);
+        if (status != NVME_SUCCESS) {
+            trace_pci_nvme_err_zone_write_not_ok(slba, nlb, status);
+            return status | NVME_DNR;
+        }
+
+        assert(nvme_wp_is_valid(zone));
+        if (unlikely(slba != zone->w_ptr)) {
+            trace_pci_nvme_err_write_not_at_wp(slba, zone->d.zslba,
+                                               zone->w_ptr);
+            return NVME_ZONE_INVALID_WRITE | NVME_DNR;
+        }
+
+        req->cqe.result64 = nvme_advance_zone_wp(ns, zone, nlb);
+    }
+
     return nvme_do_aio(ns->blkconf.blk, offset, count, req);
 }

-static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
+static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req, bool append)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
     NvmeNamespace *ns = req->ns;
-    uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
+    uint32_t nlb = (uint32_t)le16_to_cpu(rw->nlb) + 1;
     uint64_t slba = le64_to_cpu(rw->slba);
-    uint64_t data_size = nvme_l2b(ns, nlb);
-    uint64_t data_offset = nvme_l2b(ns, slba);
-    enum BlockAcctType acct = req->cmd.opcode == NVME_CMD_WRITE ?
-        BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
+    uint64_t data_size = nvme_l2b(ns, nlb);
+    uint64_t data_offset;
+
+    NvmeZone *zone = NULL;
+    uint32_t zone_idx = 0;
+    bool is_write = rw->opcode == NVME_CMD_WRITE || append;
+    enum BlockAcctType acct = is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ;
     uint16_t status;

     trace_pci_nvme_rw(nvme_cid(req), nvme_io_opc_str(rw->opcode),
@@ -1040,18 +1318,468 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req)
         goto invalid;
     }

+    if (ns->params.zoned) {
+        zone_idx = nvme_zone_idx(ns, slba);
+        assert(zone_idx < ns->num_zones);
+        zone = &ns->zone_array[zone_idx];
+
+        if (is_write) {
+            status = nvme_check_zone_write(zone, slba, nlb);
+            if (status != NVME_SUCCESS) {
+                trace_pci_nvme_err_zone_write_not_ok(slba, nlb, status);
+                goto invalid;
+            }
+
+            assert(nvme_wp_is_valid(zone));
+            if (append) {
+                if (unlikely(slba != zone->d.zslba)) {
+                    trace_pci_nvme_err_append_not_at_start(slba, zone->d.zslba);
+                    status = NVME_ZONE_INVALID_WRITE | NVME_DNR;
+                    goto invalid;
+                }
+                if (data_size > (n->page_size << n->zasl)) {
+                    trace_pci_nvme_err_append_too_large(slba, nlb, n->zasl);
+                    status = NVME_INVALID_FIELD | NVME_DNR;
+                    goto invalid;
+                }
+                slba = zone->w_ptr;
+            } else if (unlikely(slba != zone->w_ptr)) {
+                trace_pci_nvme_err_write_not_at_wp(slba, zone->d.zslba,
+                                                   zone->w_ptr);
+                status = NVME_ZONE_INVALID_WRITE | NVME_DNR;
+                goto invalid;
+            }
+            req->fill_ofs = -1LL;
+        } else {
+            status = nvme_check_zone_read(ns, zone, slba, nlb);
+            if (status != NVME_SUCCESS) {
+                trace_pci_nvme_err_zone_read_not_ok(slba, nlb, status);
+                goto invalid;
+            }
+
+            if (slba + nlb > zone->w_ptr) {
+                /*
+                 * All or some data is read above the WP. Need to
Need to + * fill out the buffer area that has no backing data + * with a predefined data pattern (zeros by default) + */ + if (slba >= zone->w_ptr) { + req->fill_ofs = 0; + } else { + req->fill_ofs = nvme_l2b(ns, zone->w_ptr - slba); + } + req->fill_len = nvme_l2b(ns, + nvme_zone_rd_boundary(ns, zone) - slba); + } else { + req->fill_ofs = -1LL; + } + } + } else if (append) { + trace_pci_nvme_err_invalid_opc(rw->opcode); + status = NVME_INVALID_OPCODE | NVME_DNR; + goto invalid; + } + status = nvme_map_dptr(n, data_size, req); if (status) { goto invalid; } + if (ns->params.zoned) { + if (unlikely(req->fill_ofs == 0 && + slba + nlb <= nvme_zone_rd_boundary(ns, zone))) { + /* No backend I/O necessary, only need to fill the buffer */ + nvme_fill_data(&req->qsg, &req->iov, 0, 0, n->params.fill_pattern); + req->status = NVME_SUCCESS; + return NVME_SUCCESS; + } + if (is_write) { + req->cqe.result64 = nvme_advance_zone_wp(ns, zone, nlb); + } + } + + data_offset = nvme_l2b(ns, slba); + return nvme_do_aio(ns->blkconf.blk, data_offset, data_size, req); invalid: block_acct_invalid(blk_get_stats(ns->blkconf.blk), acct); + return status | NVME_DNR; +} + +static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeNamespace *ns, NvmeCmd *c, + uint64_t *slba, uint32_t *zone_idx) +{ + uint32_t dw10 = le32_to_cpu(c->cdw10); + uint32_t dw11 = le32_to_cpu(c->cdw11); + + if (!ns->params.zoned) { + trace_pci_nvme_err_invalid_opc(c->opcode); + return NVME_INVALID_OPCODE | NVME_DNR; + } + + *slba = ((uint64_t)dw11) << 32 | dw10; + if (unlikely(*slba >= ns->id_ns.nsze)) { + trace_pci_nvme_err_invalid_lba_range(*slba, 0, ns->id_ns.nsze); + *slba = 0; + return NVME_LBA_RANGE | NVME_DNR; + } + + *zone_idx = nvme_zone_idx(ns, *slba); + assert(*zone_idx < ns->num_zones); + + return NVME_SUCCESS; +} + +static uint16_t nvme_open_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EMPTY: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN); + /* fall through */ + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_open_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_CLOSED; +} + +static uint16_t nvme_close_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); + /* fall through */ + case NVME_ZONE_STATE_CLOSED: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_close_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_IMPLICITLY_OPEN || + state == NVME_ZONE_STATE_EXPLICITLY_OPEN; +} + +static uint16_t nvme_finish_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_EMPTY: + zone->w_ptr = nvme_zone_wr_boundary(zone); + zone->d.wp = zone->w_ptr; + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL); + /* fall through */ + case NVME_ZONE_STATE_FULL: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_finish_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_IMPLICITLY_OPEN || + state == NVME_ZONE_STATE_EXPLICITLY_OPEN || + state == NVME_ZONE_STATE_CLOSED; +} + +static uint16_t 
nvme_reset_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_EXPLICITLY_OPEN: + case NVME_ZONE_STATE_IMPLICITLY_OPEN: + case NVME_ZONE_STATE_CLOSED: + case NVME_ZONE_STATE_FULL: + zone->w_ptr = zone->d.zslba; + zone->d.wp = zone->w_ptr; + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EMPTY); + /* fall through */ + case NVME_ZONE_STATE_EMPTY: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_reset_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_IMPLICITLY_OPEN || + state == NVME_ZONE_STATE_EXPLICITLY_OPEN || + state == NVME_ZONE_STATE_CLOSED || + state == NVME_ZONE_STATE_FULL; +} + +static uint16_t nvme_offline_zone(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + switch (state) { + case NVME_ZONE_STATE_READ_ONLY: + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_OFFLINE); + /* fall through */ + case NVME_ZONE_STATE_OFFLINE: + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + +static bool nvme_cond_offline_all(uint8_t state) +{ + return state == NVME_ZONE_STATE_READ_ONLY; +} + +typedef uint16_t (*op_handler_t)(NvmeNamespace *, NvmeZone *, + uint8_t); +typedef bool (*need_to_proc_zone_t)(uint8_t); + +static uint16_t name_do_zone_op(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state, bool all, + op_handler_t op_hndlr, + need_to_proc_zone_t proc_zone) +{ + int i; + uint16_t status = 0; + + if (!all) { + status = op_hndlr(ns, zone, state); + } else { + for (i = 0; i < ns->num_zones; i++, zone++) { + state = nvme_get_zone_state(zone); + if (proc_zone(state)) { + status = op_hndlr(ns, zone, state); + if (status != NVME_SUCCESS) { + break; + } + } + } + } + return status; } +static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeCmd *cmd = (NvmeCmd *)&req->cmd; + NvmeNamespace *ns = req->ns; + uint32_t dw13 = le32_to_cpu(cmd->cdw13); + uint64_t slba = 0; + uint32_t zone_idx = 0; + uint16_t status; + uint8_t action, state; + bool all; + NvmeZone *zone; + + action = dw13 & 0xff; + all = dw13 & 0x100; + + req->status = NVME_SUCCESS; + + if (!all) { + status = nvme_get_mgmt_zone_slba_idx(ns, cmd, &slba, &zone_idx); + if (status) { + return status; + } + } + + zone = &ns->zone_array[zone_idx]; + if (slba != zone->d.zslba) { + trace_pci_nvme_err_unaligned_zone_cmd(action, slba, zone->d.zslba); + return NVME_INVALID_FIELD | NVME_DNR; + } + state = nvme_get_zone_state(zone); + + switch (action) { + + case NVME_ZONE_ACTION_OPEN: + trace_pci_nvme_open_zone(slba, zone_idx, all); + status = name_do_zone_op(ns, zone, state, all, + nvme_open_zone, nvme_cond_open_all); + break; + + case NVME_ZONE_ACTION_CLOSE: + trace_pci_nvme_close_zone(slba, zone_idx, all); + status = name_do_zone_op(ns, zone, state, all, + nvme_close_zone, nvme_cond_close_all); + break; + + case NVME_ZONE_ACTION_FINISH: + trace_pci_nvme_finish_zone(slba, zone_idx, all); + status = name_do_zone_op(ns, zone, state, all, + nvme_finish_zone, nvme_cond_finish_all); + break; + + case NVME_ZONE_ACTION_RESET: + trace_pci_nvme_reset_zone(slba, zone_idx, all); + status = name_do_zone_op(ns, zone, state, all, + nvme_reset_zone, nvme_cond_reset_all); + break; + + case NVME_ZONE_ACTION_OFFLINE: + trace_pci_nvme_offline_zone(slba, zone_idx, all); + status = name_do_zone_op(ns, zone, state, all, + nvme_offline_zone, nvme_cond_offline_all); + break; + + case NVME_ZONE_ACTION_SET_ZD_EXT: + trace_pci_nvme_set_descriptor_extension(slba, zone_idx); + return NVME_INVALID_FIELD | NVME_DNR; + break; + + default: 
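+ /* Unrecognized Zone Send Action (ZSA) value in cdw13 */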
+ trace_pci_nvme_err_invalid_mgmt_action(action); + status = NVME_INVALID_FIELD; + } + + if (status == NVME_ZONE_INVAL_TRANSITION) { + trace_pci_nvme_err_invalid_zone_state_transition(state, action, slba, + zone->d.za); + } + if (status) { + status |= NVME_DNR; + } + + return status; +} + +static bool nvme_zone_matches_filter(uint32_t zafs, NvmeZone *zl) +{ + int zs = nvme_get_zone_state(zl); + + switch (zafs) { + case NVME_ZONE_REPORT_ALL: + return true; + case NVME_ZONE_REPORT_EMPTY: + return zs == NVME_ZONE_STATE_EMPTY; + case NVME_ZONE_REPORT_IMPLICITLY_OPEN: + return zs == NVME_ZONE_STATE_IMPLICITLY_OPEN; + case NVME_ZONE_REPORT_EXPLICITLY_OPEN: + return zs == NVME_ZONE_STATE_EXPLICITLY_OPEN; + case NVME_ZONE_REPORT_CLOSED: + return zs == NVME_ZONE_STATE_CLOSED; + case NVME_ZONE_REPORT_FULL: + return zs == NVME_ZONE_STATE_FULL; + case NVME_ZONE_REPORT_READ_ONLY: + return zs == NVME_ZONE_STATE_READ_ONLY; + case NVME_ZONE_REPORT_OFFLINE: + return zs == NVME_ZONE_STATE_OFFLINE; + default: + return false; + } +} + +static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) +{ + NvmeCmd *cmd = (NvmeCmd *)&req->cmd; + NvmeNamespace *ns = req->ns; + /* cdw12 is zero-based number of dwords to return. Convert to bytes */ + uint32_t len = (le32_to_cpu(cmd->cdw12) + 1) << 2; + uint32_t dw13 = le32_to_cpu(cmd->cdw13); + uint32_t zone_idx, zra, zrasf, partial; + uint64_t max_zones, nr_zones = 0; + uint16_t ret; + uint64_t slba; + NvmeZoneDescr *z; + NvmeZone *zs; + NvmeZoneReportHeader *header; + void *buf, *buf_p; + size_t zone_entry_sz; + + req->status = NVME_SUCCESS; + + ret = nvme_get_mgmt_zone_slba_idx(ns, cmd, &slba, &zone_idx); + if (ret) { + return ret; + } + + if (len < sizeof(NvmeZoneReportHeader)) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + zra = dw13 & 0xff; + if (!(zra == NVME_ZONE_REPORT || zra == NVME_ZONE_REPORT_EXTENDED)) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + if (zra == NVME_ZONE_REPORT_EXTENDED) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + zrasf = (dw13 >> 8) & 0xff; + if (zrasf > NVME_ZONE_REPORT_OFFLINE) { + return NVME_INVALID_FIELD | NVME_DNR; + } + + partial = (dw13 >> 16) & 0x01; + + zone_entry_sz = sizeof(NvmeZoneDescr); + + max_zones = (len - sizeof(NvmeZoneReportHeader)) / zone_entry_sz; + buf = g_malloc0(len); + + header = (NvmeZoneReportHeader *)buf; + buf_p = buf + sizeof(NvmeZoneReportHeader); + + while (zone_idx < ns->num_zones && nr_zones < max_zones) { + zs = &ns->zone_array[zone_idx]; + + if (!nvme_zone_matches_filter(zrasf, zs)) { + zone_idx++; + continue; + } + + z = (NvmeZoneDescr *)buf_p; + buf_p += sizeof(NvmeZoneDescr); + nr_zones++; + + z->zt = zs->d.zt; + z->zs = zs->d.zs; + z->zcap = cpu_to_le64(zs->d.zcap); + z->zslba = cpu_to_le64(zs->d.zslba); + z->za = zs->d.za; + + if (nvme_wp_is_valid(zs)) { + z->wp = cpu_to_le64(zs->d.wp); + } else { + z->wp = cpu_to_le64(~0ULL); + } + + zone_idx++; + } + + if (!partial) { + for (; zone_idx < ns->num_zones; zone_idx++) { + zs = &ns->zone_array[zone_idx]; + if (nvme_zone_matches_filter(zrasf, zs)) { + nr_zones++; + } + } + } + header->nr_zones = cpu_to_le64(nr_zones); + + ret = nvme_dma(n, (uint8_t *)buf, len, DMA_DIRECTION_FROM_DEVICE, req); + + g_free(buf); + + return ret; +} + static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) { uint32_t nsid = le32_to_cpu(req->cmd.nsid); @@ -1073,9 +1801,15 @@ static uint16_t nvme_io_cmd(NvmeCtrl *n, NvmeRequest *req) return nvme_flush(n, req); case NVME_CMD_WRITE_ZEROES: return nvme_write_zeroes(n, req); + case NVME_CMD_ZONE_APPEND: + 
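/* Zone Append is serviced by the common read/write path with append set */
+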
return nvme_rw(n, req, true); case NVME_CMD_WRITE: case NVME_CMD_READ: - return nvme_rw(n, req); + return nvme_rw(n, req, false); + case NVME_CMD_ZONE_MGMT_SEND: + return nvme_zone_mgmt_send(n, req); + case NVME_CMD_ZONE_MGMT_RECV: + return nvme_zone_mgmt_recv(n, req); default: trace_pci_nvme_err_invalid_opc(req->cmd.opcode); return NVME_INVALID_OPCODE | NVME_DNR; @@ -1301,7 +2035,7 @@ static uint16_t nvme_error_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len, DMA_DIRECTION_FROM_DEVICE, req); } -static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len, +static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len, uint64_t off, NvmeRequest *req) { NvmeEffectsLog cmd_eff_log = {}; @@ -1326,11 +2060,20 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint32_t buf_len, acs[NVME_ADM_CMD_GET_LOG_PAGE] = NVME_CMD_EFFECTS_CSUPP; acs[NVME_ADM_CMD_ASYNC_EV_REQ] = NVME_CMD_EFFECTS_CSUPP; - iocs[NVME_CMD_FLUSH] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; - iocs[NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFFECTS_CSUPP | - NVME_CMD_EFFECTS_LBCC; - iocs[NVME_CMD_WRITE] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; - iocs[NVME_CMD_READ] = NVME_CMD_EFFECTS_CSUPP; + if (NVME_CC_CSS(n->bar.cc) != CSS_ADMIN_ONLY) { + iocs[NVME_CMD_FLUSH] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_WRITE_ZEROES] = NVME_CMD_EFFECTS_CSUPP | + NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_WRITE] = NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_READ] = NVME_CMD_EFFECTS_CSUPP; + } + + if (csi == NVME_CSI_ZONED && NVME_CC_CSS(n->bar.cc) == CSS_CSI) { + iocs[NVME_CMD_ZONE_APPEND] = NVME_CMD_EFFECTS_CSUPP | + NVME_CMD_EFFECTS_LBCC; + iocs[NVME_CMD_ZONE_MGMT_SEND] = NVME_CMD_EFFECTS_CSUPP; + iocs[NVME_CMD_ZONE_MGMT_RECV] = NVME_CMD_EFFECTS_CSUPP; + } trans_len = MIN(sizeof(cmd_eff_log) - off, buf_len); @@ -1349,6 +2092,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) uint8_t lid = dw10 & 0xff; uint8_t lsp = (dw10 >> 8) & 0xf; uint8_t rae = (dw10 >> 15) & 0x1; + uint8_t csi = le32_to_cpu(cmd->cdw14) >> 24; uint32_t numdl, numdu; uint64_t off, lpol, lpou; size_t len; @@ -1382,7 +2126,7 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req) case NVME_LOG_FW_SLOT_INFO: return nvme_fw_log_info(n, len, off, req); case NVME_LOG_CMD_EFFECTS: - return nvme_cmd_effects(n, len, off, req); + return nvme_cmd_effects(n, csi, len, off, req); default: trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid); return NVME_INVALID_FIELD | NVME_DNR; @@ -1502,6 +2246,16 @@ static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req) return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req); } +static inline bool nvme_csi_has_nvm_support(NvmeNamespace *ns) +{ + switch (ns->csi) { + case NVME_CSI_NVM: + case NVME_CSI_ZONED: + return true; + } + return false; +} + static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) { trace_pci_nvme_identify_ctrl(); @@ -1513,11 +2267,16 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req) static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req) { NvmeIdentify *c = (NvmeIdentify *)&req->cmd; + NvmeIdCtrlZoned id = {}; trace_pci_nvme_identify_ctrl_csi(c->csi); if (c->csi == NVME_CSI_NVM) { return nvme_rpt_empty_id_struct(n, req); + } else if (c->csi == NVME_CSI_ZONED) { + id.zasl = n->zasl; + return nvme_dma(n, (uint8_t *)&id, sizeof(id), + DMA_DIRECTION_FROM_DEVICE, req); } return NVME_INVALID_FIELD | NVME_DNR; @@ -1545,8 +2304,12 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, 
NvmeRequest *req, return nvme_rpt_empty_id_struct(n, req); } - return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), - DMA_DIRECTION_FROM_DEVICE, req); + if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) { + return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), + DMA_DIRECTION_FROM_DEVICE, req); + } + + return NVME_INVALID_CMD_SET | NVME_DNR; } static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req, @@ -1571,8 +2334,11 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req, return nvme_rpt_empty_id_struct(n, req); } - if (c->csi == NVME_CSI_NVM) { + if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) { return nvme_rpt_empty_id_struct(n, req); + } else if (c->csi == NVME_CSI_ZONED && ns->csi == NVME_CSI_ZONED) { + return nvme_dma(n, (uint8_t *)ns->id_ns_zoned, sizeof(NvmeIdNsZoned), + DMA_DIRECTION_FROM_DEVICE, req); } return NVME_INVALID_FIELD | NVME_DNR; @@ -1634,7 +2400,7 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req, trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi); - if (c->csi != NVME_CSI_NVM) { + if (c->csi != NVME_CSI_NVM && c->csi != NVME_CSI_ZONED) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1643,7 +2409,7 @@ static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req, if (!ns) { continue; } - if (ns->params.nsid < min_nsid) { + if (ns->params.nsid < min_nsid || c->csi != ns->csi) { continue; } if (only_active && !ns->params.attached) { @@ -1696,19 +2462,29 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req) desc->nidt = NVME_NIDT_CSI; desc->nidl = NVME_NIDL_CSI; list_ptr += sizeof(*desc); - *(uint8_t *)list_ptr = NVME_CSI_NVM; + *(uint8_t *)list_ptr = ns->csi; return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req); } static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req) { + NvmeNamespace *ns; uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {}; static const int data_len = sizeof(list); + int i; trace_pci_nvme_identify_cmd_set(); NVME_SET_CSI(*list, NVME_CSI_NVM); + for (i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + if (ns && ns->params.zoned) { + NVME_SET_CSI(*list, NVME_CSI_ZONED); + break; + } + } + return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req); } @@ -1751,7 +2527,7 @@ static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req) { uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff; - req->cqe.result = 1; + req->cqe.result32 = 1; if (nvme_check_sqid(n, sqid)) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1932,7 +2708,7 @@ defaults: } out: - req->cqe.result = cpu_to_le32(result); + req->cqe.result32 = cpu_to_le32(result); return NVME_SUCCESS; } @@ -2057,8 +2833,8 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req) ((dw11 >> 16) & 0xFFFF) + 1, n->params.max_ioqpairs, n->params.max_ioqpairs); - req->cqe.result = cpu_to_le32((n->params.max_ioqpairs - 1) | - ((n->params.max_ioqpairs - 1) << 16)); + req->cqe.result32 = cpu_to_le32((n->params.max_ioqpairs - 1) | + ((n->params.max_ioqpairs - 1) << 16)); break; case NVME_ASYNCHRONOUS_EVENT_CONF: n->features.async_config = dw11; @@ -2310,16 +3086,28 @@ static int nvme_start_ctrl(NvmeCtrl *n) continue; } ns->params.attached = false; - switch (ns->params.csi) { + switch (ns->csi) { case NVME_CSI_NVM: if (NVME_CC_CSS(n->bar.cc) == CSS_NVM_ONLY || NVME_CC_CSS(n->bar.cc) == CSS_CSI) { ns->params.attached = true; } break; + case NVME_CSI_ZONED: + if (NVME_CC_CSS(n->bar.cc) == CSS_CSI) { + ns->params.attached = true; + } + break; } } + if (!n->zasl_bs) 
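/* no separate zone append size limit was configured: fall back to MDTS */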
{ + assert(n->params.mdts); + n->zasl = n->params.mdts; + } else { + n->zasl = 31 - clz32(n->zasl_bs / n->page_size); + } + nvme_set_timestamp(n, 0ULL); QTAILQ_INIT(&n->aer_queue); @@ -2382,10 +3170,11 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data, case CSS_NVM_ONLY: trace_pci_nvme_css_nvm_cset_selected_by_host(data & 0xffffffff); - break; + break; case CSS_CSI: NVME_SET_CC_CSS(n->bar.cc, CSS_CSI); - trace_pci_nvme_css_all_csets_sel_by_host(data & 0xffffffff); + trace_pci_nvme_css_all_csets_sel_by_host(data & + 0xffffffff); break; case CSS_ADMIN_ONLY: break; @@ -2780,6 +3569,12 @@ static void nvme_init_state(NvmeCtrl *n) n->features.temp_thresh_hi = NVME_TEMPERATURE_WARNING; n->starttime_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL); n->aer_reqs = g_new0(NvmeRequest *, n->params.aerl + 1); + + if (!n->params.zasl_kb) { + n->zasl_bs = n->params.mdts ? 0 : NVME_DEFAULT_MAX_ZA_SIZE * KiB; + } else { + n->zasl_bs = n->params.zasl_kb * KiB; + } } int nvme_register_namespace(NvmeCtrl *n, NvmeNamespace *ns, Error **errp) @@ -2985,8 +3780,9 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev) NVME_CAP_SET_CQR(n->bar.cap, 1); NVME_CAP_SET_TO(n->bar.cap, 0xf); /* - * The device now always supports NS Types, but all commands - * that support CSI field will only handle NVM Command Set. + * The device now always supports NS Types, even when "zoned" property + * is set to zero. If this is the case, all commands that support CSI + * field only handle NVM Command Set. */ NVME_CAP_SET_CSS(n->bar.cap, (CAP_CSS_NVM | CAP_CSS_CSI_SUPP)); NVME_CAP_SET_MPSMAX(n->bar.cap, 4); @@ -3033,9 +3829,21 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) static void nvme_exit(PCIDevice *pci_dev) { NvmeCtrl *n = NVME(pci_dev); + NvmeNamespace *ns; + int i; nvme_clear_ctrl(n); + + for (i = 1; i <= n->num_namespaces; i++) { + ns = nvme_ns(n, i); + if (!ns) { + continue; + } + + nvme_ns_cleanup(ns); + } g_free(n->namespaces); + g_free(n->cq); g_free(n->sq); g_free(n->aer_reqs); @@ -3063,6 +3871,8 @@ static Property nvme_props[] = { DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64), DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7), DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false), + DEFINE_PROP_UINT8("fill_pattern", NvmeCtrl, params.fill_pattern, 0), + DEFINE_PROP_UINT32("zone_append_size_limit", NvmeCtrl, params.zasl_kb, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/include/block/nvme.h b/include/block/nvme.h index a7126e123f..628c665728 100644 --- a/include/block/nvme.h +++ b/include/block/nvme.h @@ -651,8 +651,10 @@ typedef struct QEMU_PACKED NvmeAerResult { } NvmeAerResult; typedef struct QEMU_PACKED NvmeCqe { - uint32_t result; - uint32_t rsvd; + union { + uint64_t result64; + uint32_t result32; + }; uint16_t sq_head; uint16_t sq_id; uint16_t cid; From patchwork Mon Sep 28 02:35:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 272645 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with 
ESMTP id 32C05C4346E; Mon, 28 Sep 2020 02:44:49 +0000 (UTC)
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Subject: [PATCH v5 10/14] hw/block/nvme: Introduce max active and open zone limits
Date: Mon, 28 Sep 2020 11:35:24 +0900
Message-Id: <20200928023528.15260-11-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
References:
<20200928023528.15260-1-dmitry.fomichev@wdc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=68.232.141.245; envelope-from=prvs=5334b480d=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/27 22:35:35 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Added two module properties, "max_active" and "max_open" to control the maximum number of zones that can be active or open. Once these variables are set to non-default values, these limits are checked during I/O and Too Many Active or Too Many Open command status is returned if they are exceeded. Signed-off-by: Hans Holmberg Signed-off-by: Dmitry Fomichev --- hw/block/nvme-ns.c | 42 +++++++++++++++++++- hw/block/nvme-ns.h | 42 ++++++++++++++++++++ hw/block/nvme.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 181 insertions(+), 2 deletions(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 6d9dc9205b..63a2e3f47d 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -126,6 +126,28 @@ void nvme_remove_zone(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone) zone->prev = zone->next = 0; } +/* + * Take the first zone out from a list, return NULL if the list is empty. + */ +NvmeZone *nvme_remove_zone_head(NvmeNamespace *ns, NvmeZoneList *zl) +{ + NvmeZone *zone = nvme_peek_zone_head(ns, zl); + + if (zone) { + --zl->size; + if (zl->size == 0) { + zl->head = NVME_ZONE_LIST_NIL; + zl->tail = NVME_ZONE_LIST_NIL; + } else { + zl->head = zone->next; + ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL; + } + zone->prev = zone->next = 0; + } + + return zone; +} + static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp) { uint64_t zone_size, zone_cap; @@ -156,6 +178,20 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp) ns->zone_size_log2 = 63 - clz64(ns->zone_size); } + /* Make sure that the values of all ZNS properties are sane */ + if (ns->params.max_open_zones > nz) { + error_setg(errp, + "max_open_zones value %u exceeds the number of zones %u", + ns->params.max_open_zones, nz); + return -1; + } + if (ns->params.max_active_zones > nz) { + error_setg(errp, + "max_active_zones value %u exceeds the number of zones %u", + ns->params.max_active_zones, nz); + return -1; + } + return 0; } @@ -215,8 +251,8 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, id_ns_z = g_malloc0(sizeof(NvmeIdNsZoned)); /* MAR/MOR are zeroes-based, 0xffffffff means no limit */ - id_ns_z->mar = 0xffffffff; - id_ns_z->mor = 0xffffffff; + id_ns_z->mar = cpu_to_le32(ns->params.max_active_zones - 1); + id_ns_z->mor = cpu_to_le32(ns->params.max_open_zones - 1); id_ns_z->zoc = 0; id_ns_z->ozcs = ns->params.cross_zone_read ? 
0x01 : 0x00; @@ -312,6 +348,8 @@ static Property nvme_ns_props[] = { params.zone_capacity_mb, 0), DEFINE_PROP_BOOL("cross_zone_read", NvmeNamespace, params.cross_zone_read, false), + DEFINE_PROP_UINT32("max_active", NvmeNamespace, params.max_active_zones, 0), + DEFINE_PROP_UINT32("max_open", NvmeNamespace, params.max_open_zones, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index daa13546c4..0664fe0892 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -45,6 +45,8 @@ typedef struct NvmeNamespaceParams { bool cross_zone_read; uint64_t zone_size_mb; uint64_t zone_capacity_mb; + uint32_t max_active_zones; + uint32_t max_open_zones; } NvmeNamespaceParams; typedef struct NvmeNamespace { @@ -66,6 +68,8 @@ typedef struct NvmeNamespace { uint64_t zone_capacity; uint64_t zone_array_size; uint32_t zone_size_log2; + int32_t nr_open_zones; + int32_t nr_active_zones; NvmeNamespaceParams params; } NvmeNamespace; @@ -189,7 +193,45 @@ static inline NvmeZone *nvme_next_zone_in_list(NvmeNamespace *ns, NvmeZone *z, return &ns->zone_array[z->next]; } +static inline void nvme_aor_inc_open(NvmeNamespace *ns) +{ + assert(ns->nr_open_zones >= 0); + if (ns->params.max_open_zones) { + ns->nr_open_zones++; + assert(ns->nr_open_zones <= ns->params.max_open_zones); + } +} + +static inline void nvme_aor_dec_open(NvmeNamespace *ns) +{ + if (ns->params.max_open_zones) { + assert(ns->nr_open_zones > 0); + ns->nr_open_zones--; + } + assert(ns->nr_open_zones >= 0); +} + +static inline void nvme_aor_inc_active(NvmeNamespace *ns) +{ + assert(ns->nr_active_zones >= 0); + if (ns->params.max_active_zones) { + ns->nr_active_zones++; + assert(ns->nr_active_zones <= ns->params.max_active_zones); + } +} + +static inline void nvme_aor_dec_active(NvmeNamespace *ns) +{ + if (ns->params.max_active_zones) { + assert(ns->nr_active_zones > 0); + ns->nr_active_zones--; + assert(ns->nr_active_zones >= ns->nr_open_zones); + } + assert(ns->nr_active_zones >= 0); +} + void nvme_add_zone_tail(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone); void nvme_remove_zone(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone); +NvmeZone *nvme_remove_zone_head(NvmeNamespace *ns, NvmeZoneList *zl); #endif /* NVME_NS_H */ diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 38e25a4d1f..40947aa659 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -168,6 +168,26 @@ static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone, } } +/* + * Check if we can open a zone without exceeding open/active limits. + * AOR stands for "Active and Open Resources" (see TP 4053 section 2.5). 
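+ *
+ * Illustrative example (not part of the original patch): with
+ * max_active=3 and max_open=2, and two zones already implicitly
+ * open, nvme_aor_check(ns, 0, 1) fails with Too Many Open while
+ * nvme_aor_check(ns, 1, 0) still succeeds.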
+ */ +static int nvme_aor_check(NvmeNamespace *ns, uint32_t act, uint32_t opn) +{ + if (ns->params.max_active_zones != 0 && + ns->nr_active_zones + act > ns->params.max_active_zones) { + trace_pci_nvme_err_insuff_active_res(ns->params.max_active_zones); + return NVME_ZONE_TOO_MANY_ACTIVE | NVME_DNR; + } + if (ns->params.max_open_zones != 0 && + ns->nr_open_zones + opn > ns->params.max_open_zones) { + trace_pci_nvme_err_insuff_open_res(ns->params.max_open_zones); + return NVME_ZONE_TOO_MANY_OPEN | NVME_DNR; + } + + return NVME_SUCCESS; +} + static bool nvme_addr_is_cmb(NvmeCtrl *n, hwaddr addr) { hwaddr low = n->ctrl_mem.addr; @@ -1035,6 +1055,40 @@ static uint16_t nvme_check_zone_read(NvmeNamespace *ns, NvmeZone *zone, return status; } +static void nvme_auto_transition_zone(NvmeNamespace *ns, bool implicit, + bool adding_active) +{ + NvmeZone *zone; + + if (implicit && ns->params.max_open_zones && + ns->nr_open_zones == ns->params.max_open_zones) { + zone = nvme_remove_zone_head(ns, ns->imp_open_zones); + if (zone) { + /* + * Automatically close this implicitly open zone. + */ + nvme_aor_dec_open(ns); + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); + } + } +} + +static uint16_t nvme_auto_open_zone(NvmeNamespace *ns, NvmeZone *zone) +{ + uint16_t status = NVME_SUCCESS; + uint8_t zs = nvme_get_zone_state(zone); + + if (zs == NVME_ZONE_STATE_EMPTY) { + nvme_auto_transition_zone(ns, true, true); + status = nvme_aor_check(ns, 1, 1); + } else if (zs == NVME_ZONE_STATE_CLOSED) { + nvme_auto_transition_zone(ns, true, false); + status = nvme_aor_check(ns, 0, 1); + } + + return status; +} + static inline uint32_t nvme_zone_idx(NvmeNamespace *ns, uint64_t slba) { return ns->zone_size_log2 > 0 ? slba >> ns->zone_size_log2 : @@ -1080,7 +1134,11 @@ static bool nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req, switch (zs) { case NVME_ZONE_STATE_IMPLICITLY_OPEN: case NVME_ZONE_STATE_EXPLICITLY_OPEN: + nvme_aor_dec_open(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(ns); + /* fall through */ case NVME_ZONE_STATE_EMPTY: nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL); /* fall through */ @@ -1109,7 +1167,10 @@ static uint64_t nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone, zs = nvme_get_zone_state(zone); switch (zs) { case NVME_ZONE_STATE_EMPTY: + nvme_aor_inc_active(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_inc_open(ns); nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN); } } @@ -1282,6 +1343,11 @@ static uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req) return NVME_ZONE_INVALID_WRITE | NVME_DNR; } + status = nvme_auto_open_zone(ns, zone); + if (status != NVME_SUCCESS) { + return status; + } + req->cqe.result64 = nvme_advance_zone_wp(ns, zone, nlb); } @@ -1349,6 +1415,12 @@ static uint16_t nvme_rw(NvmeCtrl *n, NvmeRequest *req, bool append) status = NVME_ZONE_INVALID_WRITE | NVME_DNR; goto invalid; } + + status = nvme_auto_open_zone(ns, zone); + if (status != NVME_SUCCESS) { + return status; + } + req->fill_ofs = -1LL; } else { status = nvme_check_zone_read(ns, zone, slba, nlb); @@ -1434,9 +1506,27 @@ static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeNamespace *ns, NvmeCmd *c, static uint16_t nvme_open_zone(NvmeNamespace *ns, NvmeZone *zone, uint8_t state) { + uint16_t status; + switch (state) { case NVME_ZONE_STATE_EMPTY: + nvme_auto_transition_zone(ns, false, true); + status = nvme_aor_check(ns, 1, 0); + if (status != NVME_SUCCESS) { + return status; + } + nvme_aor_inc_active(ns); + /* fall 
through */ case NVME_ZONE_STATE_CLOSED: + status = nvme_aor_check(ns, 0, 1); + if (status != NVME_SUCCESS) { + if (state == NVME_ZONE_STATE_EMPTY) { + nvme_aor_dec_active(ns); + } + return status; + } + nvme_aor_inc_open(ns); + /* fall through */ case NVME_ZONE_STATE_IMPLICITLY_OPEN: nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN); /* fall through */ @@ -1458,6 +1548,7 @@ static uint16_t nvme_close_zone(NvmeNamespace *ns, NvmeZone *zone, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(ns); nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); /* fall through */ case NVME_ZONE_STATE_CLOSED: @@ -1479,7 +1570,11 @@ static uint16_t nvme_finish_zone(NvmeNamespace *ns, NvmeZone *zone, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(ns); + /* fall through */ case NVME_ZONE_STATE_EMPTY: zone->w_ptr = nvme_zone_wr_boundary(zone); zone->d.wp = zone->w_ptr; @@ -1505,7 +1600,11 @@ static uint16_t nvme_reset_zone(NvmeNamespace *ns, NvmeZone *zone, switch (state) { case NVME_ZONE_STATE_EXPLICITLY_OPEN: case NVME_ZONE_STATE_IMPLICITLY_OPEN: + nvme_aor_dec_open(ns); + /* fall through */ case NVME_ZONE_STATE_CLOSED: + nvme_aor_dec_active(ns); + /* fall through */ case NVME_ZONE_STATE_FULL: zone->w_ptr = zone->d.zslba; zone->d.wp = zone->w_ptr; From patchwork Mon Sep 28 02:35:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 272647 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53FFDC4727D for ; Mon, 28 Sep 2020 02:41:56 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DDB1122262 for ; Mon, 28 Sep 2020 02:41:55 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="IONB8yJh" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DDB1122262 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46452 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kMj6x-0004WL-0v for qemu-devel@archiver.kernel.org; Sun, 27 Sep 2020 22:41:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:46464) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kMj1H-000734-Vr; Sun, 27 Sep 2020 22:36:04 -0400 Received: from esa1.hgst.iphmx.com ([68.232.141.245]:21635) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kMj1E-0003Me-9M; Sun, 27 Sep 
2020 22:36:03 -0400
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 11/14] hw/block/nvme: Support Zone Descriptor Extensions
Date: Mon, 28 Sep 2020 11:35:25 +0900
Message-Id: <20200928023528.15260-12-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
References: <20200928023528.15260-1-dmitry.fomichev@wdc.com>

Zone Descriptor Extension is a label that can be assigned to a zone. It can be set to an Empty zone and it stays assigned until the zone is reset. This commit adds a new optional module property, "zone_descr_ext_size". Its value must be a multiple of 64 bytes. If this value is non-zero, it becomes possible to assign extensions of that size to any Empty zones.
The default value for this property is 0, therefore setting extensions is disabled by default. Signed-off-by: Hans Holmberg Signed-off-by: Dmitry Fomichev Reviewed-by: Klaus Jensen --- hw/block/nvme-ns.c | 10 ++++++++- hw/block/nvme-ns.h | 8 ++++++++ hw/block/nvme.c | 51 ++++++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 66 insertions(+), 3 deletions(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 63a2e3f47d..60156dfeaf 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -207,6 +207,10 @@ static void nvme_init_zone_meta(NvmeNamespace *ns) ns->imp_open_zones = g_malloc0(sizeof(NvmeZoneList)); ns->closed_zones = g_malloc0(sizeof(NvmeZoneList)); ns->full_zones = g_malloc0(sizeof(NvmeZoneList)); + if (ns->params.zd_extension_size) { + ns->zd_extensions = g_malloc0(ns->params.zd_extension_size * + ns->num_zones); + } nvme_init_zone_list(ns->exp_open_zones); nvme_init_zone_list(ns->imp_open_zones); @@ -257,7 +261,8 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, id_ns_z->ozcs = ns->params.cross_zone_read ? 0x01 : 0x00; id_ns_z->lbafe[lba_index].zsze = cpu_to_le64(ns->zone_size); - id_ns_z->lbafe[lba_index].zdes = 0; /* FIXME make helper */ + id_ns_z->lbafe[lba_index].zdes = + ns->params.zd_extension_size >> 6; /* Units of 64B */ ns->csi = NVME_CSI_ZONED; ns->id_ns.ncap = cpu_to_le64(ns->zone_capacity * ns->num_zones); @@ -321,6 +326,7 @@ void nvme_ns_cleanup(NvmeNamespace *ns) g_free(ns->imp_open_zones); g_free(ns->closed_zones); g_free(ns->full_zones); + g_free(ns->zd_extensions); } static void nvme_ns_realize(DeviceState *dev, Error **errp) @@ -350,6 +356,8 @@ static Property nvme_ns_props[] = { params.cross_zone_read, false), DEFINE_PROP_UINT32("max_active", NvmeNamespace, params.max_active_zones, 0), DEFINE_PROP_UINT32("max_open", NvmeNamespace, params.max_open_zones, 0), + DEFINE_PROP_UINT32("zone_descr_ext_size", NvmeNamespace, + params.zd_extension_size, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index 0664fe0892..ed14644e09 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -47,6 +47,7 @@ typedef struct NvmeNamespaceParams { uint64_t zone_capacity_mb; uint32_t max_active_zones; uint32_t max_open_zones; + uint32_t zd_extension_size; } NvmeNamespaceParams; typedef struct NvmeNamespace { @@ -68,6 +69,7 @@ typedef struct NvmeNamespace { uint64_t zone_capacity; uint64_t zone_array_size; uint32_t zone_size_log2; + uint8_t *zd_extensions; int32_t nr_open_zones; int32_t nr_active_zones; @@ -142,6 +144,12 @@ static inline bool nvme_wp_is_valid(NvmeZone *zone) st != NVME_ZONE_STATE_OFFLINE; } +static inline uint8_t *nvme_get_zd_extension(NvmeNamespace *ns, + uint32_t zone_idx) +{ + return &ns->zd_extensions[zone_idx * ns->params.zd_extension_size]; +} + /* * Initialize a zone list head. 
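 * Zone state lists link zones through their prev/next indices into
 * zone_array rather than through pointers; NVME_ZONE_LIST_NIL marks
 * the end of a list.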
*/ diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 40947aa659..27d191c659 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -1644,6 +1644,26 @@ static bool nvme_cond_offline_all(uint8_t state) return state == NVME_ZONE_STATE_READ_ONLY; } +static uint16_t nvme_set_zd_ext(NvmeNamespace *ns, NvmeZone *zone, + uint8_t state) +{ + uint16_t status; + + if (state == NVME_ZONE_STATE_EMPTY) { + nvme_auto_transition_zone(ns, false, true); + status = nvme_aor_check(ns, 1, 0); + if (status != NVME_SUCCESS) { + return status; + } + nvme_aor_inc_active(ns); + zone->d.za |= NVME_ZA_ZD_EXT_VALID; + nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); + return NVME_SUCCESS; + } + + return NVME_ZONE_INVAL_TRANSITION; +} + typedef uint16_t (*op_handler_t)(NvmeNamespace *, NvmeZone *, uint8_t); typedef bool (*need_to_proc_zone_t)(uint8_t); @@ -1684,6 +1704,7 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) uint8_t action, state; bool all; NvmeZone *zone; + uint8_t *zd_ext; action = dw13 & 0xff; all = dw13 & 0x100; @@ -1738,7 +1759,22 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) case NVME_ZONE_ACTION_SET_ZD_EXT: trace_pci_nvme_set_descriptor_extension(slba, zone_idx); - return NVME_INVALID_FIELD | NVME_DNR; + if (all || !ns->params.zd_extension_size) { + return NVME_INVALID_FIELD | NVME_DNR; + } + zd_ext = nvme_get_zd_extension(ns, zone_idx); + status = nvme_dma(n, zd_ext, ns->params.zd_extension_size, + DMA_DIRECTION_TO_DEVICE, req); + if (status) { + trace_pci_nvme_err_zd_extension_map_error(zone_idx); + return status; + } + + status = nvme_set_zd_ext(ns, zone, state); + if (status == NVME_SUCCESS) { + trace_pci_nvme_zd_extension_set(zone_idx); + return status; + } break; default: @@ -1816,7 +1852,7 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } - if (zra == NVME_ZONE_REPORT_EXTENDED) { + if (zra == NVME_ZONE_REPORT_EXTENDED && !ns->params.zd_extension_size) { return NVME_INVALID_FIELD | NVME_DNR; } @@ -1828,6 +1864,9 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) partial = (dw13 >> 16) & 0x01; zone_entry_sz = sizeof(NvmeZoneDescr); + if (zra == NVME_ZONE_REPORT_EXTENDED) { + zone_entry_sz += ns->params.zd_extension_size; + } max_zones = (len - sizeof(NvmeZoneReportHeader)) / zone_entry_sz; buf = g_malloc0(len); @@ -1859,6 +1898,14 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) z->wp = cpu_to_le64(~0ULL); } + if (zra == NVME_ZONE_REPORT_EXTENDED) { + if (zs->d.za & NVME_ZA_ZD_EXT_VALID) { + memcpy(buf_p, nvme_get_zd_extension(ns, zone_idx), + ns->params.zd_extension_size); + } + buf_p += ns->params.zd_extension_size; + } + zone_idx++; } From patchwork Mon Sep 28 02:35:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 304320 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B226C4346E for ; Mon, 28 Sep 2020 02:48:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) 
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Subject: [PATCH v5 12/14] hw/block/nvme: Add injection of Offline/Read-Only zones
Date: Mon, 28 Sep 2020 11:35:26 +0900
Message-Id: <20200928023528.15260-13-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
References: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
Received-SPF: pass client-ip=68.232.141.245;
envelope-from=prvs=5334b480d=dmitry.fomichev@wdc.com; helo=esa1.hgst.iphmx.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/27 22:35:35 X-ACL-Warn: Detected OS = FreeBSD 9.x or newer [fuzzy] X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Niklas Cassel , Damien Le Moal , qemu-block@nongnu.org, Dmitry Fomichev , qemu-devel@nongnu.org, Alistair Francis , Matias Bjorling Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" ZNS specification defines two zone conditions for the zones that no longer can function properly, possibly because of flash wear or other internal fault. It is useful to be able to "inject" a small number of such zones for testing purposes. This commit defines two optional device properties, "offline_zones" and "rdonly_zones". Users can assign non-zero values to these variables to specify the number of zones to be initialized as Offline or Read-Only. The actual number of injected zones may be smaller than the requested amount - Read-Only and Offline counts are expected to be much smaller than the total number of zones on a drive. Signed-off-by: Dmitry Fomichev --- hw/block/nvme-ns.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++ hw/block/nvme-ns.h | 2 ++ hw/block/nvme.c | 1 - 3 files changed, 66 insertions(+), 1 deletion(-) diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 60156dfeaf..47751f2d54 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -21,6 +21,7 @@ #include "sysemu/sysemu.h" #include "sysemu/block-backend.h" #include "qapi/error.h" +#include "crypto/random.h" #include "hw/qdev-properties.h" #include "hw/qdev-core.h" @@ -192,6 +193,32 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp) return -1; } + if (ns->params.zd_extension_size) { + if (ns->params.zd_extension_size & 0x3f) { + error_setg(errp, + "zone descriptor extension size must be a multiple of 64B"); + return -1; + } + if ((ns->params.zd_extension_size >> 6) > 0xff) { + error_setg(errp, "zone descriptor extension size is too large"); + return -1; + } + } + + if (ns->params.max_open_zones < nz) { + if (ns->params.nr_offline_zones > nz - ns->params.max_open_zones) { + error_setg(errp, "offline_zones value %u is too large", + ns->params.nr_offline_zones); + return -1; + } + if (ns->params.nr_rdonly_zones > + nz - ns->params.max_open_zones - ns->params.nr_offline_zones) { + error_setg(errp, "rdonly_zones value %u is too large", + ns->params.nr_rdonly_zones); + return -1; + } + } + return 0; } @@ -200,7 +227,9 @@ static void nvme_init_zone_meta(NvmeNamespace *ns) uint64_t start = 0, zone_size = ns->zone_size; uint64_t capacity = ns->num_zones * zone_size; NvmeZone *zone; + uint32_t rnd; int i; + uint16_t zs; ns->zone_array = g_malloc0(ns->zone_array_size); ns->exp_open_zones = g_malloc0(sizeof(NvmeZoneList)); @@ -233,6 +262,37 @@ static void nvme_init_zone_meta(NvmeNamespace *ns) zone->next = 0; start += zone_size; } + + /* If required, make some zones Offline or Read Only */ + + for (i = 0; i < ns->params.nr_offline_zones; i++) { + do { + qcrypto_random_bytes(&rnd, sizeof(rnd), NULL); + rnd %= 
ns->num_zones; + } while (rnd < ns->params.max_open_zones); + zone = &ns->zone_array[rnd]; + zs = nvme_get_zone_state(zone); + if (zs != NVME_ZONE_STATE_OFFLINE) { + nvme_set_zone_state(zone, NVME_ZONE_STATE_OFFLINE); + } else { + i--; + } + } + + for (i = 0; i < ns->params.nr_rdonly_zones; i++) { + do { + qcrypto_random_bytes(&rnd, sizeof(rnd), NULL); + rnd %= ns->num_zones; + } while (rnd < ns->params.max_open_zones); + zone = &ns->zone_array[rnd]; + zs = nvme_get_zone_state(zone); + if (zs != NVME_ZONE_STATE_OFFLINE && + zs != NVME_ZONE_STATE_READ_ONLY) { + nvme_set_zone_state(zone, NVME_ZONE_STATE_READ_ONLY); + } else { + i--; + } + } } static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index, @@ -358,6 +418,10 @@ static Property nvme_ns_props[] = { DEFINE_PROP_UINT32("max_open", NvmeNamespace, params.max_open_zones, 0), DEFINE_PROP_UINT32("zone_descr_ext_size", NvmeNamespace, params.zd_extension_size, 0), + DEFINE_PROP_UINT32("offline_zones", NvmeNamespace, + params.nr_offline_zones, 0), + DEFINE_PROP_UINT32("rdonly_zones", NvmeNamespace, + params.nr_rdonly_zones, 0), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h index ed14644e09..e9b90f9677 100644 --- a/hw/block/nvme-ns.h +++ b/hw/block/nvme-ns.h @@ -48,6 +48,8 @@ typedef struct NvmeNamespaceParams { uint32_t max_active_zones; uint32_t max_open_zones; uint32_t zd_extension_size; + uint32_t nr_offline_zones; + uint32_t nr_rdonly_zones; } NvmeNamespaceParams; typedef struct NvmeNamespace { diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 27d191c659..80973f3ff6 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -54,7 +54,6 @@ #include "qemu/osdep.h" #include "qemu/units.h" #include "qemu/error-report.h" -#include "crypto/random.h" #include "hw/block/block.h" #include "hw/pci/msix.h" #include "hw/pci/pci.h" From patchwork Mon Sep 28 02:35:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Fomichev X-Patchwork-Id: 304319 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1AB2BC4346E for ; Mon, 28 Sep 2020 02:48:31 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 71E6522262 for ; Mon, 28 Sep 2020 02:48:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="pVRIiYt2" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 71E6522262 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=wdc.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38808 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kMjDJ-0004bx-Jw for qemu-devel@archiver.kernel.org; Sun, 27 Sep 2020 22:48:29 -0400 Received: from eggs.gnu.org 
From patchwork Mon Sep 28 02:35:27 2020
X-Patchwork-Submitter: Dmitry Fomichev
X-Patchwork-Id: 304319
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, Dmitry Fomichev, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 13/14] hw/block/nvme: Use zone metadata file for persistence
Date: Mon, 28 Sep 2020 11:35:27 +0900
Message-Id: <20200928023528.15260-14-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
References: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
Sender: "Qemu-devel"

A ZNS drive that is emulated by this module is currently initialized
with all zones Empty upon startup. However, actual ZNS SSDs save the
state and condition of all zones in their internal NVRAM in the event of
power loss. When such a drive is powered up again, it closes or finishes
all zones that were open at the moment of shutdown. In addition, the
write pointer position, as well as the state and condition of every
zone, is preserved across power-downs.

This commit adds the capability to keep persistent zone metadata for the
device. A new optional module property, "zone_file", is introduced. If
added to the command line, this property specifies the name of the file
that stores the zone metadata. If "zone_file" is omitted, the device is
initialized with all zones empty, the same as before. If zone metadata
is configured to be persistent, then zone descriptor extensions also
persist across controller shutdowns.

Signed-off-by: Dmitry Fomichev
---
 hw/block/nvme-ns.c    | 341 ++++++++++++++++++++++++++++++++++++++++--
 hw/block/nvme-ns.h    |  33 ++++
 hw/block/nvme.c       |   2 +
 hw/block/trace-events |   1 +
 4 files changed, 362 insertions(+), 15 deletions(-)
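A hypothetical invocation (the file name and sizes are illustrative)
that makes zone state survive a QEMU restart:

    -device nvme-ns,drive=drv0,bus=nvme0,nsid=1,zoned=true, \
            zone_size=128M,zone_file=ns1-zones.meta

On the first start the file does not yet exist and is created and
initialized; on later starts it is mapped and validated against the
current namespace parameters, and reinitialized from scratch if any of
them changed, as implemented below.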
diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index 47751f2d54..a94021da81 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -20,12 +20,15 @@
 #include "hw/pci/pci.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/block-backend.h"
+#include "sysemu/hostmem.h"
+#include "qom/object_interfaces.h"
 #include "qapi/error.h"
 #include "crypto/random.h"
 #include "hw/qdev-properties.h"
 #include "hw/qdev-core.h"
 
+#include "trace.h"
 #include "nvme.h"
 #include "nvme-ns.h"
 
@@ -98,6 +101,7 @@ void nvme_add_zone_tail(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone)
         zl->tail = idx;
     }
     zl->size++;
+    nvme_set_zone_meta_dirty(ns);
 }
 
 /*
@@ -113,12 +117,15 @@ void nvme_remove_zone(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone)
     if (zl->size == 0) {
         zl->head = NVME_ZONE_LIST_NIL;
         zl->tail = NVME_ZONE_LIST_NIL;
+        nvme_set_zone_meta_dirty(ns);
     } else if (idx == zl->head) {
         zl->head = zone->next;
         ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL;
+        nvme_set_zone_meta_dirty(ns);
     } else if (idx == zl->tail) {
         zl->tail = zone->prev;
         ns->zone_array[zl->tail].next = NVME_ZONE_LIST_NIL;
+        nvme_set_zone_meta_dirty(ns);
     } else {
         ns->zone_array[zone->next].prev = zone->prev;
         ns->zone_array[zone->prev].next = zone->next;
@@ -144,6 +151,7 @@ NvmeZone *nvme_remove_zone_head(NvmeNamespace *ns, NvmeZoneList *zl)
             ns->zone_array[zl->head].prev = NVME_ZONE_LIST_NIL;
         }
         zone->prev = zone->next = 0;
+        nvme_set_zone_meta_dirty(ns);
     }
 
     return zone;
@@ -219,11 +227,119 @@ static int nvme_calc_zone_geometry(NvmeNamespace *ns, Error **errp)
         }
     }
 
+    ns->meta_size = sizeof(NvmeZoneMeta) + ns->zone_array_size +
+                    nz * ns->params.zd_extension_size;
+    ns->meta_size = ROUND_UP(ns->meta_size, qemu_real_host_page_size);
+
+    return 0;
+}
+
+static int nvme_validate_zone_file(NvmeNamespace *ns, uint64_t capacity)
+{
+    NvmeZoneMeta *meta = ns->zone_meta;
+    NvmeZone *zone = ns->zone_array;
+    uint64_t start = 0, zone_size = ns->zone_size;
+    int i, n_imp_open = 0, n_exp_open = 0, n_closed = 0, n_full = 0;
+
+    if (meta->magic != NVME_ZONE_META_MAGIC) {
+        return 1;
+    }
+    if (meta->version != NVME_ZONE_META_VER) {
+        return 2;
+    }
+    if (meta->zone_size != zone_size) {
+        return 3;
+    }
+    if (meta->zone_capacity != ns->zone_capacity) {
+        return 4;
+    }
+    if (meta->nr_offline_zones != ns->params.nr_offline_zones) {
+        return 5;
+    }
+    if (meta->nr_rdonly_zones != ns->params.nr_rdonly_zones) {
+        return 6;
+    }
+    if (meta->lba_size != ns->blkconf.logical_block_size) {
+        return 7;
+    }
+    if (meta->zd_extension_size != ns->params.zd_extension_size) {
+        return 8;
+    }
+
+    for (i = 0; i < ns->num_zones; i++, zone++) {
+        if (start + zone_size > capacity) {
+            zone_size = capacity - start;
+        }
+        if (zone->d.zt != NVME_ZONE_TYPE_SEQ_WRITE) {
+            return 9;
+        }
+        if (zone->d.zcap != ns->zone_capacity) {
+            return 10;
+        }
+        if (zone->d.zslba != start) {
+            return 11;
+        }
+        switch (nvme_get_zone_state(zone)) {
+        case NVME_ZONE_STATE_EMPTY:
+        case NVME_ZONE_STATE_OFFLINE:
+        case NVME_ZONE_STATE_READ_ONLY:
+            if (zone->d.wp != start) {
+                return 12;
+            }
+            break;
+        case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+            if (zone->d.wp < start ||
+                zone->d.wp >= zone->d.zslba + zone->d.zcap) {
+                return 13;
+            }
+            n_imp_open++;
+            break;
+        case NVME_ZONE_STATE_EXPLICITLY_OPEN:
+            if (zone->d.wp < start ||
+                zone->d.wp >= zone->d.zslba + zone->d.zcap) {
+                return 13;
+            }
+            n_exp_open++;
+            break;
+        case NVME_ZONE_STATE_CLOSED:
+            if (zone->d.wp < start ||
+                zone->d.wp >= zone->d.zslba + zone->d.zcap) {
+                return 13;
+            }
+            n_closed++;
+            break;
+        case NVME_ZONE_STATE_FULL:
+            if (zone->d.wp != zone->d.zslba + zone->d.zcap) {
+                return 14;
+            }
+            n_full++;
+            break;
+        default:
+            return 15;
+        }
+
+        start += zone_size;
+    }
+
+    if (n_exp_open != nvme_zone_list_size(ns->exp_open_zones)) {
+        return 16;
+    }
+    if (n_imp_open != nvme_zone_list_size(ns->imp_open_zones)) {
+        return 17;
+    }
+    if (n_closed != nvme_zone_list_size(ns->closed_zones)) {
+        return 18;
+    }
+    if (n_full != nvme_zone_list_size(ns->full_zones)) {
+        return 19;
+    }
+
     return 0;
 }
 
 static void nvme_init_zone_meta(NvmeNamespace *ns)
 {
+    NvmeZoneMeta *meta = ns->zone_meta;
     uint64_t start = 0, zone_size = ns->zone_size;
     uint64_t capacity = ns->num_zones * zone_size;
     NvmeZone *zone;
@@ -231,14 +347,26 @@ static void nvme_init_zone_meta(NvmeNamespace *ns)
     int i;
     uint16_t zs;
 
-    ns->zone_array = g_malloc0(ns->zone_array_size);
-    ns->exp_open_zones = g_malloc0(sizeof(NvmeZoneList));
-    ns->imp_open_zones = g_malloc0(sizeof(NvmeZoneList));
-    ns->closed_zones = g_malloc0(sizeof(NvmeZoneList));
-    ns->full_zones = g_malloc0(sizeof(NvmeZoneList));
-    if (ns->params.zd_extension_size) {
-        ns->zd_extensions = g_malloc0(ns->params.zd_extension_size *
-                                      ns->num_zones);
+    if (ns->params.zone_file) {
+        meta->magic = NVME_ZONE_META_MAGIC;
+        meta->version = NVME_ZONE_META_VER;
+        meta->zone_size = zone_size;
+        meta->zone_capacity = ns->zone_capacity;
+        meta->lba_size = ns->blkconf.logical_block_size;
+        meta->nr_offline_zones = ns->params.nr_offline_zones;
+        meta->nr_rdonly_zones = ns->params.nr_rdonly_zones;
+        meta->zd_extension_size = ns->params.zd_extension_size;
+    } else {
+        assert(!ns->zone_meta);
+        ns->zone_array = g_malloc0(ns->zone_array_size);
+        ns->exp_open_zones = g_malloc0(sizeof(NvmeZoneList));
+        ns->imp_open_zones = g_malloc0(sizeof(NvmeZoneList));
+        ns->closed_zones = g_malloc0(sizeof(NvmeZoneList));
+        ns->full_zones = g_malloc0(sizeof(NvmeZoneList));
+        if (ns->params.zd_extension_size) {
+            ns->zd_extensions = g_malloc0(ns->params.zd_extension_size *
+                                          ns->num_zones);
+        }
     }
 
     nvme_init_zone_list(ns->exp_open_zones);
@@ -293,12 +421,180 @@ static void nvme_init_zone_meta(NvmeNamespace *ns)
             i--;
         }
     }
+
+    if (ns->params.zone_file) {
+        nvme_set_zone_meta_dirty(ns);
+    }
+}
+
+static int nvme_open_zone_file(NvmeNamespace *ns, bool *init_meta,
+                               Error **errp)
+{
+    Object *file_be;
+    HostMemoryBackend *fb;
+    struct stat statbuf;
+    int ret;
+
+    ret = stat(ns->params.zone_file, &statbuf);
+    if (ret && errno == ENOENT) {
+        *init_meta = true;
+    } else if (!S_ISREG(statbuf.st_mode)) {
+        error_setg(errp, "\"%s\" is not a regular file",
+                   ns->params.zone_file);
+        return -1;
+    }
+
+    file_be = object_new(TYPE_MEMORY_BACKEND_FILE);
+    object_property_set_str(file_be, "mem-path", ns->params.zone_file,
+                            &error_abort);
+    object_property_set_int(file_be, "size", ns->meta_size, &error_abort);
+    object_property_set_bool(file_be, "share", true, &error_abort);
+    object_property_set_bool(file_be, "discard-data", false, &error_abort);
+    if (!user_creatable_complete(USER_CREATABLE(file_be), errp)) {
+        object_unref(file_be);
+        return -1;
+    }
+    object_property_add_child(OBJECT(ns), "_fb", file_be);
+    object_unref(file_be);
+
+    fb = MEMORY_BACKEND(file_be);
+    ns->zone_mr = host_memory_backend_get_memory(fb);
+
+    return 0;
+}
+
+static int nvme_map_zone_file(NvmeNamespace *ns, bool *init_meta)
+{
+    ns->zone_meta = (void *)memory_region_get_ram_ptr(ns->zone_mr);
+    ns->zone_array = (NvmeZone *)(ns->zone_meta + 1);
+    ns->exp_open_zones = &ns->zone_meta->exp_open_zones;
+    ns->imp_open_zones = &ns->zone_meta->imp_open_zones;
+    ns->closed_zones = &ns->zone_meta->closed_zones;
+    ns->full_zones = &ns->zone_meta->full_zones;
+
+    if (ns->params.zd_extension_size) {
+        ns->zd_extensions = (uint8_t *)(ns->zone_meta + 1);
+        ns->zd_extensions += ns->zone_array_size;
+    }
+
+    return 0;
+}
+
+void nvme_sync_zone_file(NvmeNamespace *ns, NvmeZone *zone, int len)
+{
+    uintptr_t z = (uintptr_t)zone, off = z - (uintptr_t)ns->zone_meta;
+
+    if (ns->zone_meta) {
+        memory_region_msync(ns->zone_mr, off, len);
+
+        if (ns->zone_meta->dirty) {
+            ns->zone_meta->dirty = false;
+            memory_region_msync(ns->zone_mr, 0, sizeof(NvmeZoneMeta));
+        }
+    }
+}
+
+/*
+ * Close or finish all the zones that might be still open after power-down.
+ */
+static void nvme_prepare_zones(NvmeNamespace *ns)
+{
+    NvmeZone *zone;
+    uint32_t set_state;
+    int i;
+
+    assert(!ns->nr_active_zones);
+    assert(!ns->nr_open_zones);
+
+    zone = ns->zone_array;
+    for (i = 0; i < ns->num_zones; i++, zone++) {
+        switch (nvme_get_zone_state(zone)) {
+        case NVME_ZONE_STATE_IMPLICITLY_OPEN:
+            nvme_remove_zone(ns, ns->imp_open_zones, zone);
+            break;
+        case NVME_ZONE_STATE_EXPLICITLY_OPEN:
+            nvme_remove_zone(ns, ns->exp_open_zones, zone);
+            break;
+        case NVME_ZONE_STATE_CLOSED:
+            nvme_aor_inc_active(ns);
+            /* fall through */
+        default:
+            continue;
+        }
+
+        if (zone->d.za & NVME_ZA_ZD_EXT_VALID) {
+            set_state = NVME_ZONE_STATE_CLOSED;
+        } else if (zone->d.wp == zone->d.zslba) {
+            set_state = NVME_ZONE_STATE_EMPTY;
+        } else if (ns->params.max_active_zones == 0 ||
+                   ns->nr_active_zones < ns->params.max_active_zones) {
+            set_state = NVME_ZONE_STATE_CLOSED;
+        } else {
+            set_state = NVME_ZONE_STATE_FULL;
+        }
+
+        switch (set_state) {
+        case NVME_ZONE_STATE_CLOSED:
+            trace_pci_nvme_power_on_close(nvme_get_zone_state(zone),
+                                          zone->d.zslba);
+            nvme_aor_inc_active(ns);
+            nvme_add_zone_tail(ns, ns->closed_zones, zone);
+            break;
+        case NVME_ZONE_STATE_EMPTY:
+            trace_pci_nvme_power_on_reset(nvme_get_zone_state(zone),
+                                          zone->d.zslba);
+            break;
+        case NVME_ZONE_STATE_FULL:
+            trace_pci_nvme_power_on_full(nvme_get_zone_state(zone),
+                                         zone->d.zslba);
+            zone->d.wp = nvme_zone_wr_boundary(zone);
+        }
+
+        zone->w_ptr = zone->d.wp;
+        nvme_set_zone_state(zone, set_state);
+    }
+}
+
+static int nvme_load_zone_meta(NvmeNamespace *ns, bool *init_meta)
+{
+    uint64_t capacity = ns->num_zones * ns->zone_size;
+    int ret = 0;
+
+    if (ns->params.zone_file) {
+        ret = nvme_map_zone_file(ns, init_meta);
+        trace_pci_nvme_mapped_zone_file(ns->params.zone_file, ret);
+        if (ret < 0) {
+            return ret;
+        }
+
+        if (!*init_meta) {
+            ret = nvme_validate_zone_file(ns, capacity);
+            if (ret) {
+                trace_pci_nvme_err_zone_file_invalid(ret);
+                *init_meta = true;
+            }
+        }
+    } else {
+        *init_meta = true;
+    }
+
+    if (*init_meta) {
+        nvme_init_zone_meta(ns);
+        trace_pci_nvme_initialized_zone_file(ns->params.zone_file);
+    } else {
+        nvme_prepare_zones(ns);
+    }
+    nvme_sync_zone_file(ns, ns->zone_array, ns->zone_array_size);
+
+    return 0;
 }
 
 static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index,
                               Error **errp)
 {
     NvmeIdNsZoned *id_ns_z;
+    int ret;
+    bool init_meta = false;
 
     if (n->params.fill_pattern == 0) {
         ns->id_ns.dlfeat |= 0x01;
@@ -310,7 +606,17 @@ static int nvme_zoned_init_ns(NvmeCtrl *n, NvmeNamespace *ns, int lba_index,
         return -1;
     }
 
-    nvme_init_zone_meta(ns);
+    if (ns->params.zone_file) {
+        if (nvme_open_zone_file(ns, &init_meta, errp) != 0) {
+            return -1;
+        }
+    }
+
+    ret = nvme_load_zone_meta(ns, &init_meta);
+    if (ret) {
+        error_setg(errp, "could not load/init zone metadata, err=%d", ret);
+        return -1;
+    }
 
     id_ns_z = g_malloc0(sizeof(NvmeIdNsZoned));
 
@@ -376,17 +682,21 @@ void nvme_ns_drain(NvmeNamespace *ns)
 void nvme_ns_flush(NvmeNamespace *ns)
 {
     blk_flush(ns->blkconf.blk);
+
+    nvme_sync_zone_file(ns, ns->zone_array, ns->zone_array_size);
 }
 
 void nvme_ns_cleanup(NvmeNamespace *ns)
 {
+    if (!ns->params.zone_file) {
+        g_free(ns->zone_array);
+        g_free(ns->exp_open_zones);
+        g_free(ns->imp_open_zones);
+        g_free(ns->closed_zones);
+        g_free(ns->full_zones);
+        g_free(ns->zd_extensions);
+    }
     g_free(ns->id_ns_zoned);
-    g_free(ns->zone_array);
-    g_free(ns->exp_open_zones);
-    g_free(ns->imp_open_zones);
-    g_free(ns->closed_zones);
-    g_free(ns->full_zones);
-    g_free(ns->zd_extensions);
 }
 
 static void nvme_ns_realize(DeviceState *dev, Error **errp)
@@ -422,6 +732,7 @@ static Property nvme_ns_props[] = {
                        params.nr_offline_zones, 0),
     DEFINE_PROP_UINT32("rdonly_zones", NvmeNamespace,
                        params.nr_rdonly_zones, 0),
+    DEFINE_PROP_STRING("zone_file", NvmeNamespace, params.zone_file),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/block/nvme-ns.h b/hw/block/nvme-ns.h
index e9b90f9677..4ff0955f91 100644
--- a/hw/block/nvme-ns.h
+++ b/hw/block/nvme-ns.h
@@ -36,6 +36,27 @@ typedef struct NvmeZoneList {
     uint8_t rsvd12[4];
 } NvmeZoneList;
 
+#define NVME_ZONE_META_MAGIC 0x3aebaa70
+#define NVME_ZONE_META_VER  1
+
+typedef struct NvmeZoneMeta {
+    uint32_t magic;
+    uint32_t version;
+    uint64_t zone_size;
+    uint64_t zone_capacity;
+    uint32_t nr_offline_zones;
+    uint32_t nr_rdonly_zones;
+    uint32_t lba_size;
+    uint32_t rsvd40;
+    NvmeZoneList exp_open_zones;
+    NvmeZoneList imp_open_zones;
+    NvmeZoneList closed_zones;
+    NvmeZoneList full_zones;
+    uint8_t zd_extension_size;
+    uint8_t dirty;
+    uint8_t rsvd594[3990];
+} NvmeZoneMeta;
+
 typedef struct NvmeNamespaceParams {
     uint32_t nsid;
     bool attached;
@@ -50,6 +71,7 @@ typedef struct NvmeNamespaceParams {
     uint32_t zd_extension_size;
     uint32_t nr_offline_zones;
     uint32_t nr_rdonly_zones;
+    char *zone_file;
 } NvmeNamespaceParams;
 
 typedef struct NvmeNamespace {
@@ -62,6 +84,7 @@ typedef struct NvmeNamespace {
 
     NvmeIdNsZoned *id_ns_zoned;
     NvmeZone *zone_array;
+    NvmeZoneMeta *zone_meta;
     NvmeZoneList *exp_open_zones;
     NvmeZoneList *imp_open_zones;
     NvmeZoneList *closed_zones;
@@ -74,6 +97,8 @@ typedef struct NvmeNamespace {
     uint8_t *zd_extensions;
     int32_t nr_open_zones;
    int32_t nr_active_zones;
+    MemoryRegion *zone_mr;
+    size_t meta_size;
 
     NvmeNamespaceParams params;
 } NvmeNamespace;
@@ -110,6 +135,13 @@ static inline size_t nvme_l2b(NvmeNamespace *ns, uint64_t lba)
     return lba << nvme_ns_lbads(ns);
 }
 
+static inline void nvme_set_zone_meta_dirty(NvmeNamespace *ns)
+{
+    if (ns->params.zone_file) {
+        ns->zone_meta->dirty = true;
+    }
+}
+
 typedef struct NvmeCtrl NvmeCtrl;
 
 int nvme_ns_setup(NvmeCtrl *n, NvmeNamespace *ns, Error **errp);
@@ -243,5 +275,6 @@ static inline void nvme_aor_dec_active(NvmeNamespace *ns)
 void nvme_add_zone_tail(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone);
 void nvme_remove_zone(NvmeNamespace *ns, NvmeZoneList *zl, NvmeZone *zone);
 NvmeZone *nvme_remove_zone_head(NvmeNamespace *ns, NvmeZoneList *zl);
+void nvme_sync_zone_file(NvmeNamespace *ns, NvmeZone *zone, int len);
 
 #endif /* NVME_NS_H */
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 80973f3ff6..ff7d43d38f 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -165,6 +165,8 @@ static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone,
     default:
         zone->d.za = 0;
     }
+
+    nvme_sync_zone_file(ns, zone, sizeof(NvmeZone));
 }
 
 /*
diff --git a/hw/block/trace-events b/hw/block/trace-events
index 386f28e457..1ea4846443 100644
--- a/hw/block/trace-events
+++ b/hw/block/trace-events
@@ -103,6 +103,7 @@ pci_nvme_zd_extension_set(uint32_t zone_idx) "set descriptor extension for zone_
 pci_nvme_power_on_close(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Closed state"
 pci_nvme_power_on_reset(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Empty state"
 pci_nvme_power_on_full(uint32_t state, uint64_t slba) "zone state=%"PRIu32", slba=%"PRIu64" transitioned to Full state"
+pci_nvme_initialized_zone_file(char *zfile_name) "mapped zone file %s"
 pci_nvme_mapped_zone_file(char *zfile_name, int ret) "mapped zone file %s, error %d"
 
 # nvme traces for error conditions
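For a sense of how large the zone_file becomes, this standalone sketch
mirrors the meta_size arithmetic from nvme_calc_zone_geometry() above.
The struct sizes and the host page size are assumptions for
illustration; the real values come from hw/block/nvme-ns.h and the host:

    #include <stdint.h>
    #include <stdio.h>

    /* Same rounding the patch gets from QEMU's ROUND_UP macro. */
    #define ROUND_UP(n, d) ((((n) + (d) - 1) / (d)) * (d))

    int main(void)
    {
        uint64_t nz = 128;           /* number of zones */
        uint64_t zone_entry = 64;    /* sizeof(NvmeZone), assumed */
        uint64_t meta_header = 4096; /* sizeof(NvmeZoneMeta), assumed */
        uint64_t zd_ext = 0;         /* zone_descr_ext_size */
        uint64_t page = 4096;        /* qemu_real_host_page_size, assumed */

        /* meta_size = header + zone array + descriptor extensions,
         * rounded up to a whole number of host pages. */
        uint64_t meta = meta_header + nz * zone_entry + nz * zd_ext;
        meta = ROUND_UP(meta, page);

        printf("zone metadata file size: %llu bytes\n",
               (unsigned long long)meta); /* 12288 for these inputs */
        return 0;
    }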
From patchwork Mon Sep 28 02:35:28 2020
X-Patchwork-Submitter: Dmitry Fomichev
X-Patchwork-Id: 272643
From: Dmitry Fomichev
To: Keith Busch, Klaus Jensen, Kevin Wolf, Philippe Mathieu-Daudé, Maxim Levitsky, Fam Zheng
Cc: Niklas Cassel, Damien Le Moal, qemu-block@nongnu.org, Dmitry Fomichev, qemu-devel@nongnu.org, Alistair Francis, Matias Bjorling
Subject: [PATCH v5 14/14] hw/block/nvme: Document zoned parameters in usage text
Date: Mon, 28 Sep 2020 11:35:28 +0900
Message-Id: <20200928023528.15260-15-dmitry.fomichev@wdc.com>
In-Reply-To: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
References: <20200928023528.15260-1-dmitry.fomichev@wdc.com>
Sender: "Qemu-devel"

Added brief descriptions of the new device properties that are now
available to users to configure features of the Zoned Namespace Command
Set in the emulator. This patch is for documentation only; there is no
functionality change.

Signed-off-by: Dmitry Fomichev
---
 hw/block/nvme.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index ff7d43d38f..34fc6daf9d 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -9,7 +9,7 @@
  */
 
 /**
- * Reference Specs: http://www.nvmexpress.org, 1.2, 1.1, 1.0e
+ * Reference Specs: http://www.nvmexpress.org, 1.4, 1.3, 1.2, 1.1, 1.0e
  *
  *   https://nvmexpress.org/developers/nvme-specification/
  */
@@ -23,7 +23,8 @@
  *              max_ioqpairs=, \
  *              aerl=, aer_max_queued=, \
  *              mdts=
- *      -device nvme-ns,drive=,bus=bus_name,nsid=
+ *      -device nvme-ns,drive=,bus=bus_name,nsid=, \
+ *              zoned=
  *
  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
@@ -49,6 +50,45 @@
  * completion when there are no oustanding AERs. When the maximum number of
  * enqueued events are reached, subsequent events will be dropped.
  *
+ *
+ * Setting `zoned` to true selects Zoned Command Set at the namespace.
+ * In this case, the following options are available to configure zoned
+ * operation:
+ *     zone_size=
+ *
+ *     zone_capacity=
+ *         The value 0 (default) forces zone capacity to be the same as
+ *         zone size. The value of this property may not exceed zone size.
+ *
+ *     zone_file=
+ *         Zone metadata file, if specified, allows zone information
+ *         to be persistent across shutdowns and restarts.
+ *
+ *     zone_descr_ext_size=
+ *         This value needs to be specified in 64B units. If it is zero,
+ *         namespace(s) will not support zone descriptor extensions.
+ *
+ *     max_active=
+ *
+ *     max_open=
+ *
+ *     zone_append_size_limit=
+ *         The maximum I/O size that can be supported by Zone Append
+ *         command. Since internally this value is maintained as
+ *         ZASL = log2(zone_append_size_limit / page_size), some
+ *         values assigned to this property may be rounded down and
+ *         result in a lower maximum ZA data size being in effect.
+ *         If MDTS property is not assigned, the default value of 128KiB
+ *         is used as ZASL.
+ *
+ *     offline_zones=
+ *
+ *     rdonly_zones=
+ *
+ *     cross_zone_read=
+ *
+ *     fill_pattern=
+ *         The byte pattern to return for any portions of unwritten data
+ *         during read.
  */
 
 #include "qemu/osdep.h"
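Putting the documented options together, a complete invocation might
look like the following. Every name and value is illustrative, and only
properties defined on nvme-ns in this series are shown on the namespace
device:

    -drive file=zns.img,id=drv0,format=raw,if=none
    -device nvme,serial=deadbeef,id=nvme0
    -device nvme-ns,drive=drv0,bus=nvme0,nsid=1,zoned=true, \
            zone_size=128M,zone_capacity=96M,max_open=16,max_active=32, \
            zone_descr_ext_size=64,offline_zones=4,rdonly_zones=8, \
            zone_file=zones.meta

As a worked example of the ZASL formula above: with
zone_append_size_limit set to 128 KiB on a host with 4 KiB pages,
ZASL = log2(131072 / 4096) = log2(32) = 5, so Zone Append transfers are
capped at 2^5 pages, which is 128 KiB.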