diff mbox series

[v5,05/14] hw/block/nvme: Add support for Namespace Types

Message ID 20200928023528.15260-6-dmitry.fomichev@wdc.com
State New
Headers show
Series hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set | expand

Commit Message

Dmitry Fomichev Sept. 28, 2020, 2:35 a.m. UTC
From: Niklas Cassel <niklas.cassel@wdc.com>

Namespace Types introduce a new command set, "I/O Command Sets",
that allows the host to retrieve the command sets associated with
a namespace. Introduce support for the command set and enable
detection for the NVM Command Set.

The new workflows for identify commands rely heavily on zero-filled
identify structs. E.g., certain CNS commands are defined to return
a zero-filled identify struct when an inactive namespace NSID
is supplied.

Add a helper function in order to avoid code duplication when
reporting zero-filled identify structures.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
---
 hw/block/nvme-ns.c |   3 +
 hw/block/nvme.c    | 210 +++++++++++++++++++++++++++++++++++++--------
 2 files changed, 175 insertions(+), 38 deletions(-)

Comments

Klaus Jensen Sept. 30, 2020, 8:15 a.m. UTC | #1
On Sep 28 11:35, Dmitry Fomichev wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>

> 

> Namespace Types introduce a new command set, "I/O Command Sets",

> that allows the host to retrieve the command sets associated with

> a namespace. Introduce support for the command set and enable

> detection for the NVM Command Set.

> 

> The new workflows for identify commands rely heavily on zero-filled

> identify structs. E.g., certain CNS commands are defined to return

> a zero-filled identify struct when an inactive namespace NSID

> is supplied.

> 

> Add a helper function in order to avoid code duplication when

> reporting zero-filled identify structures.

> 

> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

> ---

>  hw/block/nvme-ns.c |   3 +

>  hw/block/nvme.c    | 210 +++++++++++++++++++++++++++++++++++++--------

>  2 files changed, 175 insertions(+), 38 deletions(-)

> 

> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c

> index bbd7879492..31b7f986c3 100644

> --- a/hw/block/nvme-ns.c

> +++ b/hw/block/nvme-ns.c

> @@ -40,6 +40,9 @@ static void nvme_ns_init(NvmeNamespace *ns)

>  

>      id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));

>  

> +    ns->params.csi = NVME_CSI_NVM;

> +    qemu_uuid_generate(&ns->params.uuid); /* TODO make UUIDs persistent */

> +


It is straight-forward to put this into a 'uuid' nvme-ns parameter using
DEFINE_PROP_UUID. That will default to 'auto' which will generate an
UUID for each invocation, but if the user requires it to be
"persistent", it can be specified explicitly.

If you choose to do this, please extract to separate patch. Or I can
post it on top of nvme-next if you like.
Niklas Cassel Sept. 30, 2020, 12:47 p.m. UTC | #2
On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>

> 

> Namespace Types introduce a new command set, "I/O Command Sets",

> that allows the host to retrieve the command sets associated with

> a namespace. Introduce support for the command set and enable

> detection for the NVM Command Set.

> 

> The new workflows for identify commands rely heavily on zero-filled

> identify structs. E.g., certain CNS commands are defined to return

> a zero-filled identify struct when an inactive namespace NSID

> is supplied.

> 

> Add a helper function in order to avoid code duplication when

> reporting zero-filled identify structures.

> 

> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

> ---

>  hw/block/nvme-ns.c |   3 +

>  hw/block/nvme.c    | 210 +++++++++++++++++++++++++++++++++++++--------

>  2 files changed, 175 insertions(+), 38 deletions(-)

> 

> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c

> index bbd7879492..31b7f986c3 100644

> --- a/hw/block/nvme-ns.c

> +++ b/hw/block/nvme-ns.c


(snip)

> @@ -1597,12 +1667,31 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)

>       * Namespace Identification Descriptor. Add a very basic Namespace UUID

>       * here.

>       */

> -    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;

> -    ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;

> -    stl_be_p(&ns_descrs->uuid.v, nsid);

> +    desc = list_ptr;

> +    desc->nidt = NVME_NIDT_UUID;

> +    desc->nidl = NVME_NIDL_UUID;

> +    list_ptr += sizeof(*desc);

> +    memcpy(list_ptr, ns->params.uuid.data, NVME_NIDL_UUID);

> +    list_ptr += NVME_NIDL_UUID;

>  

> -    return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,

> -                    DMA_DIRECTION_FROM_DEVICE, req);

> +    desc = list_ptr;

> +    desc->nidt = NVME_NIDT_CSI;

> +    desc->nidl = NVME_NIDL_CSI;

> +    list_ptr += sizeof(*desc);

> +    *(uint8_t *)list_ptr = NVME_CSI_NVM;


I think that we should use ns->csi/ns->params.csi here rather than
NVME_CSI_NVM.
You do this change in a later patch, but I think it is more correct
to do it here already. (No reason not to, since ns->csi/ns->params.csi
should be set to NVME_CSI_NVM for NVM namespace already in this patch.)

> +

> +    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);

> +}
Niklas Cassel Oct. 1, 2020, 11:22 a.m. UTC | #3
On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>

> 

> Namespace Types introduce a new command set, "I/O Command Sets",

> that allows the host to retrieve the command sets associated with

> a namespace. Introduce support for the command set and enable

> detection for the NVM Command Set.

> 

> The new workflows for identify commands rely heavily on zero-filled

> identify structs. E.g., certain CNS commands are defined to return

> a zero-filled identify struct when an inactive namespace NSID

> is supplied.

> 

> Add a helper function in order to avoid code duplication when

> reporting zero-filled identify structures.

> 

> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

> ---

>  hw/block/nvme-ns.c |   3 +

>  hw/block/nvme.c    | 210 +++++++++++++++++++++++++++++++++++++--------

>  2 files changed, 175 insertions(+), 38 deletions(-)

> 

> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c

> index bbd7879492..31b7f986c3 100644

> --- a/hw/block/nvme-ns.c

> +++ b/hw/block/nvme-ns.c

> @@ -40,6 +40,9 @@ static void nvme_ns_init(NvmeNamespace *ns)

>  

>      id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));

>  

> +    ns->params.csi = NVME_CSI_NVM;

> +    qemu_uuid_generate(&ns->params.uuid); /* TODO make UUIDs persistent */

> +

>      /* no thin provisioning */

>      id_ns->ncap = id_ns->nsze;

>      id_ns->nuse = id_ns->ncap;

> diff --git a/hw/block/nvme.c b/hw/block/nvme.c

> index 29fa005fa2..4ec1ddc90a 100644

> --- a/hw/block/nvme.c

> +++ b/hw/block/nvme.c

> @@ -1495,6 +1495,13 @@ static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)

>      return NVME_SUCCESS;

>  }

>  

> +static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req)

> +{

> +    uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {};

> +

> +    return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req);

> +}

> +

>  static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)

>  {

>      trace_pci_nvme_identify_ctrl();

> @@ -1503,11 +1510,23 @@ static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)

>                      DMA_DIRECTION_FROM_DEVICE, req);

>  }

>  

> +static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req)

> +{

> +    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;

> +

> +    trace_pci_nvme_identify_ctrl_csi(c->csi);

> +

> +    if (c->csi == NVME_CSI_NVM) {

> +        return nvme_rpt_empty_id_struct(n, req);

> +    }

> +

> +    return NVME_INVALID_FIELD | NVME_DNR;

> +}

> +

>  static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)

>  {

>      NvmeNamespace *ns;

>      NvmeIdentify *c = (NvmeIdentify *)&req->cmd;

> -    NvmeIdNs *id_ns, inactive = { 0 };

>      uint32_t nsid = le32_to_cpu(c->nsid);

>  

>      trace_pci_nvme_identify_ns(nsid);

> @@ -1518,23 +1537,46 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)

>  

>      ns = nvme_ns(n, nsid);

>      if (unlikely(!ns)) {

> -        id_ns = &inactive;

> -    } else {

> -        id_ns = &ns->id_ns;

> +        return nvme_rpt_empty_id_struct(n, req);

>      }

>  

> -    return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs),

> +    return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs),

>                      DMA_DIRECTION_FROM_DEVICE, req);

>  }

>  

> +static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)

> +{

> +    NvmeNamespace *ns;

> +    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;

> +    uint32_t nsid = le32_to_cpu(c->nsid);

> +

> +    trace_pci_nvme_identify_ns_csi(nsid, c->csi);

> +

> +    if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {

> +        return NVME_INVALID_NSID | NVME_DNR;

> +    }

> +

> +    ns = nvme_ns(n, nsid);

> +    if (unlikely(!ns)) {

> +        return nvme_rpt_empty_id_struct(n, req);

> +    }

> +

> +    if (c->csi == NVME_CSI_NVM) {

> +        return nvme_rpt_empty_id_struct(n, req);

> +    }

> +

> +    return NVME_INVALID_FIELD | NVME_DNR;

> +}

> +

>  static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)

>  {

> +    NvmeNamespace *ns;

>      NvmeIdentify *c = (NvmeIdentify *)&req->cmd;

> -    static const int data_len = NVME_IDENTIFY_DATA_SIZE;

>      uint32_t min_nsid = le32_to_cpu(c->nsid);

> -    uint32_t *list;

> -    uint16_t ret;

> -    int j = 0;

> +    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};

> +    static const int data_len = sizeof(list);

> +    uint32_t *list_ptr = (uint32_t *)list;

> +    int i, j = 0;

>  

>      trace_pci_nvme_identify_nslist(min_nsid);

>  

> @@ -1548,48 +1590,76 @@ static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)

>          return NVME_INVALID_NSID | NVME_DNR;

>      }

>  

> -    list = g_malloc0(data_len);

> -    for (int i = 1; i <= n->num_namespaces; i++) {

> -        if (i <= min_nsid || !nvme_ns(n, i)) {

> +    for (i = 1; i <= n->num_namespaces; i++) {

> +        ns = nvme_ns(n, i);

> +        if (!ns) {

>              continue;

>          }

> -        list[j++] = cpu_to_le32(i);

> +        if (ns->params.nsid < min_nsid) {

> +            continue;

> +        }

> +        list_ptr[j++] = cpu_to_le32(ns->params.nsid);

>          if (j == data_len / sizeof(uint32_t)) {

>              break;

>          }

>      }

> -    ret = nvme_dma(n, (uint8_t *)list, data_len, DMA_DIRECTION_FROM_DEVICE,

> -                   req);

> -    g_free(list);

> -    return ret;

> +

> +    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);

> +}

> +

> +static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req)

> +{

> +    NvmeNamespace *ns;

> +    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;

> +    uint32_t min_nsid = le32_to_cpu(c->nsid);

> +    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};

> +    static const int data_len = sizeof(list);

> +    uint32_t *list_ptr = (uint32_t *)list;

> +    int i, j = 0;

> +

> +    trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi);

> +

> +    if (c->csi != NVME_CSI_NVM) {

> +        return NVME_INVALID_FIELD | NVME_DNR;

> +    }

> +

> +    for (i = 1; i <= n->num_namespaces; i++) {

> +        ns = nvme_ns(n, i);

> +        if (!ns) {

> +            continue;

> +        }

> +        if (ns->params.nsid < min_nsid) {

> +            continue;

> +        }

> +        list_ptr[j++] = cpu_to_le32(ns->params.nsid);

> +        if (j == data_len / sizeof(uint32_t)) {

> +            break;

> +        }

> +    }

> +

> +    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);

>  }

>  

>  static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)

>  {

>      NvmeIdentify *c = (NvmeIdentify *)&req->cmd;

> +    NvmeNamespace *ns;

>      uint32_t nsid = le32_to_cpu(c->nsid);

> -    uint8_t list[NVME_IDENTIFY_DATA_SIZE];

> -

> -    struct data {

> -        struct {

> -            NvmeIdNsDescr hdr;

> -            uint8_t v[16];

> -        } uuid;

> -    };

> -

> -    struct data *ns_descrs = (struct data *)list;

> +    NvmeIdNsDescr *desc;

> +    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};

> +    static const int data_len = sizeof(list);

> +    void *list_ptr = list;

>  

>      trace_pci_nvme_identify_ns_descr_list(nsid);

>  

> -    if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {

> -        return NVME_INVALID_NSID | NVME_DNR;

> -    }

> -

>      if (unlikely(!nvme_ns(n, nsid))) {

>          return NVME_INVALID_FIELD | NVME_DNR;

>      }

>  

> -    memset(list, 0x0, sizeof(list));

> +    ns = nvme_ns(n, nsid);

> +    if (unlikely(!ns)) {

> +        return nvme_rpt_empty_id_struct(n, req);

> +    }

>  

>      /*

>       * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data

> @@ -1597,12 +1667,31 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)

>       * Namespace Identification Descriptor. Add a very basic Namespace UUID

>       * here.

>       */

> -    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;

> -    ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;

> -    stl_be_p(&ns_descrs->uuid.v, nsid);

> +    desc = list_ptr;

> +    desc->nidt = NVME_NIDT_UUID;

> +    desc->nidl = NVME_NIDL_UUID;

> +    list_ptr += sizeof(*desc);

> +    memcpy(list_ptr, ns->params.uuid.data, NVME_NIDL_UUID);

> +    list_ptr += NVME_NIDL_UUID;

>  

> -    return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,

> -                    DMA_DIRECTION_FROM_DEVICE, req);

> +    desc = list_ptr;

> +    desc->nidt = NVME_NIDT_CSI;

> +    desc->nidl = NVME_NIDL_CSI;

> +    list_ptr += sizeof(*desc);

> +    *(uint8_t *)list_ptr = NVME_CSI_NVM;

> +

> +    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);

> +}

> +

> +static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req)

> +{

> +    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};

> +    static const int data_len = sizeof(list);

> +

> +    trace_pci_nvme_identify_cmd_set();

> +

> +    NVME_SET_CSI(*list, NVME_CSI_NVM);

> +    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);

>  }

>  

>  static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)

> @@ -1612,12 +1701,20 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)

>      switch (le32_to_cpu(c->cns)) {

>      case NVME_ID_CNS_NS:

>          return nvme_identify_ns(n, req);

> +    case NVME_ID_CNS_CS_NS:

> +        return nvme_identify_ns_csi(n, req);

>      case NVME_ID_CNS_CTRL:

>          return nvme_identify_ctrl(n, req);

> +    case NVME_ID_CNS_CS_CTRL:

> +        return nvme_identify_ctrl_csi(n, req);

>      case NVME_ID_CNS_NS_ACTIVE_LIST:

>          return nvme_identify_nslist(n, req);

> +    case NVME_ID_CNS_CS_NS_ACTIVE_LIST:

> +        return nvme_identify_nslist_csi(n, req);

>      case NVME_ID_CNS_NS_DESCR_LIST:

>          return nvme_identify_ns_descr_list(n, req);

> +    case NVME_ID_CNS_IO_COMMAND_SET:

> +        return nvme_identify_cmd_set(n, req);

>      default:

>          trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));

>          return NVME_INVALID_FIELD | NVME_DNR;

> @@ -1799,6 +1896,9 @@ defaults:

>              result |= NVME_INTVC_NOCOALESCING;

>          }

>  

> +        break;

> +    case NVME_COMMAND_SET_PROFILE:

> +        result = 0;

>          break;

>      default:

>          result = nvme_feature_default[fid];

> @@ -1939,6 +2039,12 @@ static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)

>          break;

>      case NVME_TIMESTAMP:

>          return nvme_set_feature_timestamp(n, req);

> +    case NVME_COMMAND_SET_PROFILE:

> +        if (dw11 & 0x1ff) {

> +            trace_pci_nvme_err_invalid_iocsci(dw11 & 0x1ff);

> +            return NVME_CMD_SET_CMB_REJECTED | NVME_DNR;

> +        }

> +        break;

>      default:

>          return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;

>      }

> @@ -2222,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,

>          break;

>      case 0x14:  /* CC */

>          trace_pci_nvme_mmio_cfg(data & 0xffffffff);

> +

> +        if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {

> +            if (NVME_CC_EN(n->bar.cc)) {


I just saw this print when doing controller reset on a live system.

Added a debug print:
nvme_write_bar WRITING: 0x0 previous: 0x464061

so the second if-statement has to be:

    if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {

Sorry for introducing the bug in the first place.


Kind regards,
Niklas

> +                NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,

> +                               "changing selected command set when enabled");

> +            } else {

> +                switch (NVME_CC_CSS(data)) {

> +                case CSS_NVM_ONLY:

> +                    trace_pci_nvme_css_nvm_cset_selected_by_host(data &

> +                                                                 0xffffffff);

> +                    break;

> +                case CSS_CSI:

> +                    NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);

> +                    trace_pci_nvme_css_all_csets_sel_by_host(data & 0xffffffff);

> +                    break;

> +                case CSS_ADMIN_ONLY:

> +                    break;

> +                default:

> +                    NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,

> +                                   "unknown value in CC.CSS field");

> +                }

> +            }

> +        }

> +

>          /* Windows first sends data, then sends enable bit */

>          if (!NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc) &&

>              !NVME_CC_SHN(data) && !NVME_CC_SHN(n->bar.cc))

> @@ -2810,7 +2940,11 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)

>      NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);

>      NVME_CAP_SET_CQR(n->bar.cap, 1);

>      NVME_CAP_SET_TO(n->bar.cap, 0xf);

> -    NVME_CAP_SET_CSS(n->bar.cap, 1);

> +    /*

> +     * The device now always supports NS Types, but all commands

> +     * that support CSI field will only handle NVM Command Set.

> +     */

> +    NVME_CAP_SET_CSS(n->bar.cap, (CAP_CSS_NVM | CAP_CSS_CSI_SUPP));

>      NVME_CAP_SET_MPSMAX(n->bar.cap, 4);

>  

>      n->bar.vs = NVME_SPEC_VER;

> -- 

> 2.21.0

>
Keith Busch Oct. 1, 2020, 3:29 p.m. UTC | #4
On Thu, Oct 01, 2020 at 11:22:46AM +0000, Niklas Cassel wrote:
> On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:

> > From: Niklas Cassel <niklas.cassel@wdc.com>

> > @@ -2222,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,

> >          break;

> >      case 0x14:  /* CC */

> >          trace_pci_nvme_mmio_cfg(data & 0xffffffff);

> > +

> > +        if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {

> > +            if (NVME_CC_EN(n->bar.cc)) {

> 

> I just saw this print when doing controller reset on a live system.

> 

> Added a debug print:

> nvme_write_bar WRITING: 0x0 previous: 0x464061

> 

> so the second if-statement has to be:

> 

>     if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {

> 

> Sorry for introducing the bug in the first place.


No worries.

I don't think the check should be here at all, really. The only check for valid
CSS should be in nvme_start_ctrl(), which I posted yesterday.
 
> > +                NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,

> > +                               "changing selected command set when enabled");

> > +            } else {

> > +                switch (NVME_CC_CSS(data)) {

> > +                case CSS_NVM_ONLY:

> > +                    trace_pci_nvme_css_nvm_cset_selected_by_host(data &

> > +                                                                 0xffffffff);

> > +                    break;

> > +                case CSS_CSI:

> > +                    NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);

> > +                    trace_pci_nvme_css_all_csets_sel_by_host(data & 0xffffffff);

> > +                    break;

> > +                case CSS_ADMIN_ONLY:

> > +                    break;

> > +                default:

> > +                    NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,

> > +                                   "unknown value in CC.CSS field");

> > +                }

> > +            }

> > +        }
Niklas Cassel Oct. 1, 2020, 3:50 p.m. UTC | #5
On Thu, Oct 01, 2020 at 09:29:22AM -0600, Keith Busch wrote:
> On Thu, Oct 01, 2020 at 11:22:46AM +0000, Niklas Cassel wrote:

> > On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:

> > > From: Niklas Cassel <niklas.cassel@wdc.com>

> > > @@ -2222,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,

> > >          break;

> > >      case 0x14:  /* CC */

> > >          trace_pci_nvme_mmio_cfg(data & 0xffffffff);

> > > +

> > > +        if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {

> > > +            if (NVME_CC_EN(n->bar.cc)) {

> > 

> > I just saw this print when doing controller reset on a live system.

> > 

> > Added a debug print:

> > nvme_write_bar WRITING: 0x0 previous: 0x464061

> > 

> > so the second if-statement has to be:

> > 

> >     if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {

> > 

> > Sorry for introducing the bug in the first place.

> 

> No worries.

> 

> I don't think the check should be here at all, really. The only check for valid

> CSS should be in nvme_start_ctrl(), which I posted yesterday.


The reasoning for this additional check is this:

From CC.CC register description:

"This field shall only be changed when the controller
is disabled (CC.EN is cleared to ‘0’)."

In the QEMU model, we have functions, e.g. nvme_cmd_effects(),
that uses n->bar.cc "at runtime".

So I don't think that simply checking for valid CSS in
nvme_start_ctrl() is sufficient.

Thoughts?


Kind regards,
Niklas

>  

> > > +                NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,

> > > +                               "changing selected command set when enabled");

> > > +            } else {

> > > +                switch (NVME_CC_CSS(data)) {

> > > +                case CSS_NVM_ONLY:

> > > +                    trace_pci_nvme_css_nvm_cset_selected_by_host(data &

> > > +                                                                 0xffffffff);

> > > +                    break;

> > > +                case CSS_CSI:

> > > +                    NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);

> > > +                    trace_pci_nvme_css_all_csets_sel_by_host(data & 0xffffffff);

> > > +                    break;

> > > +                case CSS_ADMIN_ONLY:

> > > +                    break;

> > > +                default:

> > > +                    NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,

> > > +                                   "unknown value in CC.CSS field");

> > > +                }

> > > +            }

> > > +        }
Keith Busch Oct. 1, 2020, 3:59 p.m. UTC | #6
On Thu, Oct 01, 2020 at 03:50:35PM +0000, Niklas Cassel wrote:
> On Thu, Oct 01, 2020 at 09:29:22AM -0600, Keith Busch wrote:

> > On Thu, Oct 01, 2020 at 11:22:46AM +0000, Niklas Cassel wrote:

> > > On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:

> > > > From: Niklas Cassel <niklas.cassel@wdc.com>

> > > > @@ -2222,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,

> > > >          break;

> > > >      case 0x14:  /* CC */

> > > >          trace_pci_nvme_mmio_cfg(data & 0xffffffff);

> > > > +

> > > > +        if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {

> > > > +            if (NVME_CC_EN(n->bar.cc)) {

> > > 

> > > I just saw this print when doing controller reset on a live system.

> > > 

> > > Added a debug print:

> > > nvme_write_bar WRITING: 0x0 previous: 0x464061

> > > 

> > > so the second if-statement has to be:

> > > 

> > >     if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {

> > > 

> > > Sorry for introducing the bug in the first place.

> > 

> > No worries.

> > 

> > I don't think the check should be here at all, really. The only check for valid

> > CSS should be in nvme_start_ctrl(), which I posted yesterday.

> 

> The reasoning for this additional check is this:

> 

> From CC.CC register description:

> 

> "This field shall only be changed when the controller

> is disabled (CC.EN is cleared to ‘0’)."

> 

> In the QEMU model, we have functions, e.g. nvme_cmd_effects(),

> that uses n->bar.cc "at runtime".

> 

> So I don't think that simply checking for valid CSS in

> nvme_start_ctrl() is sufficient.

> 

> Thoughts?


The qemu controller accepts host register writes only for valid enable
and shutdown  bit transitions. Or at least it should. If not, then we
need to fix that, but that's not specific to the CSS bits.
Niklas Cassel Oct. 1, 2020, 4:23 p.m. UTC | #7
On Thu, Oct 01, 2020 at 08:59:31AM -0700, Keith Busch wrote:
> On Thu, Oct 01, 2020 at 03:50:35PM +0000, Niklas Cassel wrote:

> > On Thu, Oct 01, 2020 at 09:29:22AM -0600, Keith Busch wrote:

> > > On Thu, Oct 01, 2020 at 11:22:46AM +0000, Niklas Cassel wrote:

> > > > On Mon, Sep 28, 2020 at 11:35:19AM +0900, Dmitry Fomichev wrote:

> > > > > From: Niklas Cassel <niklas.cassel@wdc.com>

> > > > > @@ -2222,6 +2328,30 @@ static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,

> > > > >          break;

> > > > >      case 0x14:  /* CC */

> > > > >          trace_pci_nvme_mmio_cfg(data & 0xffffffff);

> > > > > +

> > > > > +        if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {

> > > > > +            if (NVME_CC_EN(n->bar.cc)) {

> > > > 

> > > > I just saw this print when doing controller reset on a live system.

> > > > 

> > > > Added a debug print:

> > > > nvme_write_bar WRITING: 0x0 previous: 0x464061

> > > > 

> > > > so the second if-statement has to be:

> > > > 

> > > >     if (NVME_CC_EN(n->bar.cc) && NVME_CC_EN(data)) {

> > > > 

> > > > Sorry for introducing the bug in the first place.

> > > 

> > > No worries.

> > > 

> > > I don't think the check should be here at all, really. The only check for valid

> > > CSS should be in nvme_start_ctrl(), which I posted yesterday.

> > 

> > The reasoning for this additional check is this:

> > 

> > From CC.CC register description:

> > 

> > "This field shall only be changed when the controller

> > is disabled (CC.EN is cleared to ‘0’)."

> > 

> > In the QEMU model, we have functions, e.g. nvme_cmd_effects(),

> > that uses n->bar.cc "at runtime".

> > 

> > So I don't think that simply checking for valid CSS in

> > nvme_start_ctrl() is sufficient.

> > 

> > Thoughts?

> 

> The qemu controller accepts host register writes only for valid enable

> and shutdown  bit transitions. Or at least it should. If not, then we

> need to fix that, but that's not specific to the CSS bits.


I simply added the second if-statement, (if (NVME_CC_EN(n->bar.cc))),
the rest of the NVME_CC_CSS was written by someone else.

But I see your point, all of this code:

        if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
            if (NVME_CC_EN(n->bar.cc)) {
                NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,
                               "changing selected command set when enabled");
            } else {
                switch (NVME_CC_CSS(data)) {
                case CSS_NVM_ONLY:
                    trace_pci_nvme_css_nvm_cset_selected_by_host(data &
                                                                 0xffffffff);
                break;
                case CSS_CSI:
                    NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);
                    trace_pci_nvme_css_all_csets_sel_by_host(data &
                                                             0xffffffff);
                    break;
                case CSS_ADMIN_ONLY:
                    break;
                default:
                    NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,
                                   "unknown value in CC.CSS field");
                }
            }
        }

should simply be dropped.

No need to call NVME_SET_CC_CSS() explicitly.

CC.CSS bit will be set futher down in this function anyway:

        if (NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc)) {
            n->bar.cc = data;


Kind regards,
Niklas
Keith Busch Oct. 1, 2020, 5:08 p.m. UTC | #8
On Thu, Oct 01, 2020 at 04:23:56PM +0000, Niklas Cassel wrote:
> But I see your point, all of this code:

> 

>         if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {

>             if (NVME_CC_EN(n->bar.cc)) {

>                 NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,

>                                "changing selected command set when enabled");

>             } else {

>                 switch (NVME_CC_CSS(data)) {

>                 case CSS_NVM_ONLY:

>                     trace_pci_nvme_css_nvm_cset_selected_by_host(data &

>                                                                  0xffffffff);

>                 break;

>                 case CSS_CSI:

>                     NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);

>                     trace_pci_nvme_css_all_csets_sel_by_host(data &

>                                                              0xffffffff);

>                     break;

>                 case CSS_ADMIN_ONLY:

>                     break;

>                 default:

>                     NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,

>                                    "unknown value in CC.CSS field");

>                 }

>             }

>         }

> 

> should simply be dropped.

> 

> No need to call NVME_SET_CC_CSS() explicitly.

> 

> CC.CSS bit will be set futher down in this function anyway:

> 

>         if (NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc)) {

>             n->bar.cc = data;


Yep, that's how I saw it too. I folded it all into a rebase here for
this particular patch:

  http://git.infradead.org/qemu-nvme.git/commitdiff/45157cab2e700155b05f0bd28533f73d7e399ab8?hp=2015774a010011a9e8d2ab5291fd8d747f60471e

It depends on the prep patches I sent yesterday, which seem pretty
straight forward. I'll just wait another day before committing that
stuff and other fixes to the nvme-next branch. But if you want to get a
head start on the ZNS enabling parts, what I have in mind is in the
branch from the above link.
Klaus Jensen Oct. 1, 2020, 10:15 p.m. UTC | #9
On Sep 28 11:35, Dmitry Fomichev wrote:
> From: Niklas Cassel <niklas.cassel@wdc.com>

> 

> Namespace Types introduce a new command set, "I/O Command Sets",

> that allows the host to retrieve the command sets associated with

> a namespace. Introduce support for the command set and enable

> detection for the NVM Command Set.

> 

> The new workflows for identify commands rely heavily on zero-filled

> identify structs. E.g., certain CNS commands are defined to return

> a zero-filled identify struct when an inactive namespace NSID

> is supplied.

> 

> Add a helper function in order to avoid code duplication when

> reporting zero-filled identify structures.

> 

> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

> ---

>  hw/block/nvme-ns.c |   3 +

>  hw/block/nvme.c    | 210 +++++++++++++++++++++++++++++++++++++--------

>  2 files changed, 175 insertions(+), 38 deletions(-)

> 

> diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c

> index bbd7879492..31b7f986c3 100644

> --- a/hw/block/nvme-ns.c

> +++ b/hw/block/nvme-ns.c


The following looks like a rebase gone wrong.

There are some redundant checks and wrong return values.

>  static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)

>  {

>      NvmeIdentify *c = (NvmeIdentify *)&req->cmd;

> +    NvmeNamespace *ns;

>      uint32_t nsid = le32_to_cpu(c->nsid);

> -    uint8_t list[NVME_IDENTIFY_DATA_SIZE];

> -

> -    struct data {

> -        struct {

> -            NvmeIdNsDescr hdr;

> -            uint8_t v[16];

> -        } uuid;

> -    };

> -

> -    struct data *ns_descrs = (struct data *)list;

> +    NvmeIdNsDescr *desc;

> +    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};

> +    static const int data_len = sizeof(list);

> +    void *list_ptr = list;


Oh maaan, please do not replace my nicely cleaned up code with pointer
arithmetics :(

>  

>      trace_pci_nvme_identify_ns_descr_list(nsid);

>  

> -    if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {

> -        return NVME_INVALID_NSID | NVME_DNR;

> -    }

> -


This removal looks wrong.

>      if (unlikely(!nvme_ns(n, nsid))) {

>          return NVME_INVALID_FIELD | NVME_DNR;

>      }

>  

> -    memset(list, 0x0, sizeof(list));

> +    ns = nvme_ns(n, nsid);

> +    if (unlikely(!ns)) {

> +        return nvme_rpt_empty_id_struct(n, req);

> +    }

>  


And this doesnt look like it belongs (its checked just a few lines
before, and it returns an error status as it should).

>      /*

>       * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data

> @@ -1597,12 +1667,31 @@ static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)

>       * Namespace Identification Descriptor. Add a very basic Namespace UUID

>       * here.

>       */

> -    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;

> -    ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;

> -    stl_be_p(&ns_descrs->uuid.v, nsid);

> +    desc = list_ptr;

> +    desc->nidt = NVME_NIDT_UUID;

> +    desc->nidl = NVME_NIDL_UUID;

> +    list_ptr += sizeof(*desc);

> +    memcpy(list_ptr, ns->params.uuid.data, NVME_NIDL_UUID);

> +    list_ptr += NVME_NIDL_UUID;

>  

> -    return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,

> -                    DMA_DIRECTION_FROM_DEVICE, req);

> +    desc = list_ptr;

> +    desc->nidt = NVME_NIDT_CSI;

> +    desc->nidl = NVME_NIDL_CSI;

> +    list_ptr += sizeof(*desc);

> +    *(uint8_t *)list_ptr = NVME_CSI_NVM;

> +

> +    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);

> +}

> +
Dmitry Fomichev Oct. 1, 2020, 10:30 p.m. UTC | #10
> -----Original Message-----

> From: Klaus Jensen <its@irrelevant.dk>

> Sent: Thursday, October 1, 2020 6:15 PM

> To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>

> Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen

> <k.jensen@samsung.com>; Kevin Wolf <kwolf@redhat.com>; Philippe

> Mathieu-Daudé <philmd@redhat.com>; Maxim Levitsky

> <mlevitsk@redhat.com>; Fam Zheng <fam@euphon.net>; Niklas Cassel

> <Niklas.Cassel@wdc.com>; Damien Le Moal <Damien.LeMoal@wdc.com>;

> qemu-block@nongnu.org; qemu-devel@nongnu.org; Alistair Francis

> <Alistair.Francis@wdc.com>; Matias Bjorling <Matias.Bjorling@wdc.com>

> Subject: Re: [PATCH v5 05/14] hw/block/nvme: Add support for Namespace

> Types

> 

> On Sep 28 11:35, Dmitry Fomichev wrote:

> > From: Niklas Cassel <niklas.cassel@wdc.com>

> >

> > Namespace Types introduce a new command set, "I/O Command Sets",

> > that allows the host to retrieve the command sets associated with

> > a namespace. Introduce support for the command set and enable

> > detection for the NVM Command Set.

> >

> > The new workflows for identify commands rely heavily on zero-filled

> > identify structs. E.g., certain CNS commands are defined to return

> > a zero-filled identify struct when an inactive namespace NSID

> > is supplied.

> >

> > Add a helper function in order to avoid code duplication when

> > reporting zero-filled identify structures.

> >

> > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>

> > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>

> > ---

> >  hw/block/nvme-ns.c |   3 +

> >  hw/block/nvme.c    | 210 +++++++++++++++++++++++++++++++++++++-

> -------

> >  2 files changed, 175 insertions(+), 38 deletions(-)

> >

> > diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c

> > index bbd7879492..31b7f986c3 100644

> > --- a/hw/block/nvme-ns.c

> > +++ b/hw/block/nvme-ns.c

> 

> The following looks like a rebase gone wrong.

> 

> There are some redundant checks and wrong return values.

> 

> >  static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest

> *req)

> >  {

> >      NvmeIdentify *c = (NvmeIdentify *)&req->cmd;

> > +    NvmeNamespace *ns;

> >      uint32_t nsid = le32_to_cpu(c->nsid);

> > -    uint8_t list[NVME_IDENTIFY_DATA_SIZE];

> > -

> > -    struct data {

> > -        struct {

> > -            NvmeIdNsDescr hdr;

> > -            uint8_t v[16];

> > -        } uuid;

> > -    };

> > -

> > -    struct data *ns_descrs = (struct data *)list;

> > +    NvmeIdNsDescr *desc;

> > +    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};

> > +    static const int data_len = sizeof(list);

> > +    void *list_ptr = list;

> 

> Oh maaan, please do not replace my nicely cleaned up code with pointer

> arithmetics :(

> 

> >

> >      trace_pci_nvme_identify_ns_descr_list(nsid);

> >

> > -    if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {

> > -        return NVME_INVALID_NSID | NVME_DNR;

> > -    }

> > -

> 

> This removal looks wrong.

> 

> >      if (unlikely(!nvme_ns(n, nsid))) {

> >          return NVME_INVALID_FIELD | NVME_DNR;

> >      }

> >

> > -    memset(list, 0x0, sizeof(list));

> > +    ns = nvme_ns(n, nsid);

> > +    if (unlikely(!ns)) {

> > +        return nvme_rpt_empty_id_struct(n, req);

> > +    }

> >

> 

> And this doesnt look like it belongs (its checked just a few lines

> before, and it returns an error status as it should).

> 


This and above looks like a merge error, I am correcting this along
with moving UUID calculation to a separate commit.

> >      /*

> >       * Because the NGUID and EUI64 fields are 0 in the Identify Namespace

> data

> > @@ -1597,12 +1667,31 @@ static uint16_t

> nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)

> >       * Namespace Identification Descriptor. Add a very basic Namespace

> UUID

> >       * here.

> >       */

> > -    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;

> > -    ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;

> > -    stl_be_p(&ns_descrs->uuid.v, nsid);

> > +    desc = list_ptr;

> > +    desc->nidt = NVME_NIDT_UUID;

> > +    desc->nidl = NVME_NIDL_UUID;

> > +    list_ptr += sizeof(*desc);

> > +    memcpy(list_ptr, ns->params.uuid.data, NVME_NIDL_UUID);

> > +    list_ptr += NVME_NIDL_UUID;

> >

> > -    return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,

> > -                    DMA_DIRECTION_FROM_DEVICE, req);

> > +    desc = list_ptr;

> > +    desc->nidt = NVME_NIDT_CSI;

> > +    desc->nidl = NVME_NIDL_CSI;

> > +    list_ptr += sizeof(*desc);

> > +    *(uint8_t *)list_ptr = NVME_CSI_NVM;

> > +

> > +    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE,

> req);

> > +}

> > +
diff mbox series

Patch

diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c
index bbd7879492..31b7f986c3 100644
--- a/hw/block/nvme-ns.c
+++ b/hw/block/nvme-ns.c
@@ -40,6 +40,9 @@  static void nvme_ns_init(NvmeNamespace *ns)
 
     id_ns->nsze = cpu_to_le64(nvme_ns_nlbas(ns));
 
+    ns->params.csi = NVME_CSI_NVM;
+    qemu_uuid_generate(&ns->params.uuid); /* TODO make UUIDs persistent */
+
     /* no thin provisioning */
     id_ns->ncap = id_ns->nsze;
     id_ns->nuse = id_ns->ncap;
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 29fa005fa2..4ec1ddc90a 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1495,6 +1495,13 @@  static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
     return NVME_SUCCESS;
 }
 
+static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req)
+{
+    uint8_t id[NVME_IDENTIFY_DATA_SIZE] = {};
+
+    return nvme_dma(n, id, sizeof(id), DMA_DIRECTION_FROM_DEVICE, req);
+}
+
 static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_pci_nvme_identify_ctrl();
@@ -1503,11 +1510,23 @@  static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static uint16_t nvme_identify_ctrl_csi(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+
+    trace_pci_nvme_identify_ctrl_csi(c->csi);
+
+    if (c->csi == NVME_CSI_NVM) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
+    return NVME_INVALID_FIELD | NVME_DNR;
+}
+
 static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
-    NvmeIdNs *id_ns, inactive = { 0 };
     uint32_t nsid = le32_to_cpu(c->nsid);
 
     trace_pci_nvme_identify_ns(nsid);
@@ -1518,23 +1537,46 @@  static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req)
 
     ns = nvme_ns(n, nsid);
     if (unlikely(!ns)) {
-        id_ns = &inactive;
-    } else {
-        id_ns = &ns->id_ns;
+        return nvme_rpt_empty_id_struct(n, req);
     }
 
-    return nvme_dma(n, (uint8_t *)id_ns, sizeof(NvmeIdNs),
+    return nvme_dma(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs),
                     DMA_DIRECTION_FROM_DEVICE, req);
 }
 
+static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeNamespace *ns;
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+    uint32_t nsid = le32_to_cpu(c->nsid);
+
+    trace_pci_nvme_identify_ns_csi(nsid, c->csi);
+
+    if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
+        return NVME_INVALID_NSID | NVME_DNR;
+    }
+
+    ns = nvme_ns(n, nsid);
+    if (unlikely(!ns)) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
+    if (c->csi == NVME_CSI_NVM) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
+
+    return NVME_INVALID_FIELD | NVME_DNR;
+}
+
 static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
 {
+    NvmeNamespace *ns;
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
-    static const int data_len = NVME_IDENTIFY_DATA_SIZE;
     uint32_t min_nsid = le32_to_cpu(c->nsid);
-    uint32_t *list;
-    uint16_t ret;
-    int j = 0;
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
+    static const int data_len = sizeof(list);
+    uint32_t *list_ptr = (uint32_t *)list;
+    int i, j = 0;
 
     trace_pci_nvme_identify_nslist(min_nsid);
 
@@ -1548,48 +1590,76 @@  static uint16_t nvme_identify_nslist(NvmeCtrl *n, NvmeRequest *req)
         return NVME_INVALID_NSID | NVME_DNR;
     }
 
-    list = g_malloc0(data_len);
-    for (int i = 1; i <= n->num_namespaces; i++) {
-        if (i <= min_nsid || !nvme_ns(n, i)) {
+    for (i = 1; i <= n->num_namespaces; i++) {
+        ns = nvme_ns(n, i);
+        if (!ns) {
             continue;
         }
-        list[j++] = cpu_to_le32(i);
+        if (ns->params.nsid < min_nsid) {
+            continue;
+        }
+        list_ptr[j++] = cpu_to_le32(ns->params.nsid);
         if (j == data_len / sizeof(uint32_t)) {
             break;
         }
     }
-    ret = nvme_dma(n, (uint8_t *)list, data_len, DMA_DIRECTION_FROM_DEVICE,
-                   req);
-    g_free(list);
-    return ret;
+
+    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
+}
+
+static uint16_t nvme_identify_nslist_csi(NvmeCtrl *n, NvmeRequest *req)
+{
+    NvmeNamespace *ns;
+    NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+    uint32_t min_nsid = le32_to_cpu(c->nsid);
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
+    static const int data_len = sizeof(list);
+    uint32_t *list_ptr = (uint32_t *)list;
+    int i, j = 0;
+
+    trace_pci_nvme_identify_nslist_csi(min_nsid, c->csi);
+
+    if (c->csi != NVME_CSI_NVM) {
+        return NVME_INVALID_FIELD | NVME_DNR;
+    }
+
+    for (i = 1; i <= n->num_namespaces; i++) {
+        ns = nvme_ns(n, i);
+        if (!ns) {
+            continue;
+        }
+        if (ns->params.nsid < min_nsid) {
+            continue;
+        }
+        list_ptr[j++] = cpu_to_le32(ns->params.nsid);
+        if (j == data_len / sizeof(uint32_t)) {
+            break;
+        }
+    }
+
+    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeIdentify *c = (NvmeIdentify *)&req->cmd;
+    NvmeNamespace *ns;
     uint32_t nsid = le32_to_cpu(c->nsid);
-    uint8_t list[NVME_IDENTIFY_DATA_SIZE];
-
-    struct data {
-        struct {
-            NvmeIdNsDescr hdr;
-            uint8_t v[16];
-        } uuid;
-    };
-
-    struct data *ns_descrs = (struct data *)list;
+    NvmeIdNsDescr *desc;
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
+    static const int data_len = sizeof(list);
+    void *list_ptr = list;
 
     trace_pci_nvme_identify_ns_descr_list(nsid);
 
-    if (!nvme_nsid_valid(n, nsid) || nsid == NVME_NSID_BROADCAST) {
-        return NVME_INVALID_NSID | NVME_DNR;
-    }
-
     if (unlikely(!nvme_ns(n, nsid))) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    memset(list, 0x0, sizeof(list));
+    ns = nvme_ns(n, nsid);
+    if (unlikely(!ns)) {
+        return nvme_rpt_empty_id_struct(n, req);
+    }
 
     /*
      * Because the NGUID and EUI64 fields are 0 in the Identify Namespace data
@@ -1597,12 +1667,31 @@  static uint16_t nvme_identify_ns_descr_list(NvmeCtrl *n, NvmeRequest *req)
      * Namespace Identification Descriptor. Add a very basic Namespace UUID
      * here.
      */
-    ns_descrs->uuid.hdr.nidt = NVME_NIDT_UUID;
-    ns_descrs->uuid.hdr.nidl = NVME_NIDL_UUID;
-    stl_be_p(&ns_descrs->uuid.v, nsid);
+    desc = list_ptr;
+    desc->nidt = NVME_NIDT_UUID;
+    desc->nidl = NVME_NIDL_UUID;
+    list_ptr += sizeof(*desc);
+    memcpy(list_ptr, ns->params.uuid.data, NVME_NIDL_UUID);
+    list_ptr += NVME_NIDL_UUID;
 
-    return nvme_dma(n, list, NVME_IDENTIFY_DATA_SIZE,
-                    DMA_DIRECTION_FROM_DEVICE, req);
+    desc = list_ptr;
+    desc->nidt = NVME_NIDT_CSI;
+    desc->nidl = NVME_NIDL_CSI;
+    list_ptr += sizeof(*desc);
+    *(uint8_t *)list_ptr = NVME_CSI_NVM;
+
+    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
+}
+
+static uint16_t nvme_identify_cmd_set(NvmeCtrl *n, NvmeRequest *req)
+{
+    uint8_t list[NVME_IDENTIFY_DATA_SIZE] = {};
+    static const int data_len = sizeof(list);
+
+    trace_pci_nvme_identify_cmd_set();
+
+    NVME_SET_CSI(*list, NVME_CSI_NVM);
+    return nvme_dma(n, list, data_len, DMA_DIRECTION_FROM_DEVICE, req);
 }
 
 static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
@@ -1612,12 +1701,20 @@  static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
     switch (le32_to_cpu(c->cns)) {
     case NVME_ID_CNS_NS:
         return nvme_identify_ns(n, req);
+    case NVME_ID_CNS_CS_NS:
+        return nvme_identify_ns_csi(n, req);
     case NVME_ID_CNS_CTRL:
         return nvme_identify_ctrl(n, req);
+    case NVME_ID_CNS_CS_CTRL:
+        return nvme_identify_ctrl_csi(n, req);
     case NVME_ID_CNS_NS_ACTIVE_LIST:
         return nvme_identify_nslist(n, req);
+    case NVME_ID_CNS_CS_NS_ACTIVE_LIST:
+        return nvme_identify_nslist_csi(n, req);
     case NVME_ID_CNS_NS_DESCR_LIST:
         return nvme_identify_ns_descr_list(n, req);
+    case NVME_ID_CNS_IO_COMMAND_SET:
+        return nvme_identify_cmd_set(n, req);
     default:
         trace_pci_nvme_err_invalid_identify_cns(le32_to_cpu(c->cns));
         return NVME_INVALID_FIELD | NVME_DNR;
@@ -1799,6 +1896,9 @@  defaults:
             result |= NVME_INTVC_NOCOALESCING;
         }
 
+        break;
+    case NVME_COMMAND_SET_PROFILE:
+        result = 0;
         break;
     default:
         result = nvme_feature_default[fid];
@@ -1939,6 +2039,12 @@  static uint16_t nvme_set_feature(NvmeCtrl *n, NvmeRequest *req)
         break;
     case NVME_TIMESTAMP:
         return nvme_set_feature_timestamp(n, req);
+    case NVME_COMMAND_SET_PROFILE:
+        if (dw11 & 0x1ff) {
+            trace_pci_nvme_err_invalid_iocsci(dw11 & 0x1ff);
+            return NVME_CMD_SET_CMB_REJECTED | NVME_DNR;
+        }
+        break;
     default:
         return NVME_FEAT_NOT_CHANGEABLE | NVME_DNR;
     }
@@ -2222,6 +2328,30 @@  static void nvme_write_bar(NvmeCtrl *n, hwaddr offset, uint64_t data,
         break;
     case 0x14:  /* CC */
         trace_pci_nvme_mmio_cfg(data & 0xffffffff);
+
+        if (NVME_CC_CSS(data) != NVME_CC_CSS(n->bar.cc)) {
+            if (NVME_CC_EN(n->bar.cc)) {
+                NVME_GUEST_ERR(pci_nvme_err_change_css_when_enabled,
+                               "changing selected command set when enabled");
+            } else {
+                switch (NVME_CC_CSS(data)) {
+                case CSS_NVM_ONLY:
+                    trace_pci_nvme_css_nvm_cset_selected_by_host(data &
+                                                                 0xffffffff);
+                    break;
+                case CSS_CSI:
+                    NVME_SET_CC_CSS(n->bar.cc, CSS_CSI);
+                    trace_pci_nvme_css_all_csets_sel_by_host(data & 0xffffffff);
+                    break;
+                case CSS_ADMIN_ONLY:
+                    break;
+                default:
+                    NVME_GUEST_ERR(pci_nvme_ub_unknown_css_value,
+                                   "unknown value in CC.CSS field");
+                }
+            }
+        }
+
         /* Windows first sends data, then sends enable bit */
         if (!NVME_CC_EN(data) && !NVME_CC_EN(n->bar.cc) &&
             !NVME_CC_SHN(data) && !NVME_CC_SHN(n->bar.cc))
@@ -2810,7 +2940,11 @@  static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
     NVME_CAP_SET_MQES(n->bar.cap, 0x7ff);
     NVME_CAP_SET_CQR(n->bar.cap, 1);
     NVME_CAP_SET_TO(n->bar.cap, 0xf);
-    NVME_CAP_SET_CSS(n->bar.cap, 1);
+    /*
+     * The device now always supports NS Types, but all commands
+     * that support CSI field will only handle NVM Command Set.
+     */
+    NVME_CAP_SET_CSS(n->bar.cap, (CAP_CSS_NVM | CAP_CSS_CSI_SUPP));
     NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
 
     n->bar.vs = NVME_SPEC_VER;