Message ID | 20200904141956.576630-18-its@irrelevant.dk |
---|---|
State | Superseded |
Headers | show |
Series | hw/block/nvme: multiple namespaces support | expand |
+David in case On 9/4/20 4:19 PM, Klaus Jensen wrote: > From: Klaus Jensen <k.jensen@samsung.com> > > There are two reasons for changing this: > > 1. The nvme device currently uses an internal Intel device id. > > 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: > support multiple namespaces" the controller device no longer has > the quirks that the Linux kernel think it has. > > As the quirks are applied based on pci vendor and device id, change > them to get rid of the quirks. > > To keep backward compatibility, add a new 'x-use-intel-id' parameter to > the nvme device to force use of the Intel vendor and device id. This is > off by default but add a compat property to set this for 5.1 machines > and older. So now what happens if you start a 5.1 machine with a recent kernel? Simply the kernel will use unnecessary quirks, or are there more changes in behavior? > > Signed-off-by: Klaus Jensen <k.jensen@samsung.com> > Reviewed-by: Keith Busch <kbusch@kernel.org> > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> > --- > hw/block/nvme.c | 12 ++++++++++-- > hw/block/nvme.h | 1 + > hw/core/machine.c | 1 + > 3 files changed, 12 insertions(+), 2 deletions(-) > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index 453d3a89d475..8018f8679366 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -2749,6 +2749,15 @@ static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) > > pci_conf[PCI_INTERRUPT_PIN] = 1; > pci_config_set_prog_interface(pci_conf, 0x2); > + > + if (n->params.use_intel_id) { > + pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL); > + pci_config_set_device_id(pci_conf, 0x5846); > + } else { > + pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_REDHAT); > + pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_REDHAT_NVME); > + } > + > pci_config_set_class(pci_conf, PCI_CLASS_STORAGE_EXPRESS); > pcie_endpoint_cap_init(pci_dev, 0x80); > > @@ -2903,6 +2912,7 @@ static Property nvme_props[] = { > DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3), > DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64), > DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7), > + DEFINE_PROP_BOOL("x-use-intel-id", NvmeCtrl, params.use_intel_id, false), > DEFINE_PROP_END_OF_LIST(), > }; > > @@ -2919,8 +2929,6 @@ static void nvme_class_init(ObjectClass *oc, void *data) > pc->realize = nvme_realize; > pc->exit = nvme_exit; > pc->class_id = PCI_CLASS_STORAGE_EXPRESS; > - pc->vendor_id = PCI_VENDOR_ID_INTEL; > - pc->device_id = 0x5845; > pc->revision = 2; > > set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); > diff --git a/hw/block/nvme.h b/hw/block/nvme.h > index 72260f2e8ea9..a734a5e1370d 100644 > --- a/hw/block/nvme.h > +++ b/hw/block/nvme.h > @@ -15,6 +15,7 @@ typedef struct NvmeParams { > uint8_t aerl; > uint32_t aer_max_queued; > uint8_t mdts; > + bool use_intel_id; > } NvmeParams; > > typedef struct NvmeAsyncEvent { > diff --git a/hw/core/machine.c b/hw/core/machine.c > index ea26d612374d..67990232528c 100644 > --- a/hw/core/machine.c > +++ b/hw/core/machine.c > @@ -34,6 +34,7 @@ GlobalProperty hw_compat_5_1[] = { > { "vhost-user-scsi", "num_queues", "1"}, > { "virtio-blk-device", "num-queues", "1"}, > { "virtio-scsi-device", "num_queues", "1"}, > + { "nvme", "x-use-intel-id", "on"}, > }; > const size_t hw_compat_5_1_len = G_N_ELEMENTS(hw_compat_5_1); > >
On Sep 7 04:28, Philippe Mathieu-Daudé wrote: > +David in case > > On 9/4/20 4:19 PM, Klaus Jensen wrote: > > From: Klaus Jensen <k.jensen@samsung.com> > > > > There are two reasons for changing this: > > > > 1. The nvme device currently uses an internal Intel device id. > > > > 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: > > support multiple namespaces" the controller device no longer has > > the quirks that the Linux kernel think it has. > > > > As the quirks are applied based on pci vendor and device id, change > > them to get rid of the quirks. > > > > To keep backward compatibility, add a new 'x-use-intel-id' parameter to > > the nvme device to force use of the Intel vendor and device id. This is > > off by default but add a compat property to set this for 5.1 machines > > and older. > > So now what happens if you start a 5.1 machine with a recent kernel? > Simply the kernel will use unnecessary quirks, or are there more > changes in behavior? > Yes, the kernel will then just apply unneccesary quirks, these are: 1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support anything else than values 0x0 and 0x1 for CNS (Identify Namespace and Identify Namespace). With multiple namespace support, this just means that the kernel will "scan" namespaces instead of using "Active Namespace ID list" (CNS 0x2). 2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a broken Write Zeroes implementation which has since been fixed in commit 9d6459d21a6e ("nvme: fix write zeroes offset and count").
On 9/7/20 9:23 AM, Klaus Jensen wrote: > On Sep 7 04:28, Philippe Mathieu-Daudé wrote: >> +David in case >> >> On 9/4/20 4:19 PM, Klaus Jensen wrote: >>> From: Klaus Jensen <k.jensen@samsung.com> >>> >>> There are two reasons for changing this: >>> >>> 1. The nvme device currently uses an internal Intel device id. >>> >>> 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: >>> support multiple namespaces" the controller device no longer has >>> the quirks that the Linux kernel think it has. >>> >>> As the quirks are applied based on pci vendor and device id, change >>> them to get rid of the quirks. >>> >>> To keep backward compatibility, add a new 'x-use-intel-id' parameter to >>> the nvme device to force use of the Intel vendor and device id. This is >>> off by default but add a compat property to set this for 5.1 machines >>> and older. >> >> So now what happens if you start a 5.1 machine with a recent kernel? >> Simply the kernel will use unnecessary quirks, or are there more >> changes in behavior? >> > > Yes, the kernel will then just apply unneccesary quirks, these are: > > 1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support > anything else than values 0x0 and 0x1 for CNS (Identify Namespace and > Identify Namespace). With multiple namespace support, this just > means that the kernel will "scan" namespaces instead of using > "Active Namespace ID list" (CNS 0x2). > > 2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a > broken Write Zeroes implementation which has since been fixed in > commit 9d6459d21a6e ("nvme: fix write zeroes offset and count"). OK thanks. Can you amend that information in the commit description please? Thanks, Phil.
On Sep 7 10:58, Klaus Jensen wrote: > On Sep 7 10:36, Philippe Mathieu-Daudé wrote: > > On 9/7/20 9:23 AM, Klaus Jensen wrote: > > > On Sep 7 04:28, Philippe Mathieu-Daudé wrote: > > >> +David in case > > >> > > >> On 9/4/20 4:19 PM, Klaus Jensen wrote: > > >>> From: Klaus Jensen <k.jensen@samsung.com> > > >>> > > >>> There are two reasons for changing this: > > >>> > > >>> 1. The nvme device currently uses an internal Intel device id. > > >>> > > >>> 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: > > >>> support multiple namespaces" the controller device no longer has > > >>> the quirks that the Linux kernel think it has. > > >>> > > >>> As the quirks are applied based on pci vendor and device id, change > > >>> them to get rid of the quirks. > > >>> > > >>> To keep backward compatibility, add a new 'x-use-intel-id' parameter to > > >>> the nvme device to force use of the Intel vendor and device id. This is > > >>> off by default but add a compat property to set this for 5.1 machines > > >>> and older. > > >> > > >> So now what happens if you start a 5.1 machine with a recent kernel? > > >> Simply the kernel will use unnecessary quirks, or are there more > > >> changes in behavior? > > >> > > > > > > Yes, the kernel will then just apply unneccesary quirks, these are: > > > > > > 1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support > > > anything else than values 0x0 and 0x1 for CNS (Identify Namespace and > > > Identify Namespace). With multiple namespace support, this just > > > means that the kernel will "scan" namespaces instead of using > > > "Active Namespace ID list" (CNS 0x2). > > > > > > 2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a > > > broken Write Zeroes implementation which has since been fixed in > > > commit 9d6459d21a6e ("nvme: fix write zeroes offset and count"). > > > > OK thanks. Can you amend that information in the commit > > description please? > > > > Yes, absolutely. By the way. Is it correct use of an 'x-' parameter here - since it is something that we might remove in the future? I was unable to find any documentation on the purpose of the 'x-' prefix, but I was guessing it was for stuff like this.
On Sep 7 11:37, Dr. David Alan Gilbert wrote: > * Philippe Mathieu-Daudé (philmd@redhat.com) wrote: > > +David in case > > > > On 9/4/20 4:19 PM, Klaus Jensen wrote: > > > From: Klaus Jensen <k.jensen@samsung.com> > > > > > > There are two reasons for changing this: > > > > > > 1. The nvme device currently uses an internal Intel device id. > > > > > > 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: > > > support multiple namespaces" the controller device no longer has > > > the quirks that the Linux kernel think it has. > > > > > > As the quirks are applied based on pci vendor and device id, change > > > them to get rid of the quirks. > > > > > > To keep backward compatibility, add a new 'x-use-intel-id' parameter to > > > the nvme device to force use of the Intel vendor and device id. This is > > > off by default but add a compat property to set this for 5.1 machines > > > and older. > > > > So now what happens if you start a 5.1 machine with a recent kernel? > > Simply the kernel will use unnecessary quirks, or are there more > > changes in behavior? > > Seems reasonable to me...but... > > > > > > > Signed-off-by: Klaus Jensen <k.jensen@samsung.com> > > > Reviewed-by: Keith Busch <kbusch@kernel.org> > > > Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> > > > --- > > > hw/block/nvme.c | 12 ++++++++++-- > > > hw/block/nvme.h | 1 + > > > hw/core/machine.c | 1 + > > > 3 files changed, 12 insertions(+), 2 deletions(-) > > > > > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > > > index 453d3a89d475..8018f8679366 100644 > > > --- a/hw/block/nvme.c > > > +++ b/hw/block/nvme.c > > > @@ -2749,6 +2749,15 @@ static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) > > > > > > pci_conf[PCI_INTERRUPT_PIN] = 1; > > > pci_config_set_prog_interface(pci_conf, 0x2); > > > + > > > + if (n->params.use_intel_id) { > > > + pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL); > > > + pci_config_set_device_id(pci_conf, 0x5846); > > Wasn't that magic number 5845 down there ? > Argh! My first version of this just bumbed the intel device id and it got left there. Good find! Thank you!
On Mon, Sep 07, 2020 at 01:02:16PM +0200, Klaus Jensen wrote: > On Sep 7 11:52, Dr. David Alan Gilbert wrote: > > > It may be best to turn it into a constant in include/hw/pci/pci_ids.h if > > it corresponds to some real Intel device. > > > > Yes, but that is just the thing - it does not correspond to an > officially allocated device; which is why I think we should leave it out > of pci_ids.h. > > The end goal is to get rid of its use in the code by deprecating the > use-intel-id parameter in the future. I guess the parameter should just > be deprecated immediately? Then we can get rid of it in, what, 5.4? It's not a real device yet, but it likely will not be an nvme device once a product does release with this identifier. There may be trouble with driver binding when that happens, so deprecating it here sooner than later is my preference.
diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 453d3a89d475..8018f8679366 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -2749,6 +2749,15 @@ static void nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) pci_conf[PCI_INTERRUPT_PIN] = 1; pci_config_set_prog_interface(pci_conf, 0x2); + + if (n->params.use_intel_id) { + pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_INTEL); + pci_config_set_device_id(pci_conf, 0x5846); + } else { + pci_config_set_vendor_id(pci_conf, PCI_VENDOR_ID_REDHAT); + pci_config_set_device_id(pci_conf, PCI_DEVICE_ID_REDHAT_NVME); + } + pci_config_set_class(pci_conf, PCI_CLASS_STORAGE_EXPRESS); pcie_endpoint_cap_init(pci_dev, 0x80); @@ -2903,6 +2912,7 @@ static Property nvme_props[] = { DEFINE_PROP_UINT8("aerl", NvmeCtrl, params.aerl, 3), DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64), DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7), + DEFINE_PROP_BOOL("x-use-intel-id", NvmeCtrl, params.use_intel_id, false), DEFINE_PROP_END_OF_LIST(), }; @@ -2919,8 +2929,6 @@ static void nvme_class_init(ObjectClass *oc, void *data) pc->realize = nvme_realize; pc->exit = nvme_exit; pc->class_id = PCI_CLASS_STORAGE_EXPRESS; - pc->vendor_id = PCI_VENDOR_ID_INTEL; - pc->device_id = 0x5845; pc->revision = 2; set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); diff --git a/hw/block/nvme.h b/hw/block/nvme.h index 72260f2e8ea9..a734a5e1370d 100644 --- a/hw/block/nvme.h +++ b/hw/block/nvme.h @@ -15,6 +15,7 @@ typedef struct NvmeParams { uint8_t aerl; uint32_t aer_max_queued; uint8_t mdts; + bool use_intel_id; } NvmeParams; typedef struct NvmeAsyncEvent { diff --git a/hw/core/machine.c b/hw/core/machine.c index ea26d612374d..67990232528c 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -34,6 +34,7 @@ GlobalProperty hw_compat_5_1[] = { { "vhost-user-scsi", "num_queues", "1"}, { "virtio-blk-device", "num-queues", "1"}, { "virtio-scsi-device", "num_queues", "1"}, + { "nvme", "x-use-intel-id", "on"}, }; const size_t hw_compat_5_1_len = G_N_ELEMENTS(hw_compat_5_1);