Message ID | 20200913221436.22844-1-dmitry.fomichev@wdc.com |
---|---|
Headers | show |
Series | hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set | expand |
On Sep 14 07:14, Dmitry Fomichev wrote: > From: Ajay Joshi <ajay.joshi@wdc.com> > > A new write command, Zone Append, is added as a part of Zoned > Namespace Command Set. Upon successful completion of this command, > the controller returns the start LBA of the performed write operation > in cqe.result field. Therefore, the maximum size of this variable > needs to be changed from 32 to 64 bit, consuming the reserved 32 bit > field that follows the result in CQE struct. Since the existing > commands are expected to return a 32 bit LE value, two separate > variables, result32 and result64, are now kept in a union. > > Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> > Reviewed-by: Klaus Jensen <k.jensen@samsung.com> I know that I R-b'ed this, but can this be moved to the namespace types patch, since that is the TP that changes this. Also, I don't think we should touch the tracing in the block driver since it is not aware of namespace types.
> -----Original Message----- > From: Klaus Jensen <its@irrelevant.dk> > Sent: Tuesday, September 15, 2020 3:37 AM > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> > Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen > <k.jensen@samsung.com>; Kevin Wolf <kwolf@redhat.com>; Philippe > Mathieu-Daudé <philmd@redhat.com>; Maxim Levitsky > <mlevitsk@redhat.com>; Fam Zheng <fam@euphon.net>; Niklas Cassel > <Niklas.Cassel@wdc.com>; Damien Le Moal <Damien.LeMoal@wdc.com>; > qemu-block@nongnu.org; qemu-devel@nongnu.org; Alistair Francis > <Alistair.Francis@wdc.com>; Matias Bjorling <Matias.Bjorling@wdc.com> > Subject: Re: [PATCH v3 01/15] hw/block/nvme: Define 64 bit cqe.result > > On Sep 14 07:14, Dmitry Fomichev wrote: > > From: Ajay Joshi <ajay.joshi@wdc.com> > > > > A new write command, Zone Append, is added as a part of Zoned > > Namespace Command Set. Upon successful completion of this command, > > the controller returns the start LBA of the performed write operation > > in cqe.result field. Therefore, the maximum size of this variable > > needs to be changed from 32 to 64 bit, consuming the reserved 32 bit > > field that follows the result in CQE struct. Since the existing > > commands are expected to return a 32 bit LE value, two separate > > variables, result32 and result64, are now kept in a union. > > > > Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> > > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> > > Reviewed-by: Klaus Jensen <k.jensen@samsung.com> > > I know that I R-b'ed this, but can this be moved to the namespace types > patch, since that is the TP that changes this. You probably meant the ZNS patch since result64 is first used there to return ZA starting data LBA. Sure, I can move this stuff to that patch. > > Also, I don't think we should touch the tracing in the block driver > since it is not aware of namespace types. Ok
On Sep 14 07:14, Dmitry Fomichev wrote: > A ZNS drive that is emulated by this module is currently initialized > with all zones Empty upon startup. However, actual ZNS SSDs save the > state and condition of all zones in their internal NVRAM in the event > of power loss. When such a drive is powered up again, it closes or > finishes all zones that were open at the moment of shutdown. Besides > that, the write pointer position as well as the state and condition > of all zones is preserved across power-downs. > > This commit adds the capability to have a persistent zone metadata > to the device. The new optional module property, "zone_file", > is introduced. If added to the command line, this property specifies > the name of the file that stores the zone metadata. If "zone_file" is > omitted, the device will be initialized with all zones empty, the same > as before. > > If zone metadata is configured to be persistent, then zone descriptor > extensions also persist across controller shutdowns. > > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> This doesn't build on mingw. > --- > hw/block/nvme.c | 370 +++++++++++++++++++++++++++++++++++++++++++++--- > hw/block/nvme.h | 37 +++++ > 2 files changed, 386 insertions(+), 21 deletions(-) > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index b49ae83dd5..41f4c0dacd 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -3429,7 +3557,188 @@ static int nvme_init_zone_meta(NvmeCtrl *n, NvmeNamespace *ns, > return 0; > } > > -static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) > +static int nvme_open_zone_file(NvmeCtrl *n, bool *init_meta) > +{ > + struct stat statbuf; > + size_t fsize; > + int ret; > + > + ret = stat(n->params.zone_file, &statbuf); > + if (ret && errno == ENOENT) { > + *init_meta = true; > + } else if (!S_ISREG(statbuf.st_mode)) { > + fprintf(stderr, "%s is not a regular file\n", strerror(errno)); > + return -1; > + } > + > + n->zone_file_fd = open(n->params.zone_file, > + O_RDWR | O_LARGEFILE | O_BINARY | O_CREAT, 644); mode is wrong - I think you meant for it to be octal.
On Sep 15 18:56, Dmitry Fomichev wrote: > > -----Original Message----- > > From: Klaus Jensen <its@irrelevant.dk> > > Sent: Tuesday, September 15, 2020 3:37 AM > > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> > > Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen > > <k.jensen@samsung.com>; Kevin Wolf <kwolf@redhat.com>; Philippe > > Mathieu-Daudé <philmd@redhat.com>; Maxim Levitsky > > <mlevitsk@redhat.com>; Fam Zheng <fam@euphon.net>; Niklas Cassel > > <Niklas.Cassel@wdc.com>; Damien Le Moal <Damien.LeMoal@wdc.com>; > > qemu-block@nongnu.org; qemu-devel@nongnu.org; Alistair Francis > > <Alistair.Francis@wdc.com>; Matias Bjorling <Matias.Bjorling@wdc.com> > > Subject: Re: [PATCH v3 01/15] hw/block/nvme: Define 64 bit cqe.result > > > > On Sep 14 07:14, Dmitry Fomichev wrote: > > > From: Ajay Joshi <ajay.joshi@wdc.com> > > > > > > A new write command, Zone Append, is added as a part of Zoned > > > Namespace Command Set. Upon successful completion of this command, > > > the controller returns the start LBA of the performed write operation > > > in cqe.result field. Therefore, the maximum size of this variable > > > needs to be changed from 32 to 64 bit, consuming the reserved 32 bit > > > field that follows the result in CQE struct. Since the existing > > > commands are expected to return a 32 bit LE value, two separate > > > variables, result32 and result64, are now kept in a union. > > > > > > Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> > > > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> > > > Reviewed-by: Klaus Jensen <k.jensen@samsung.com> > > > > I know that I R-b'ed this, but can this be moved to the namespace types > > patch, since that is the TP that changes this. > > You probably meant the ZNS patch since result64 is first used there to return > ZA starting data LBA. Sure, I can move this stuff to that patch. > No, I actually did mean the NST patch since TP 4056 is the TP that "unreserves" dw1 in the CQE.
> -----Original Message----- > From: Klaus Jensen <its@irrelevant.dk> > Sent: Tuesday, September 15, 2020 3:56 PM > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> > Cc: Fam Zheng <fam@euphon.net>; Kevin Wolf <kwolf@redhat.com>; > Damien Le Moal <Damien.LeMoal@wdc.com>; qemu-block@nongnu.org; > Niklas Cassel <Niklas.Cassel@wdc.com>; Klaus Jensen > <k.jensen@samsung.com>; qemu-devel@nongnu.org; Alistair Francis > <Alistair.Francis@wdc.com>; Keith Busch <kbusch@kernel.org>; Philippe > Mathieu-Daudé <philmd@redhat.com>; Matias Bjorling > <Matias.Bjorling@wdc.com> > Subject: Re: [PATCH v3 01/15] hw/block/nvme: Define 64 bit cqe.result > > On Sep 15 18:56, Dmitry Fomichev wrote: > > > -----Original Message----- > > > From: Klaus Jensen <its@irrelevant.dk> > > > Sent: Tuesday, September 15, 2020 3:37 AM > > > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> > > > Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen > > > <k.jensen@samsung.com>; Kevin Wolf <kwolf@redhat.com>; Philippe > > > Mathieu-Daudé <philmd@redhat.com>; Maxim Levitsky > > > <mlevitsk@redhat.com>; Fam Zheng <fam@euphon.net>; Niklas Cassel > > > <Niklas.Cassel@wdc.com>; Damien Le Moal > <Damien.LeMoal@wdc.com>; > > > qemu-block@nongnu.org; qemu-devel@nongnu.org; Alistair Francis > > > <Alistair.Francis@wdc.com>; Matias Bjorling <Matias.Bjorling@wdc.com> > > > Subject: Re: [PATCH v3 01/15] hw/block/nvme: Define 64 bit cqe.result > > > > > > On Sep 14 07:14, Dmitry Fomichev wrote: > > > > From: Ajay Joshi <ajay.joshi@wdc.com> > > > > > > > > A new write command, Zone Append, is added as a part of Zoned > > > > Namespace Command Set. Upon successful completion of this > command, > > > > the controller returns the start LBA of the performed write operation > > > > in cqe.result field. Therefore, the maximum size of this variable > > > > needs to be changed from 32 to 64 bit, consuming the reserved 32 bit > > > > field that follows the result in CQE struct. Since the existing > > > > commands are expected to return a 32 bit LE value, two separate > > > > variables, result32 and result64, are now kept in a union. > > > > > > > > Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> > > > > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> > > > > Reviewed-by: Klaus Jensen <k.jensen@samsung.com> > > > > > > I know that I R-b'ed this, but can this be moved to the namespace types > > > patch, since that is the TP that changes this. > > > > You probably meant the ZNS patch since result64 is first used there to > return > > ZA starting data LBA. Sure, I can move this stuff to that patch. > > > > No, I actually did mean the NST patch since TP 4056 is the TP that > "unreserves" dw1 in the CQE. It is not necessary to change it in NST patch since result64 field is not used in that patch. But if you insist, I can move it to NST patch :)
> -----Original Message----- > From: Klaus Jensen <its@irrelevant.dk> > Sent: Tuesday, September 15, 2020 3:10 PM > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> > Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen > <k.jensen@samsung.com>; Kevin Wolf <kwolf@redhat.com>; Philippe > Mathieu-Daudé <philmd@redhat.com>; Maxim Levitsky > <mlevitsk@redhat.com>; Fam Zheng <fam@euphon.net>; Niklas Cassel > <Niklas.Cassel@wdc.com>; Damien Le Moal <Damien.LeMoal@wdc.com>; > qemu-block@nongnu.org; qemu-devel@nongnu.org; Alistair Francis > <Alistair.Francis@wdc.com>; Matias Bjorling <Matias.Bjorling@wdc.com> > Subject: Re: [PATCH v3 14/15] hw/block/nvme: Use zone metadata file for > persistence > > On Sep 14 07:14, Dmitry Fomichev wrote: > > A ZNS drive that is emulated by this module is currently initialized > > with all zones Empty upon startup. However, actual ZNS SSDs save the > > state and condition of all zones in their internal NVRAM in the event > > of power loss. When such a drive is powered up again, it closes or > > finishes all zones that were open at the moment of shutdown. Besides > > that, the write pointer position as well as the state and condition > > of all zones is preserved across power-downs. > > > > This commit adds the capability to have a persistent zone metadata > > to the device. The new optional module property, "zone_file", > > is introduced. If added to the command line, this property specifies > > the name of the file that stores the zone metadata. If "zone_file" is > > omitted, the device will be initialized with all zones empty, the same > > as before. > > > > If zone metadata is configured to be persistent, then zone descriptor > > extensions also persist across controller shutdowns. > > > > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> > > This doesn't build on mingw. Thanks for letting me know. I'll try to look into this. Do you cross-compile with mingw64? > > > --- > > hw/block/nvme.c | 370 > +++++++++++++++++++++++++++++++++++++++++++++--- > > hw/block/nvme.h | 37 +++++ > > 2 files changed, 386 insertions(+), 21 deletions(-) > > > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > > index b49ae83dd5..41f4c0dacd 100644 > > --- a/hw/block/nvme.c > > +++ b/hw/block/nvme.c > > @@ -3429,7 +3557,188 @@ static int nvme_init_zone_meta(NvmeCtrl *n, > NvmeNamespace *ns, > > return 0; > > } > > > > -static void nvme_zoned_init_ctrl(NvmeCtrl *n, Error **errp) > > +static int nvme_open_zone_file(NvmeCtrl *n, bool *init_meta) > > +{ > > + struct stat statbuf; > > + size_t fsize; > > + int ret; > > + > > + ret = stat(n->params.zone_file, &statbuf); > > + if (ret && errno == ENOENT) { > > + *init_meta = true; > > + } else if (!S_ISREG(statbuf.st_mode)) { > > + fprintf(stderr, "%s is not a regular file\n", strerror(errno)); > > + return -1; > > + } > > + > > + n->zone_file_fd = open(n->params.zone_file, > > + O_RDWR | O_LARGEFILE | O_BINARY | O_CREAT, 644); > > mode is wrong - I think you meant for it to be octal. Nice catch, needs to be 0644...
On Sep 15 20:44, Dmitry Fomichev wrote: > > -----Original Message----- > > From: Klaus Jensen <its@irrelevant.dk> > > Sent: Tuesday, September 15, 2020 3:10 PM > > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> > > Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen > > <k.jensen@samsung.com>; Kevin Wolf <kwolf@redhat.com>; Philippe > > Mathieu-Daudé <philmd@redhat.com>; Maxim Levitsky > > <mlevitsk@redhat.com>; Fam Zheng <fam@euphon.net>; Niklas Cassel > > <Niklas.Cassel@wdc.com>; Damien Le Moal <Damien.LeMoal@wdc.com>; > > qemu-block@nongnu.org; qemu-devel@nongnu.org; Alistair Francis > > <Alistair.Francis@wdc.com>; Matias Bjorling <Matias.Bjorling@wdc.com> > > Subject: Re: [PATCH v3 14/15] hw/block/nvme: Use zone metadata file for > > persistence > > > > On Sep 14 07:14, Dmitry Fomichev wrote: > > > A ZNS drive that is emulated by this module is currently initialized > > > with all zones Empty upon startup. However, actual ZNS SSDs save the > > > state and condition of all zones in their internal NVRAM in the event > > > of power loss. When such a drive is powered up again, it closes or > > > finishes all zones that were open at the moment of shutdown. Besides > > > that, the write pointer position as well as the state and condition > > > of all zones is preserved across power-downs. > > > > > > This commit adds the capability to have a persistent zone metadata > > > to the device. The new optional module property, "zone_file", > > > is introduced. If added to the command line, this property specifies > > > the name of the file that stores the zone metadata. If "zone_file" is > > > omitted, the device will be initialized with all zones empty, the same > > > as before. > > > > > > If zone metadata is configured to be persistent, then zone descriptor > > > extensions also persist across controller shutdowns. > > > > > > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> > > > > This doesn't build on mingw. > > Thanks for letting me know. I'll try to look into this. Do you cross-compile > with mingw64? Yup, make docker-test-mingw@fedora
On Sep 15 20:44, Dmitry Fomichev wrote: > > -----Original Message----- > > From: Klaus Jensen <its@irrelevant.dk> > > Sent: Tuesday, September 15, 2020 3:56 PM > > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> > > Cc: Fam Zheng <fam@euphon.net>; Kevin Wolf <kwolf@redhat.com>; > > Damien Le Moal <Damien.LeMoal@wdc.com>; qemu-block@nongnu.org; > > Niklas Cassel <Niklas.Cassel@wdc.com>; Klaus Jensen > > <k.jensen@samsung.com>; qemu-devel@nongnu.org; Alistair Francis > > <Alistair.Francis@wdc.com>; Keith Busch <kbusch@kernel.org>; Philippe > > Mathieu-Daudé <philmd@redhat.com>; Matias Bjorling > > <Matias.Bjorling@wdc.com> > > Subject: Re: [PATCH v3 01/15] hw/block/nvme: Define 64 bit cqe.result > > > > On Sep 15 18:56, Dmitry Fomichev wrote: > > > > -----Original Message----- > > > > From: Klaus Jensen <its@irrelevant.dk> > > > > Sent: Tuesday, September 15, 2020 3:37 AM > > > > To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com> > > > > Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen > > > > <k.jensen@samsung.com>; Kevin Wolf <kwolf@redhat.com>; Philippe > > > > Mathieu-Daudé <philmd@redhat.com>; Maxim Levitsky > > > > <mlevitsk@redhat.com>; Fam Zheng <fam@euphon.net>; Niklas Cassel > > > > <Niklas.Cassel@wdc.com>; Damien Le Moal > > <Damien.LeMoal@wdc.com>; > > > > qemu-block@nongnu.org; qemu-devel@nongnu.org; Alistair Francis > > > > <Alistair.Francis@wdc.com>; Matias Bjorling <Matias.Bjorling@wdc.com> > > > > Subject: Re: [PATCH v3 01/15] hw/block/nvme: Define 64 bit cqe.result > > > > > > > > On Sep 14 07:14, Dmitry Fomichev wrote: > > > > > From: Ajay Joshi <ajay.joshi@wdc.com> > > > > > > > > > > A new write command, Zone Append, is added as a part of Zoned > > > > > Namespace Command Set. Upon successful completion of this > > command, > > > > > the controller returns the start LBA of the performed write operation > > > > > in cqe.result field. Therefore, the maximum size of this variable > > > > > needs to be changed from 32 to 64 bit, consuming the reserved 32 bit > > > > > field that follows the result in CQE struct. Since the existing > > > > > commands are expected to return a 32 bit LE value, two separate > > > > > variables, result32 and result64, are now kept in a union. > > > > > > > > > > Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com> > > > > > Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> > > > > > Reviewed-by: Klaus Jensen <k.jensen@samsung.com> > > > > > > > > I know that I R-b'ed this, but can this be moved to the namespace types > > > > patch, since that is the TP that changes this. > > > > > > You probably meant the ZNS patch since result64 is first used there to > > return > > > ZA starting data LBA. Sure, I can move this stuff to that patch. > > > > > > > No, I actually did mean the NST patch since TP 4056 is the TP that > > "unreserves" dw1 in the CQE. > > It is not necessary to change it in NST patch since result64 field is not used > in that patch. But if you insist, I can move it to NST patch :) You are right of course, but since it is a change to the "spec" related data structures that go into include/block/nvme.h, I think it belongs in "hw/block/nvme: Introduce the Namespace Types definitions".
On Tue, Sep 15, 2020 at 10:48:49PM +0200, Klaus Jensen wrote: > On Sep 15 20:44, Dmitry Fomichev wrote: > > > > It is not necessary to change it in NST patch since result64 field is not used > > in that patch. But if you insist, I can move it to NST patch :) > > You are right of course, but since it is a change to the "spec" related > data structures that go into include/block/nvme.h, I think it belongs in > "hw/block/nvme: Introduce the Namespace Types definitions". Just my $.02, unless you're making exernal APIs, I really find it easier to review internal changes inline with the patches that use it. Another example, there are patches in this series that introduce trace points, but the patch that use them come later. I find that harder to review since I need to look at two different patches to understand its value. I realize this particular patch is implementing a spec feature, but I'd prefer to see how it's used over of making a round trip to the spec.
> -----Original Message----- > From: Keith Busch <kbusch@kernel.org> > Sent: Friday, September 18, 2020 5:10 PM > To: Klaus Jensen <its@irrelevant.dk> > Cc: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>; Fam Zheng > <fam@euphon.net>; Kevin Wolf <kwolf@redhat.com>; Damien Le Moal > <Damien.LeMoal@wdc.com>; qemu-block@nongnu.org; Niklas Cassel > <Niklas.Cassel@wdc.com>; Klaus Jensen <k.jensen@samsung.com>; qemu- > devel@nongnu.org; Alistair Francis <Alistair.Francis@wdc.com>; Philippe > Mathieu-Daudé <philmd@redhat.com>; Matias Bjorling > <Matias.Bjorling@wdc.com> > Subject: Re: [PATCH v3 01/15] hw/block/nvme: Define 64 bit cqe.result > > On Tue, Sep 15, 2020 at 10:48:49PM +0200, Klaus Jensen wrote: > > On Sep 15 20:44, Dmitry Fomichev wrote: > > > > > > It is not necessary to change it in NST patch since result64 field is not used > > > in that patch. But if you insist, I can move it to NST patch :) > > > > You are right of course, but since it is a change to the "spec" related > > data structures that go into include/block/nvme.h, I think it belongs in > > "hw/block/nvme: Introduce the Namespace Types definitions". > > Just my $.02, unless you're making exernal APIs, I really find it easier > to review internal changes inline with the patches that use it. > > Another example, there are patches in this series that introduce trace > points, but the patch that use them come later. I find that harder to > review since I need to look at two different patches to understand its > value. > I decided to split the trace events to be separate because I felt that it could make the reviewing process simpler. Since the majority of the patches in this series are on the large side, I thought it would be easier to open them in separate sessions rather than to review a large single diff. Let me know if you'd like me to fold the tracing stuff together with the code... DF > I realize this particular patch is implementing a spec feature, but I'd > prefer to see how it's used over of making a round trip to the spec.