Message ID | 20221205162035.2261037-1-alvaro.karsz@solid-run.com |
---|---|
State | New |
Headers | show |
Series | [v3] virtio_blk: add VIRTIO_BLK_F_LIFETIME feature support | expand |
On 12/5/22 10:24, Jens Axboe wrote: > On 12/5/22 9:20 AM, Alvaro Karsz wrote: >> Implement the VIRTIO_BLK_F_LIFETIME feature for VirtIO block devices. >> >> This commit introduces a new ioctl command, VBLK_LIFETIME. >> >> VBLK_LIFETIME ioctl asks for the block device to provide lifetime >> information by sending a VIRTIO_BLK_T_GET_LIFETIME command to the device. >> >> lifetime information fields: >> >> - pre_eol_info: specifies the percentage of reserved blocks that are >> consumed. >> optional values following virtio spec: >> *) 0 - undefined. >> *) 1 - normal, < 80% of reserved blocks are consumed. >> *) 2 - warning, 80% of reserved blocks are consumed. >> *) 3 - urgent, 90% of reserved blocks are consumed. >> >> - device_lifetime_est_typ_a: this field refers to wear of SLC cells and >> is provided in increments of 10used, and so >> on, thru to 11 meaning estimated lifetime >> exceeded. All values above 11 are reserved. >> >> - device_lifetime_est_typ_b: this field refers to wear of MLC cells and is >> provided with the same semantics as >> device_lifetime_est_typ_a. >> >> The data received from the device will be sent as is to the user. >> No data check/decode is done by virtblk. > > Is this based on some spec? Because it looks pretty odd to me. There > can be a pretty wide range of two/three/etc level cells with wildly > different ranges of durability. And there's really not a lot of slc > for generic devices these days, if any. > This is exactly what I said on initial version about new types ... -ck
On 12/5/22 12:35, Enrico Granata wrote: > The original definitions for these fields come from JESD84-B50, which > is what eMMC storage uses. It has been a while, but I recall UFS doing > something pretty similar. > Systems that don't have a well defined notion of durability would just > not expose the flag (e.g. a spinning disk), and going for what > eMMC/UFS expose already would make implementations fairly seamless for > a lot of common embedded scenarios. > Has it been considered there might be non-embeded deployments which virtio-blk frontend can also benefit from this feature and they can have more flash cell types than present in the approved spec ? > Of course, if you see room for improvement to the spec, I'd be very > interested in hearing your thoughts > > Thanks, > - Enrico > > Thanks, > - Enrico > From a quick look :- The struct virtio_blk_lifetime is too narrow for sure only supporting two types, it at least needs to support available flash cell types to start with and the members needs to be renamed to reflect the actual type for more clarity. hope this helps ... -ck
On Mon, Dec 05, 2022 at 06:20:34PM +0200, Alvaro Karsz wrote: I don't like that the ioctl lifetime struct is passed through little-endian from the device to userspace. The point of this new ioctl is not to be a passthrough interface. The kernel should define a proper UABI struct for the ioctl and handle endianness conversion. But I think Michael is happy with this approach, so nevermind. > @@ -219,4 +247,8 @@ struct virtio_scsi_inhdr { > #define VIRTIO_BLK_S_OK 0 > #define VIRTIO_BLK_S_IOERR 1 > #define VIRTIO_BLK_S_UNSUPP 2 > + > +/* Virtblk ioctl commands */ > +#define VBLK_GET_LIFETIME _IOR('r', 0, struct virtio_blk_lifetime) Does something include <linux/ioctl.h> for _IOR()? Failure to include the necessary header file could break userspace applications that include <linux/virtio_blk.h>. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On 12/5/22 08:20, Alvaro Karsz wrote: > +/* Get lifetime information struct for each request */ > +struct virtio_blk_lifetime { > + /* > + * specifies the percentage of reserved blocks that are consumed. > + * optional values following virtio spec: > + * 0 - undefined > + * 1 - normal, < 80% of reserved blocks are consumed > + * 2 - warning, 80% of reserved blocks are consumed > + * 3 - urgent, 90% of reserved blocks are consumed > + */ > + __le16 pre_eol_info; > + /* > + * this field refers to wear of SLC cells and is provided in increments of 10used, > + * and so on, thru to 11 meaning estimated lifetime exceeded. All values above 11 > + * are reserved > + */ > + __le16 device_lifetime_est_typ_a; > + /* > + * this field refers to wear of MLC cells and is provided with the same semantics as > + * device_lifetime_est_typ_a > + */ > + __le16 device_lifetime_est_typ_b; > +}; Why does the above data structure only refer to SLC and MLC but not to TLC or QLC? How will this data structure be extended without having to introduce a new ioctl? Thanks, Bart.
Hi Bart, > Why does the above data structure only refer to SLC and MLC but not to > TLC or QLC? This has been discussed before. The data structure follows the virtio spec (https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html) > How will this data structure be extended without having to introduce a > new ioctl? There are no more lifetime parameters defined in the virtio spec. Please note that this is a virtio block specific ioctl, not a block generic one.
Bart, On 12/6/22 10:43, Bart Van Assche wrote: > On 12/5/22 08:20, Alvaro Karsz wrote: >> +/* Get lifetime information struct for each request */ >> +struct virtio_blk_lifetime { >> + /* >> + * specifies the percentage of reserved blocks that are consumed. >> + * optional values following virtio spec: >> + * 0 - undefined >> + * 1 - normal, < 80% of reserved blocks are consumed >> + * 2 - warning, 80% of reserved blocks are consumed >> + * 3 - urgent, 90% of reserved blocks are consumed >> + */ >> + __le16 pre_eol_info; >> + /* >> + * this field refers to wear of SLC cells and is provided in >> increments of 10used, >> + * and so on, thru to 11 meaning estimated lifetime exceeded. All >> values above 11 >> + * are reserved >> + */ >> + __le16 device_lifetime_est_typ_a; >> + /* >> + * this field refers to wear of MLC cells and is provided with >> the same semantics as >> + * device_lifetime_est_typ_a >> + */ >> + __le16 device_lifetime_est_typ_b; >> +}; > > Why does the above data structure only refer to SLC and MLC but not to > TLC or QLC? > > How will this data structure be extended without having to introduce a > new ioctl? > > Thanks, > > Bart. > We already discussed that see :- https://lists.linuxfoundation.org/pipermail/virtualization/2022-November/063742.html -ck
This just seems like a horrible interface. And virtio-blk should be a simple passthrough and not grow tons of side-band crap like this. If you want to pass through random misc information use virtio-scsi or nvme with shadow doorbell buffers.
On Tue, Dec 06, 2022 at 11:35:31PM -0800, Christoph Hellwig wrote: > This just seems like a horrible interface. And virtio-blk should be > a simple passthrough and not grow tons of side-band crap like this. > > If you want to pass through random misc information use virtio-scsi > or nvme with shadow doorbell buffers. Christoph you acked the spec patch adding this to virtio blk: Still not a fan of the encoding, but at least it is properly documented now: Acked-by: Christoph Hellwig <hch@lst.de> Did you change your mind then? Or do you prefer a different encoding for the ioctl then? could you help sugesting what kind? Thanks,
On Wed, Dec 07, 2022 at 05:21:48AM -0500, Michael S. Tsirkin wrote: > Christoph you acked the spec patch adding this to virtio blk: > > Still not a fan of the encoding, but at least it is properly documented > now: > > Acked-by: Christoph Hellwig <hch@lst.de> > > Did you change your mind then? Or do you prefer a different encoding for > the ioctl then? could you help sugesting what kind? Well, it is good enough documented for a spec. I don't think it is a useful feature for Linux where virtio-blk is our minimum viable paravirtualized block driver.
On Wed, Dec 07, 2022 at 08:31:28AM -0800, Christoph Hellwig wrote: > On Wed, Dec 07, 2022 at 05:21:48AM -0500, Michael S. Tsirkin wrote: > > Christoph you acked the spec patch adding this to virtio blk: > > > > Still not a fan of the encoding, but at least it is properly documented > > now: > > > > Acked-by: Christoph Hellwig <hch@lst.de> > > > > Did you change your mind then? Or do you prefer a different encoding for > > the ioctl then? could you help sugesting what kind? > > Well, it is good enough documented for a spec. I don't think it is > a useful feature for Linux where virtio-blk is our minimum viable > paravirtualized block driver. No idea what this means, sorry. Now that's in the spec I expect (some) devices to implement it and if they do I see no reason not to expose the data to userspace. Alvaro could you pls explain the use-case? Christoph has doubts that it's useful. Do you have a device implementing this?
> Alvaro could you pls explain the use-case? Christoph has doubts that > it's useful. Do you have a device implementing this? Our HW exposes virtio devices, virtio block included. The block device backend can be a eMMC/uSD card. The HW can report health values (power, temp, current, voltage) and I thought that it would be nice to be able to report lifetime values as well.
Michael, On 12/7/22 12:28, Michael S. Tsirkin wrote: > On Wed, Dec 07, 2022 at 08:31:28AM -0800, Christoph Hellwig wrote: >> On Wed, Dec 07, 2022 at 05:21:48AM -0500, Michael S. Tsirkin wrote: >>> Christoph you acked the spec patch adding this to virtio blk: >>> >>> Still not a fan of the encoding, but at least it is properly documented >>> now: >>> >>> Acked-by: Christoph Hellwig <hch@lst.de> >>> >>> Did you change your mind then? Or do you prefer a different encoding for >>> the ioctl then? could you help sugesting what kind? >> >> Well, it is good enough documented for a spec. I don't think it is >> a useful feature for Linux where virtio-blk is our minimum viable >> paravirtualized block driver. > > No idea what this means, sorry. Now that's in the spec I expect (some) > devices to implement it and if they do I see no reason not to expose the > data to userspace. > Even if any device implements is it can always use passthru commands. See below for more info... > Alvaro could you pls explain the use-case? Christoph has doubts that > it's useful. Do you have a device implementing this? > From what I know, virtio-blk should be kept minimal and should not add any storage specific IOCTLs or features that will end up loosing its generic nature. The IOCTL we are trying to add is Flash storage specific which goes against the nature of generic storage and makes it non-generic. In case we approve this it will open the door for non-generic code/IOCTL in the virtio-blk and that needs to be avoided. For any storage specific features or IOCTL (flash/HDD) it should be using it's own frontend such as virtio-scsi or e.g. nvme and not virtio-blk. Hope this helps. -ck
On 12/11/22 01:49, Alvaro Karsz wrote: >> Alvaro could you pls explain the use-case? Christoph has doubts that >> it's useful. Do you have a device implementing this? > > Our HW exposes virtio devices, virtio block included. > The block device backend can be a eMMC/uSD card. > > The HW can report health values (power, temp, current, voltage) and I > thought that it would be nice to be able to report lifetime values as > well. Why not use block layer passthru interface to fetch this instead of adding non-generic IOCTL ? -ck
On Tue, Dec 13, 2022 at 04:58:47AM +0000, Chaitanya Kulkarni wrote: > Michael, > > On 12/7/22 12:28, Michael S. Tsirkin wrote: > > On Wed, Dec 07, 2022 at 08:31:28AM -0800, Christoph Hellwig wrote: > >> On Wed, Dec 07, 2022 at 05:21:48AM -0500, Michael S. Tsirkin wrote: > >>> Christoph you acked the spec patch adding this to virtio blk: > >>> > >>> Still not a fan of the encoding, but at least it is properly documented > >>> now: > >>> > >>> Acked-by: Christoph Hellwig <hch@lst.de> > >>> > >>> Did you change your mind then? Or do you prefer a different encoding for > >>> the ioctl then? could you help sugesting what kind? > >> > >> Well, it is good enough documented for a spec. I don't think it is > >> a useful feature for Linux where virtio-blk is our minimum viable > >> paravirtualized block driver. > > > > No idea what this means, sorry. Now that's in the spec I expect (some) > > devices to implement it and if they do I see no reason not to expose the > > data to userspace. > > > > Even if any device implements is it can always use passthru commands. > See below for more info... > > > Alvaro could you pls explain the use-case? Christoph has doubts that > > it's useful. Do you have a device implementing this? > > > > From what I know, virtio-blk should be kept minimal and should not > add any storage specific IOCTLs or features that will end up loosing > its generic nature. > > The IOCTL we are trying to add is Flash storage specific which > goes against the nature of generic storage and makes it non-generic. > In case we approve this it will open the door for non-generic > code/IOCTL in the virtio-blk and that needs to be avoided. Wrt these fields that horse has bolted, it's in the spec. > For any storage specific features or IOCTL (flash/HDD) it should > be using it's own frontend such as virtio-scsi or e.g. nvme and > not virtio-blk. > > Hope this helps. > > -ck I don't understand what you are suggesting, sorry. It's a hardware device. It can't just "switch to a different frontend".
On Mon, Dec 05, 2022 at 06:20:34PM +0200, Alvaro Karsz wrote: > Implement the VIRTIO_BLK_F_LIFETIME feature for VirtIO block devices. > > This commit introduces a new ioctl command, VBLK_LIFETIME. > > VBLK_LIFETIME ioctl asks for the block device to provide lifetime > information by sending a VIRTIO_BLK_T_GET_LIFETIME command to the device. > > lifetime information fields: > > - pre_eol_info: specifies the percentage of reserved blocks that are > consumed. > optional values following virtio spec: > *) 0 - undefined. > *) 1 - normal, < 80% of reserved blocks are consumed. > *) 2 - warning, 80% of reserved blocks are consumed. > *) 3 - urgent, 90% of reserved blocks are consumed. > > - device_lifetime_est_typ_a: this field refers to wear of SLC cells and > is provided in increments of 10used, and so > on, thru to 11 meaning estimated lifetime > exceeded. All values above 11 are reserved. > > - device_lifetime_est_typ_b: this field refers to wear of MLC cells and is > provided with the same semantics as > device_lifetime_est_typ_a. > > The data received from the device will be sent as is to the user. > No data check/decode is done by virtblk. > > Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com> > -- > v2: > - Remove (void *) casting. > - Fix comments format in virtio_blk.h. > - Set ioprio value for legacy devices for REQ_OP_DRV_IN > operations. > > v3: > - Initialize struct virtio_blk_lifetime to 0 instead of memset. > - Rename ioctl from VBLK_LIFETIME to VBLK_GET_LIFETIME. > - Return EOPNOTSUPP insted of ENOTTY if ioctl is not supported. > - Check if process is CAP_SYS_ADMIN capable in lifetime ioctl. > - Check if vdev is not NULL before accessing it in lifetime ioctl. > -- > --- > drivers/block/virtio_blk.c | 106 ++++++++++++++++++++++++++++++-- > include/uapi/linux/virtio_blk.h | 32 ++++++++++ > 2 files changed, 133 insertions(+), 5 deletions(-) > > diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c > index 19da5defd73..392d57fd6b7 100644 > --- a/drivers/block/virtio_blk.c > +++ b/drivers/block/virtio_blk.c > @@ -101,6 +101,18 @@ static inline blk_status_t virtblk_result(struct virtblk_req *vbr) > } > } > > +static inline int virtblk_ioctl_result(struct virtblk_req *vbr) > +{ > + switch (vbr->status) { > + case VIRTIO_BLK_S_OK: > + return 0; > + case VIRTIO_BLK_S_UNSUPP: > + return -EOPNOTSUPP; > + default: > + return -EIO; > + } > +} > + > static inline struct virtio_blk_vq *get_virtio_blk_vq(struct blk_mq_hw_ctx *hctx) > { > struct virtio_blk *vblk = hctx->queue->queuedata; > @@ -218,6 +230,7 @@ static blk_status_t virtblk_setup_cmd(struct virtio_device *vdev, > u32 type; > > vbr->out_hdr.sector = 0; > + vbr->out_hdr.ioprio = cpu_to_virtio32(vdev, req_get_ioprio(req)); > > switch (req_op(req)) { > case REQ_OP_READ: > @@ -244,15 +257,14 @@ static blk_status_t virtblk_setup_cmd(struct virtio_device *vdev, > type = VIRTIO_BLK_T_SECURE_ERASE; > break; > case REQ_OP_DRV_IN: > - type = VIRTIO_BLK_T_GET_ID; > - break; > + /* type is set in virtblk_get_id/virtblk_ioctl_lifetime */ > + return 0; > default: > WARN_ON_ONCE(1); > return BLK_STS_IOERR; > } > > vbr->out_hdr.type = cpu_to_virtio32(vdev, type); > - vbr->out_hdr.ioprio = cpu_to_virtio32(vdev, req_get_ioprio(req)); > > if (type == VIRTIO_BLK_T_DISCARD || type == VIRTIO_BLK_T_WRITE_ZEROES || > type == VIRTIO_BLK_T_SECURE_ERASE) { > @@ -459,12 +471,16 @@ static int virtblk_get_id(struct gendisk *disk, char *id_str) > struct virtio_blk *vblk = disk->private_data; > struct request_queue *q = vblk->disk->queue; > struct request *req; > + struct virtblk_req *vbr; > int err; > > req = blk_mq_alloc_request(q, REQ_OP_DRV_IN, 0); > if (IS_ERR(req)) > return PTR_ERR(req); > > + vbr = blk_mq_rq_to_pdu(req); > + vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_GET_ID); > + > err = blk_rq_map_kern(q, req, id_str, VIRTIO_BLK_ID_BYTES, GFP_KERNEL); > if (err) > goto out; > @@ -508,6 +524,85 @@ static int virtblk_getgeo(struct block_device *bd, struct hd_geometry *geo) > return ret; > } > > +/* Get lifetime information from device */ > +static int virtblk_ioctl_lifetime(struct virtio_blk *vblk, unsigned long arg) > +{ > + struct request_queue *q = vblk->disk->queue; > + struct request *req = NULL; > + struct virtblk_req *vbr; > + struct virtio_blk_lifetime lifetime = {}; > + int ret; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + /* The virtio_blk_lifetime struct fields follow virtio spec. > + * There is no check/decode on values received from the device. > + * The data is sent as is to the user. > + */ > + > + /* This ioctl is allowed only if VIRTIO_BLK_F_LIFETIME > + * feature is negotiated. > + */ > + if (!virtio_has_feature(vblk->vdev, VIRTIO_BLK_F_LIFETIME)) > + return -EOPNOTSUPP; > + > + req = blk_mq_alloc_request(q, REQ_OP_DRV_IN, 0); > + if (IS_ERR(req)) > + return PTR_ERR(req); > + > + /* Write the correct type */ > + vbr = blk_mq_rq_to_pdu(req); > + vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_GET_LIFETIME); > + > + ret = blk_rq_map_kern(q, req, &lifetime, sizeof(lifetime), GFP_KERNEL); > + if (ret) > + goto out; > + > + blk_execute_rq(req, false); > + > + ret = virtblk_ioctl_result(blk_mq_rq_to_pdu(req)); > + if (ret) > + goto out; > + > + /* Pass the data to the user */ > + if (copy_to_user((void __user *)arg, &lifetime, sizeof(lifetime))) { > + ret = -EFAULT; > + goto out; > + } > + > +out: > + blk_mq_free_request(req); > + return ret; > +} > + > +static int virtblk_ioctl(struct block_device *bd, fmode_t mode, > + unsigned int cmd, unsigned long arg) > +{ > + struct virtio_blk *vblk = bd->bd_disk->private_data; > + int ret; > + > + mutex_lock(&vblk->vdev_mutex); > + > + if (!vblk->vdev) { > + ret = -ENXIO; > + goto exit; > + } > + > + switch (cmd) { > + case VBLK_GET_LIFETIME: > + ret = virtblk_ioctl_lifetime(vblk, arg); > + break; > + default: > + ret = -EOPNOTSUPP; > + break; > + } > + > +exit: > + mutex_unlock(&vblk->vdev_mutex); > + return ret; > +} > + > static void virtblk_free_disk(struct gendisk *disk) > { > struct virtio_blk *vblk = disk->private_data; > @@ -520,6 +615,7 @@ static void virtblk_free_disk(struct gendisk *disk) > static const struct block_device_operations virtblk_fops = { > .owner = THIS_MODULE, > .getgeo = virtblk_getgeo, > + .ioctl = virtblk_ioctl, > .free_disk = virtblk_free_disk, > }; > > @@ -1239,7 +1335,7 @@ static unsigned int features_legacy[] = { > VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, > VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE, > VIRTIO_BLK_F_MQ, VIRTIO_BLK_F_DISCARD, VIRTIO_BLK_F_WRITE_ZEROES, > - VIRTIO_BLK_F_SECURE_ERASE, > + VIRTIO_BLK_F_SECURE_ERASE, VIRTIO_BLK_F_LIFETIME, > } > ; > static unsigned int features[] = { > @@ -1247,7 +1343,7 @@ static unsigned int features[] = { > VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, > VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE, > VIRTIO_BLK_F_MQ, VIRTIO_BLK_F_DISCARD, VIRTIO_BLK_F_WRITE_ZEROES, > - VIRTIO_BLK_F_SECURE_ERASE, > + VIRTIO_BLK_F_SECURE_ERASE, VIRTIO_BLK_F_LIFETIME, > }; > > static struct virtio_driver virtio_blk = { > diff --git a/include/uapi/linux/virtio_blk.h b/include/uapi/linux/virtio_blk.h > index 58e70b24b50..e755d62d2ea 100644 > --- a/include/uapi/linux/virtio_blk.h > +++ b/include/uapi/linux/virtio_blk.h > @@ -40,6 +40,7 @@ > #define VIRTIO_BLK_F_MQ 12 /* support more than one vq */ > #define VIRTIO_BLK_F_DISCARD 13 /* DISCARD is supported */ > #define VIRTIO_BLK_F_WRITE_ZEROES 14 /* WRITE ZEROES is supported */ > +#define VIRTIO_BLK_F_LIFETIME 15 /* Storage lifetime information is supported */ > #define VIRTIO_BLK_F_SECURE_ERASE 16 /* Secure Erase is supported */ > > /* Legacy feature bits */ > @@ -165,6 +166,9 @@ struct virtio_blk_config { > /* Get device ID command */ > #define VIRTIO_BLK_T_GET_ID 8 > > +/* Get lifetime information command */ > +#define VIRTIO_BLK_T_GET_LIFETIME 10 > + > /* Discard command */ > #define VIRTIO_BLK_T_DISCARD 11 > > @@ -206,6 +210,30 @@ struct virtio_blk_discard_write_zeroes { > __le32 flags; > }; > > +/* Get lifetime information struct for each request */ > +struct virtio_blk_lifetime { > + /* > + * specifies the percentage of reserved blocks that are consumed. > + * optional values following virtio spec: > + * 0 - undefined > + * 1 - normal, < 80% of reserved blocks are consumed > + * 2 - warning, 80% of reserved blocks are consumed > + * 3 - urgent, 90% of reserved blocks are consumed > + */ > + __le16 pre_eol_info; > + /* > + * this field refers to wear of SLC cells and is provided in increments of 10used, > + * and so on, thru to 11 meaning estimated lifetime exceeded. All values above 11 > + * are reserved > + */ > + __le16 device_lifetime_est_typ_a; > + /* > + * this field refers to wear of MLC cells and is provided with the same semantics as > + * device_lifetime_est_typ_a > + */ > + __le16 device_lifetime_est_typ_b; > +}; > + > #ifndef VIRTIO_BLK_NO_LEGACY > struct virtio_scsi_inhdr { > __virtio32 errors; > @@ -219,4 +247,8 @@ struct virtio_scsi_inhdr { > #define VIRTIO_BLK_S_OK 0 > #define VIRTIO_BLK_S_IOERR 1 > #define VIRTIO_BLK_S_UNSUPP 2 > + > +/* Virtblk ioctl commands */ > +#define VBLK_GET_LIFETIME _IOR('r', 0, struct virtio_blk_lifetime) > + > #endif /* _LINUX_VIRTIO_BLK_H */ So this makes the header not self-contained: you need to pull in ioctl.h. And that's a bit annoying to people who are used to just have spec defines in this header. Maybe add a new one virtio_blk_ioctl.h ? And I'd still like to change VBLK_ prefix here to something like VIRTIO_BLK_IOCTL_ And maybe document that this is just for specific type of backend device, and mention the types which benefit, in code comment if not in the ioctl name. Thanks! > -- > 2.32.0
Hi Michael, Sorry, I had no time to create a new version. I'll do it today. Alvaro
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 19da5defd73..392d57fd6b7 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -101,6 +101,18 @@ static inline blk_status_t virtblk_result(struct virtblk_req *vbr) } } +static inline int virtblk_ioctl_result(struct virtblk_req *vbr) +{ + switch (vbr->status) { + case VIRTIO_BLK_S_OK: + return 0; + case VIRTIO_BLK_S_UNSUPP: + return -EOPNOTSUPP; + default: + return -EIO; + } +} + static inline struct virtio_blk_vq *get_virtio_blk_vq(struct blk_mq_hw_ctx *hctx) { struct virtio_blk *vblk = hctx->queue->queuedata; @@ -218,6 +230,7 @@ static blk_status_t virtblk_setup_cmd(struct virtio_device *vdev, u32 type; vbr->out_hdr.sector = 0; + vbr->out_hdr.ioprio = cpu_to_virtio32(vdev, req_get_ioprio(req)); switch (req_op(req)) { case REQ_OP_READ: @@ -244,15 +257,14 @@ static blk_status_t virtblk_setup_cmd(struct virtio_device *vdev, type = VIRTIO_BLK_T_SECURE_ERASE; break; case REQ_OP_DRV_IN: - type = VIRTIO_BLK_T_GET_ID; - break; + /* type is set in virtblk_get_id/virtblk_ioctl_lifetime */ + return 0; default: WARN_ON_ONCE(1); return BLK_STS_IOERR; } vbr->out_hdr.type = cpu_to_virtio32(vdev, type); - vbr->out_hdr.ioprio = cpu_to_virtio32(vdev, req_get_ioprio(req)); if (type == VIRTIO_BLK_T_DISCARD || type == VIRTIO_BLK_T_WRITE_ZEROES || type == VIRTIO_BLK_T_SECURE_ERASE) { @@ -459,12 +471,16 @@ static int virtblk_get_id(struct gendisk *disk, char *id_str) struct virtio_blk *vblk = disk->private_data; struct request_queue *q = vblk->disk->queue; struct request *req; + struct virtblk_req *vbr; int err; req = blk_mq_alloc_request(q, REQ_OP_DRV_IN, 0); if (IS_ERR(req)) return PTR_ERR(req); + vbr = blk_mq_rq_to_pdu(req); + vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_GET_ID); + err = blk_rq_map_kern(q, req, id_str, VIRTIO_BLK_ID_BYTES, GFP_KERNEL); if (err) goto out; @@ -508,6 +524,85 @@ static int virtblk_getgeo(struct block_device *bd, struct hd_geometry *geo) return ret; } +/* Get lifetime information from device */ +static int virtblk_ioctl_lifetime(struct virtio_blk *vblk, unsigned long arg) +{ + struct request_queue *q = vblk->disk->queue; + struct request *req = NULL; + struct virtblk_req *vbr; + struct virtio_blk_lifetime lifetime = {}; + int ret; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + /* The virtio_blk_lifetime struct fields follow virtio spec. + * There is no check/decode on values received from the device. + * The data is sent as is to the user. + */ + + /* This ioctl is allowed only if VIRTIO_BLK_F_LIFETIME + * feature is negotiated. + */ + if (!virtio_has_feature(vblk->vdev, VIRTIO_BLK_F_LIFETIME)) + return -EOPNOTSUPP; + + req = blk_mq_alloc_request(q, REQ_OP_DRV_IN, 0); + if (IS_ERR(req)) + return PTR_ERR(req); + + /* Write the correct type */ + vbr = blk_mq_rq_to_pdu(req); + vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_GET_LIFETIME); + + ret = blk_rq_map_kern(q, req, &lifetime, sizeof(lifetime), GFP_KERNEL); + if (ret) + goto out; + + blk_execute_rq(req, false); + + ret = virtblk_ioctl_result(blk_mq_rq_to_pdu(req)); + if (ret) + goto out; + + /* Pass the data to the user */ + if (copy_to_user((void __user *)arg, &lifetime, sizeof(lifetime))) { + ret = -EFAULT; + goto out; + } + +out: + blk_mq_free_request(req); + return ret; +} + +static int virtblk_ioctl(struct block_device *bd, fmode_t mode, + unsigned int cmd, unsigned long arg) +{ + struct virtio_blk *vblk = bd->bd_disk->private_data; + int ret; + + mutex_lock(&vblk->vdev_mutex); + + if (!vblk->vdev) { + ret = -ENXIO; + goto exit; + } + + switch (cmd) { + case VBLK_GET_LIFETIME: + ret = virtblk_ioctl_lifetime(vblk, arg); + break; + default: + ret = -EOPNOTSUPP; + break; + } + +exit: + mutex_unlock(&vblk->vdev_mutex); + return ret; +} + static void virtblk_free_disk(struct gendisk *disk) { struct virtio_blk *vblk = disk->private_data; @@ -520,6 +615,7 @@ static void virtblk_free_disk(struct gendisk *disk) static const struct block_device_operations virtblk_fops = { .owner = THIS_MODULE, .getgeo = virtblk_getgeo, + .ioctl = virtblk_ioctl, .free_disk = virtblk_free_disk, }; @@ -1239,7 +1335,7 @@ static unsigned int features_legacy[] = { VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE, VIRTIO_BLK_F_MQ, VIRTIO_BLK_F_DISCARD, VIRTIO_BLK_F_WRITE_ZEROES, - VIRTIO_BLK_F_SECURE_ERASE, + VIRTIO_BLK_F_SECURE_ERASE, VIRTIO_BLK_F_LIFETIME, } ; static unsigned int features[] = { @@ -1247,7 +1343,7 @@ static unsigned int features[] = { VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE, VIRTIO_BLK_F_MQ, VIRTIO_BLK_F_DISCARD, VIRTIO_BLK_F_WRITE_ZEROES, - VIRTIO_BLK_F_SECURE_ERASE, + VIRTIO_BLK_F_SECURE_ERASE, VIRTIO_BLK_F_LIFETIME, }; static struct virtio_driver virtio_blk = { diff --git a/include/uapi/linux/virtio_blk.h b/include/uapi/linux/virtio_blk.h index 58e70b24b50..e755d62d2ea 100644 --- a/include/uapi/linux/virtio_blk.h +++ b/include/uapi/linux/virtio_blk.h @@ -40,6 +40,7 @@ #define VIRTIO_BLK_F_MQ 12 /* support more than one vq */ #define VIRTIO_BLK_F_DISCARD 13 /* DISCARD is supported */ #define VIRTIO_BLK_F_WRITE_ZEROES 14 /* WRITE ZEROES is supported */ +#define VIRTIO_BLK_F_LIFETIME 15 /* Storage lifetime information is supported */ #define VIRTIO_BLK_F_SECURE_ERASE 16 /* Secure Erase is supported */ /* Legacy feature bits */ @@ -165,6 +166,9 @@ struct virtio_blk_config { /* Get device ID command */ #define VIRTIO_BLK_T_GET_ID 8 +/* Get lifetime information command */ +#define VIRTIO_BLK_T_GET_LIFETIME 10 + /* Discard command */ #define VIRTIO_BLK_T_DISCARD 11 @@ -206,6 +210,30 @@ struct virtio_blk_discard_write_zeroes { __le32 flags; }; +/* Get lifetime information struct for each request */ +struct virtio_blk_lifetime { + /* + * specifies the percentage of reserved blocks that are consumed. + * optional values following virtio spec: + * 0 - undefined + * 1 - normal, < 80% of reserved blocks are consumed + * 2 - warning, 80% of reserved blocks are consumed + * 3 - urgent, 90% of reserved blocks are consumed + */ + __le16 pre_eol_info; + /* + * this field refers to wear of SLC cells and is provided in increments of 10used, + * and so on, thru to 11 meaning estimated lifetime exceeded. All values above 11 + * are reserved + */ + __le16 device_lifetime_est_typ_a; + /* + * this field refers to wear of MLC cells and is provided with the same semantics as + * device_lifetime_est_typ_a + */ + __le16 device_lifetime_est_typ_b; +}; + #ifndef VIRTIO_BLK_NO_LEGACY struct virtio_scsi_inhdr { __virtio32 errors; @@ -219,4 +247,8 @@ struct virtio_scsi_inhdr { #define VIRTIO_BLK_S_OK 0 #define VIRTIO_BLK_S_IOERR 1 #define VIRTIO_BLK_S_UNSUPP 2 + +/* Virtblk ioctl commands */ +#define VBLK_GET_LIFETIME _IOR('r', 0, struct virtio_blk_lifetime) + #endif /* _LINUX_VIRTIO_BLK_H */
Implement the VIRTIO_BLK_F_LIFETIME feature for VirtIO block devices. This commit introduces a new ioctl command, VBLK_LIFETIME. VBLK_LIFETIME ioctl asks for the block device to provide lifetime information by sending a VIRTIO_BLK_T_GET_LIFETIME command to the device. lifetime information fields: - pre_eol_info: specifies the percentage of reserved blocks that are consumed. optional values following virtio spec: *) 0 - undefined. *) 1 - normal, < 80% of reserved blocks are consumed. *) 2 - warning, 80% of reserved blocks are consumed. *) 3 - urgent, 90% of reserved blocks are consumed. - device_lifetime_est_typ_a: this field refers to wear of SLC cells and is provided in increments of 10used, and so on, thru to 11 meaning estimated lifetime exceeded. All values above 11 are reserved. - device_lifetime_est_typ_b: this field refers to wear of MLC cells and is provided with the same semantics as device_lifetime_est_typ_a. The data received from the device will be sent as is to the user. No data check/decode is done by virtblk. Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com> -- v2: - Remove (void *) casting. - Fix comments format in virtio_blk.h. - Set ioprio value for legacy devices for REQ_OP_DRV_IN operations. v3: - Initialize struct virtio_blk_lifetime to 0 instead of memset. - Rename ioctl from VBLK_LIFETIME to VBLK_GET_LIFETIME. - Return EOPNOTSUPP insted of ENOTTY if ioctl is not supported. - Check if process is CAP_SYS_ADMIN capable in lifetime ioctl. - Check if vdev is not NULL before accessing it in lifetime ioctl. -- --- drivers/block/virtio_blk.c | 106 ++++++++++++++++++++++++++++++-- include/uapi/linux/virtio_blk.h | 32 ++++++++++ 2 files changed, 133 insertions(+), 5 deletions(-)