Message ID | 20201129181926.897775-2-hch@lst.de |
---|---|
State | Superseded |
Series | [1/4] block: add a hard-readonly flag to struct gendisk |
On 11/29/20 12:19 PM, Christoph Hellwig wrote:
> Commit 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading
> partition") addressed a long-standing problem with user read-only

Little nit I noticed below.

					-Alex

. . .

> diff --git a/block/genhd.c b/block/genhd.c
> index 565cf36a5f1864..5e746223b6fa0f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -1625,31 +1625,35 @@ static void set_disk_ro_uevent(struct gendisk *gd, int ro)
>  	kobject_uevent_env(&disk_to_dev(gd)->kobj, KOBJ_CHANGE, envp);
>  }
>
> -void set_disk_ro(struct gendisk *disk, int flag)
> +/**
> + * set_disk_ro - set a gendisk read-only
> + * @disk:	The disk device
> + * @state:	true or false

s/state/read_only/

> + *
> + * This function is used to indicate whether a given disk device should have its
> + * read-only flag set. set_disk_ro() is typically used by device drivers to
> + * indicate whether the underlying physical device is write-protected.
> + */
> +void set_disk_ro(struct gendisk *disk, bool read_only)

. . .

On 11/29/20 7:19 PM, Christoph Hellwig wrote:
> Commit 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading
> partition") addressed a long-standing problem with user read-only
> policy being overridden as a result of a device-initiated revalidate.
> The commit has since been reverted due to a regression that left some
> USB devices read-only indefinitely.
>
> To fix the underlying problems with revalidate we need to keep track
> of hardware state and user policy separately.
>
> The gendisk has been updated to reflect the current hardware state set
> by the device driver. This is done to allow returning the device to
> the hardware state once the user clears the BLKROSET flag.
>
> The resulting semantics are as follows:
>
>  - If BLKROSET is used to set a whole-disk device read-only, any
>    partitions will end up in a read-only state until the user
>    explicitly clears the flag.
>
>  - If BLKROSET sets a given partition read-only, that partition will
>    remain read-only even if the underlying storage stack initiates a
>    revalidate. However, the BLKRRPART ioctl will cause the partition
>    table to be dropped and any user policy on partitions will be lost.
>
>  - If BLKROSET has not been set, both the whole disk device and any
>    partitions will reflect the current write-protect state of the
>    underlying device.
>
> Based on a patch from Martin K. Petersen <martin.petersen@oracle.com>.
>
> Reported-by: Oleksii Kurochko <olkuroch@cisco.com>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201221
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-core.c        |  2 +-
>  block/genhd.c           | 34 +++++++++++++++++++---------------
>  block/partitions/core.c |  3 +--
>  include/linux/genhd.h   |  6 ++++--
>  4 files changed, 25 insertions(+), 20 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

Hi Christoph!

>  - If BLKROSET is used to set a whole-disk device read-only, any
>    partitions will end up in a read-only state until the user
>    explicitly clears the flag.

This no longer appears to be the case with your tweak.

It's very common for database folks to twiddle the read-only state of
block devices and partitions. I know that our users will find it very
counter-intuitive that setting /dev/sda read-only won't prevent writes
to /dev/sda1.

>  int bdev_read_only(struct block_device *bdev)
>  {
>  	if (!bdev)
>  		return 0;
> -	return bdev->bd_read_only;
> +	return bdev->bd_read_only ||
> +		test_bit(GD_READ_ONLY, &bdev->bd_disk->state);
>  }

I suggest doing bd->bd_read_only || get_disk_ro(...) here. That does
take part0 into account.

>  static inline int get_disk_ro(struct gendisk *disk)
>  {
> -	return disk->part0->bd_read_only;
> +	return disk->part0->bd_read_only ||
> +		test_bit(GD_READ_ONLY, &disk->state);
>  }
>
>  extern void disk_block_events(struct gendisk *disk);

-- 
Martin K. Petersen	Oracle Linux Engineering

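To make the difference between the two checks concrete, here is a minimal user-space sketch in C. It is not kernel code: the `model_bdev` and `model_disk` structures and both function names are hypothetical stand-ins, and the `hw_read_only` field only approximates the `GD_READ_ONLY` bit in `disk->state`. Under those assumptions it compares the check as posted with the check suggested above.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel structures; only the
 * fields discussed in this thread are modelled. */
struct model_disk;

struct model_bdev {
	bool bd_read_only;          /* user policy set through BLKROSET */
	struct model_disk *bd_disk; /* disk this block device belongs to */
};

struct model_disk {
	bool hw_read_only;          /* stands in for the GD_READ_ONLY bit */
	struct model_bdev *part0;   /* whole-disk block device */
};

/* Check as posted in the patch: own policy or hardware state. */
static bool ro_as_posted(const struct model_bdev *bdev)
{
	return bdev->bd_read_only || bdev->bd_disk->hw_read_only;
}

/* Check as suggested in the review: also honour the whole-disk policy,
 * which is what get_disk_ro() folds in through part0. */
static bool ro_as_suggested(const struct model_bdev *bdev)
{
	const struct model_disk *disk = bdev->bd_disk;

	return bdev->bd_read_only ||
		disk->part0->bd_read_only || disk->hw_read_only;
}
```

With the whole-disk policy set and nothing else, `ro_as_posted()` reports a partition as writable while `ro_as_suggested()` reports it as read-only, which is the /dev/sda versus /dev/sda1 case raised above.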
On Wed, Dec 02, 2020 at 11:04:33PM -0500, Martin K. Petersen wrote:
>
> Hi Christoph!
>
> >  - If BLKROSET is used to set a whole-disk device read-only, any
> >    partitions will end up in a read-only state until the user
> >    explicitly clears the flag.
>
> This no longer appears to be the case with your tweak.

True.

>
> It's very common for database folks to twiddle the read-only state of
> block devices and partitions. I know that our users will find it very
> counter-intuitive that setting /dev/sda read-only won't prevent writes
> to /dev/sda1.

What I'm worried about is that this would be a huge change from the
historic behavior.

Christoph,

>> It's very common for database folks to twiddle the read-only state of
>> block devices and partitions. I know that our users will find it very
>> counter-intuitive that setting /dev/sda read-only won't prevent
>> writes to /dev/sda1.
>
> What I'm worried about is that this would be a huge change from the
> historic behavior.

But that's what my users complained about and what the patch tried to
address.

Also, the existing behavior is inconsistent in the sense that doing:

  # blockdev --setro /dev/sda
  # echo foo > /dev/sda1

permits writes. But:

  # blockdev --setro /dev/sda
  <something triggers revalidate>
  # echo foo > /dev/sda1

doesn't.

And a subsequent:

  # blockdev --setrw /dev/sda
  # echo foo > /dev/sda1

doesn't work either since sda1's read-only policy has been inherited
from the whole-disk device.

You need to do:

  # blockdev --rereadpt

after setting the whole-disk device rw to effectuate the same change on
the partitions, otherwise they are stuck being read-only indefinitely.

However, setting the read-only policy on a partition does *not* require
the revalidate step. As a matter of fact, doing the revalidate will blow
away the policy setting you just made.

So the user needs to take different actions depending on whether they
are trying to write-protect a whole-disk device or a partition, despite
using the same ioctl. That is really confusing.

The intent of my patch was to ensure that:

 - Hardware-initiated read-only state changes would not alter the
   user's whole-disk or partition policy settings.

 - Setting a policy on the whole-disk device would prevent writes to
   the whole device as the user clearly intended.

I have lost count of how many times our customers have had data
clobbered because of the ambiguity of the existing whole-disk device
policy. The current behavior violates the principle of least surprise
by letting the user think they write-protected the whole disk when they
actually didn't.

-- 
Martin K. Petersen	Oracle Linux Engineering

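For readers who want to reproduce the `blockdev --setro` and `blockdev --getro` steps programmatically, the following is a minimal sketch of the underlying ioctls in C. `BLKROSET` and `BLKROGET` come from `<linux/fs.h>` and take a pointer to an `int`; the helper names `blk_set_ro()` and `blk_get_ro()` are hypothetical and not part of any library. Setting the flag needs CAP_SYS_ADMIN, and the sketch only sets or reads the policy on the node it is given; it makes no claim about how that policy propagates to partitions, which is the subject of this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>	/* BLKROSET, BLKROGET */

/* Hypothetical helper: set (ro = 1) or clear (ro = 0) the read-only policy
 * on a block device node such as /dev/sda or /dev/sda1.
 * Returns 0 on success, -1 on error with errno set. */
static int blk_set_ro(const char *path, int ro)
{
	int fd = open(path, O_RDONLY);
	int ret, saved;

	if (fd < 0)
		return -1;
	ret = ioctl(fd, BLKROSET, &ro);
	saved = errno;		/* close() may overwrite errno */
	close(fd);
	errno = saved;
	return ret < 0 ? -1 : 0;
}

/* Hypothetical helper: query the read-only state of a block device node.
 * Returns 1 if read-only, 0 if writable, -1 on error with errno set. */
static int blk_get_ro(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ro = 0, ret, saved;

	if (fd < 0)
		return -1;
	ret = ioctl(fd, BLKROGET, &ro);
	saved = errno;
	close(fd);
	errno = saved;
	return ret < 0 ? -1 : (ro != 0);
}
```

Calling `blk_set_ro("/dev/sda", 1)` followed by `blk_get_ro("/dev/sda1")` on a kernel of interest is one way to observe which of the behaviours described above is in effect.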
diff --git a/block/blk-core.c b/block/blk-core.c
index cee568389b7e11..0763d1eb85ce15 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -695,7 +695,7 @@ static inline bool bio_check_ro(struct bio *bio, struct block_device *part)
 {
 	const int op = bio_op(bio);
 
-	if (part->bd_read_only && op_is_write(op)) {
+	if (op_is_write(op) && bdev_read_only(part)) {
 		char b[BDEVNAME_SIZE];
 
 		if (op_is_flush(bio->bi_opf) && !bio_sectors(bio))
diff --git a/block/genhd.c b/block/genhd.c
index 565cf36a5f1864..5e746223b6fa0f 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1625,31 +1625,35 @@ static void set_disk_ro_uevent(struct gendisk *gd, int ro)
 	kobject_uevent_env(&disk_to_dev(gd)->kobj, KOBJ_CHANGE, envp);
 }
 
-void set_disk_ro(struct gendisk *disk, int flag)
+/**
+ * set_disk_ro - set a gendisk read-only
+ * @disk:	The disk device
+ * @state:	true or false
+ *
+ * This function is used to indicate whether a given disk device should have its
+ * read-only flag set. set_disk_ro() is typically used by device drivers to
+ * indicate whether the underlying physical device is write-protected.
+ */
+void set_disk_ro(struct gendisk *disk, bool read_only)
 {
-	struct disk_part_iter piter;
-	struct block_device *part;
-
-	if (disk->part0->bd_read_only != flag) {
-		set_disk_ro_uevent(disk, flag);
-		disk->part0->bd_read_only = flag;
+	if (read_only) {
+		if (test_and_set_bit(GD_READ_ONLY, &disk->state))
+			return;
+	} else {
+		if (!test_and_clear_bit(GD_READ_ONLY, &disk->state))
+			return;
 	}
-
-	disk_part_iter_init(&piter, disk, DISK_PITER_INCL_EMPTY);
-	while ((part = disk_part_iter_next(&piter)))
-		part->bd_read_only = flag;
-	disk_part_iter_exit(&piter);
+	set_disk_ro_uevent(disk, read_only);
 }
-
 EXPORT_SYMBOL(set_disk_ro);
 
 int bdev_read_only(struct block_device *bdev)
 {
 	if (!bdev)
 		return 0;
-	return bdev->bd_read_only;
+	return bdev->bd_read_only ||
+		test_bit(GD_READ_ONLY, &bdev->bd_disk->state);
 }
-
 EXPORT_SYMBOL(bdev_read_only);
 
 /*
diff --git a/block/partitions/core.c b/block/partitions/core.c
index deca253583bd3f..5a9633183343c0 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -194,7 +194,7 @@ static ssize_t part_start_show(struct device *dev,
 static ssize_t part_ro_show(struct device *dev,
 			    struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", dev_to_bdev(dev)->bd_read_only);
+	return sprintf(buf, "%d\n", bdev_read_only(dev_to_bdev(dev)));
 }
 
 static ssize_t part_alignment_offset_show(struct device *dev,
@@ -360,7 +360,6 @@ static struct block_device *add_partition(struct gendisk *disk, int partno,
 
 	bdev->bd_start_sect = start;
 	bdev_set_nr_sectors(bdev, len);
-	bdev->bd_read_only = get_disk_ro(disk);
 
 	if (info) {
 		err = -ENOMEM;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 809aaa32d53cba..a62ccbfac54b48 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -163,6 +163,7 @@ struct gendisk {
 	int flags;
 	unsigned long state;
 #define GD_NEED_PART_SCAN		0
+#define GD_READ_ONLY			1
 	struct kobject *slave_dir;
 
 	struct timer_rand_state *random;
@@ -249,11 +250,12 @@ static inline void add_disk_no_queue_reg(struct gendisk *disk)
 extern void del_gendisk(struct gendisk *gp);
 extern struct block_device *bdget_disk(struct gendisk *disk, int partno);
 
-extern void set_disk_ro(struct gendisk *disk, int flag);
+void set_disk_ro(struct gendisk *disk, bool read_only);
 
 static inline int get_disk_ro(struct gendisk *disk)
 {
-	return disk->part0->bd_read_only;
+	return disk->part0->bd_read_only ||
+		test_bit(GD_READ_ONLY, &disk->state);
 }
 
 extern void disk_block_events(struct gendisk *disk);

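One property of the reworked set_disk_ro() in the diff above is that the uevent is only sent when the hardware read-only bit actually changes. The following user-space sketch in C illustrates that edge-triggered pattern. It is an approximation under stated assumptions: `model_disk_state`, `model_test_and_set()`, `model_test_and_clear()`, and `model_set_disk_ro()` are hypothetical names, the bit helpers are plain non-atomic code unlike the kernel's test_and_set_bit() and test_and_clear_bit(), and a counter stands in for set_disk_ro_uevent().

```c
#include <stdbool.h>

#define MODEL_GD_READ_ONLY 1	/* bit number, mirroring GD_READ_ONLY */

/* Hypothetical stand-in for the parts of struct gendisk used here. */
struct model_disk_state {
	unsigned long state;	/* flag word, like gendisk->state */
	unsigned int uevents;	/* counts simulated change notifications */
};

/* Non-atomic model of test_and_set_bit(): set the bit, return old value. */
static bool model_test_and_set(unsigned long *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;
	bool old = (*word & mask) != 0;

	*word |= mask;
	return old;
}

/* Non-atomic model of test_and_clear_bit(): clear the bit, return old value. */
static bool model_test_and_clear(unsigned long *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;
	bool old = (*word & mask) != 0;

	*word &= ~mask;
	return old;
}

/* Same control flow as the patched set_disk_ro(): return early when the
 * requested state is already in effect, notify only on a real transition. */
static void model_set_disk_ro(struct model_disk_state *disk, bool read_only)
{
	if (read_only) {
		if (model_test_and_set(&disk->state, MODEL_GD_READ_ONLY))
			return;
	} else {
		if (!model_test_and_clear(&disk->state, MODEL_GD_READ_ONLY))
			return;
	}
	disk->uevents++;	/* stands in for set_disk_ro_uevent() */
}
```

Calling `model_set_disk_ro()` twice in a row with the same value increments the counter once, so a driver that re-reports an unchanged write-protect state on every revalidate would not generate repeated notifications in this model.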
Commit 20bd1d026aac ("scsi: sd: Keep disk read-only when re-reading
partition") addressed a long-standing problem with user read-only
policy being overridden as a result of a device-initiated revalidate.
The commit has since been reverted due to a regression that left some
USB devices read-only indefinitely.

To fix the underlying problems with revalidate we need to keep track
of hardware state and user policy separately.

The gendisk has been updated to reflect the current hardware state set
by the device driver. This is done to allow returning the device to
the hardware state once the user clears the BLKROSET flag.

The resulting semantics are as follows:

 - If BLKROSET is used to set a whole-disk device read-only, any
   partitions will end up in a read-only state until the user
   explicitly clears the flag.

 - If BLKROSET sets a given partition read-only, that partition will
   remain read-only even if the underlying storage stack initiates a
   revalidate. However, the BLKRRPART ioctl will cause the partition
   table to be dropped and any user policy on partitions will be lost.

 - If BLKROSET has not been set, both the whole disk device and any
   partitions will reflect the current write-protect state of the
   underlying device.

Based on a patch from Martin K. Petersen <martin.petersen@oracle.com>.

Reported-by: Oleksii Kurochko <olkuroch@cisco.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201221
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-core.c        |  2 +-
 block/genhd.c           | 34 +++++++++++++++++++---------------
 block/partitions/core.c |  3 +--
 include/linux/genhd.h   |  6 ++++--
 4 files changed, 25 insertions(+), 20 deletions(-)