[0/7] add simple copy support

Message ID 20210817101423.12367-1-selvakuma.s1@samsung.com

SelvaKumar S Aug. 17, 2021, 10:14 a.m. UTC
This started out as an attempt to support NVMe Simple Copy Command (SCC),
and evolved during the RFC review process.

The patchset, at this point, contains:
1. SCC support in the NVMe driver
2. Block-layer infra for the copy-offload operation
3. ioctl interface to user-space
4. Copy-emulation infra in the block layer
5. Copy-offload plumbing to dm-kcopyd (thus creating a couple of in-kernel
	users such as dm-clone)


The SCC specification, i.e. TP4065a, can be found at the following link:

https://nvmexpress.org/wp-content/uploads/NVM-Express-1.4-Ratified-TPs.zip

Simple copy is a copy-offloading feature that can be used to copy multiple
contiguous ranges (source_ranges) of LBAs to a single destination LBA
within the device, reducing traffic between host and device.

We define a block ioctl for copy and a copy payload format similar to
discard. For devices supporting native simple copy, we attach the control
information as a payload to the bio and submit it to the device. Copy
emulation is used in case the underlying device does not support copy
offload, or when it is selected via sysfs. Emulation reads each source
range into a buffer and writes it to the destination.
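
As an illustration only, the emulation path described above can be modelled
in user space roughly as follows. This is a minimal sketch, not the code in
this patchset: the "device" is a plain memory buffer, and struct src_range
and emulate_copy() are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512u

/* Hypothetical source range in 512-byte sectors (not the patchset layout). */
struct src_range {
	uint64_t sector;	/* first source sector */
	uint64_t nr_sectors;	/* length of the range */
};

/*
 * Emulate a copy on an in-memory "device": bounce each source range through
 * a temporary buffer and write it at the running destination offset.
 * Ranges are checked as they are processed, so a failure may leave earlier
 * ranges already copied. Returns 0 on success, -1 on error.
 */
static int emulate_copy(uint8_t *dev, uint64_t dev_sectors,
			const struct src_range *ranges, size_t nr_ranges,
			uint64_t dest_sector)
{
	assert(dev != NULL && ranges != NULL);

	for (size_t i = 0; i < nr_ranges; i++) {
		uint64_t src = ranges[i].sector;
		uint64_t len = ranges[i].nr_sectors;

		/* Reject empty ranges and anything past the end of the device. */
		if (len == 0 || src > dev_sectors || len > dev_sectors - src)
			return -1;
		if (dest_sector > dev_sectors || len > dev_sectors - dest_sector)
			return -1;

		size_t bytes = (size_t)len * SECTOR_SIZE;
		uint8_t *buf = malloc(bytes);
		if (!buf)
			return -1;

		memcpy(buf, dev + src * SECTOR_SIZE, bytes);		/* "read" */
		memcpy(dev + dest_sector * SECTOR_SIZE, buf, bytes);	/* "write" */
		free(buf);

		/* Source ranges land back to back at the destination. */
		dest_sector += len;
	}
	return 0;
}
```

The intermediate buffer stands in for the read bio followed by the write bio;
a real implementation would also have to bound the buffer size.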

At present this implementation does not support copy offload for stacked/dm
devices; on such devices the copy operation is completed through emulation.

One in-kernel use case for copy-offload is implemented by this patchset:
the dm-kcopyd infra has been changed to leverage copy-offload when it is
natively available. Possible future use cases include F2FS GC, BTRFS
relocation/balance and copy_file_range.

The following limits are added to the queue limits and exposed via sysfs to
userspace, which the user can use to form a payload.

 - copy_offload:
	configurable, selects between emulation and copy offload
		0 to disable copy offload,
		1 to enable copy offload. Offload can only be
			enabled if the underlying device supports it

 - max_copy_sectors:
	total copy length supported by copy offload feature in device.
	0 indicates copy offload is not supported.

 - max_copy_nr_ranges:
	maximum number of source range entries supported by copy offload
			feature in device

 - max_copy_range_sectors:
	maximum copy length per source range entry
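
To make the role of these limits concrete, here is a hedged user-space
sketch of how a caller might check a set of source ranges against them
before forming a single payload. The struct and function names are
hypothetical and are not part of this patchset; only the three limit names
come from the list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the sysfs limits listed above; the struct itself is hypothetical. */
struct copy_limits {
	uint64_t max_copy_sectors;		/* 0 means offload unsupported */
	uint64_t max_copy_nr_ranges;
	uint64_t max_copy_range_sectors;
};

/* Hypothetical source range, in sectors. */
struct copy_range {
	uint64_t sector;
	uint64_t nr_sectors;
};

/* Return true if the ranges fit into one offloaded copy under the limits. */
static bool payload_fits_limits(const struct copy_limits *lim,
				const struct copy_range *ranges,
				size_t nr_ranges)
{
	uint64_t total = 0;

	assert(lim != NULL && ranges != NULL);

	if (lim->max_copy_sectors == 0)
		return false;	/* device does not support copy offload */
	if (nr_ranges == 0 || nr_ranges > lim->max_copy_nr_ranges)
		return false;

	for (size_t i = 0; i < nr_ranges; i++) {
		uint64_t len = ranges[i].nr_sectors;

		/* Zero-length and oversized ranges are rejected. */
		if (len == 0 || len > lim->max_copy_range_sectors)
			return false;
		/* Overflow-safe check of the running total. */
		if (len > lim->max_copy_sectors - total)
			return false;
		total += len;
	}
	return true;
}
```

A request that fails this check is not necessarily invalid; it only means
it does not fit into one offloaded command under the advertised limits.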

*blkdev_issue_copy* takes the source bdev, the number of sources, an array
of source ranges (in sectors), the destination bdev and the destination
offset (in sectors). If the source and destination block devices are the
same and the queue parameter copy_offload is 1, the copy is done through
native copy offloading. Copy emulation is used otherwise.
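
The selection rule in the paragraph above can be summarised by the
following sketch. It is a simplified model under stated assumptions, not
the kernel code: struct blkdev_model, its fields and choose_copy_path() are
hypothetical stand-ins for the real block device and queue structures.

```c
#include <assert.h>
#include <stddef.h>

enum copy_path {
	COPY_PATH_OFFLOAD,	/* native copy offload in the device */
	COPY_PATH_EMULATION,	/* read into a buffer, then write */
};

/* Hypothetical stand-in for a block device and its queue setting. */
struct blkdev_model {
	int id;				/* identifies the device */
	unsigned int copy_offload;	/* value of the copy_offload parameter */
};

/* Offload only when both ends are the same device and offload is enabled. */
static enum copy_path choose_copy_path(const struct blkdev_model *src,
				       const struct blkdev_model *dst)
{
	assert(src != NULL && dst != NULL);

	if (src->id == dst->id && dst->copy_offload == 1)
		return COPY_PATH_OFFLOAD;
	return COPY_PATH_EMULATION;
}
```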

Changes from RFC v5

1. Handle copy larger than maximum copy limits
2. Create copy context and submit copy offload asynchronously
3. Remove BLKDEV_COPY_NOEMULATION opt-in option of copy offload and
check for copy support before submission from dm and other layers
4. Allocate the largest buffer that can be allocated for copy emulation
rather than failing very large copy offloads.
5. Fix copy_offload sysfs to accept only 0 or 1

Changes from RFC v4

1. Extend dm-kcopyd to leverage copy-offload, while copying within the
same device. The other approach was to have copy-emulation by moving
dm-kcopyd to block layer. But it also required moving core dm-io infra,
causing a massive churn across multiple dm-targets.
2. Remove export in bio_map_kern()
3. Change copy_offload sysfs to accept 0 or any other value
4. Rename copy support flag to QUEUE_FLAG_SIMPLE_COPY
5. Rename payload entries, add source bdev field to be used while
partition remapping, remove copy_size
6. Change the blkdev_issue_copy() interface to accept destination and
source values in sectors rather than in bytes
7. Add payload to bio using bio_map_kern() for copy_offload case
8. Add check to return error if any source range has a length of 0
9. Add BLKDEV_COPY_NOEMULATION flag to allow the user to skip copy
emulation in case copy offload is not supported. The caller can then use
its existing copying logic to complete the io.
10. Bug fix copy checks and reduce size of rcu_lock()

Changes from RFC v3

1. gfp_flag fixes.
2. Export bio_map_kern() and use it to allocate and add pages to bio.
3. Move copy offload, reading to buf, writing from buf to separate functions
4. Send read bio of copy offload by chaining them and submit asynchronously
5. Add gendisk->part0 and part->bd_start_sect changes to blk_check_copy().
6. Move single source range limit check to blk_check_copy()
7. Rename __blkdev_issue_copy() to blkdev_issue_copy and remove old helper.
8. Make the blkdev_issue_copy() interface generic by accepting a
        destination bdev, to support XCOPY as well.
9. Add invalidate_kernel_vmap_range() after reading data for vmalloc'ed memory.
10. Fix buf allocation logic to allocate buffer for the total size of copy.
11. Reword patch commit description.

Changes from RFC v2

1. Add emulation support for devices not supporting copy.
2. Add *copy_offload* sysfs entry to enable and disable copy_offload
        in devices supporting simple copy.
3. Remove simple copy support for stacked devices.

Changes from RFC v1:

1. Fix memory leak in __blkdev_issue_copy
2. Unmark blk_check_copy inline
3. Fix line break in blk_check_copy_eod
4. Remove p checks and make code more readable
5. Don't use bio_set_op_attrs and remove op and set
   bi_opf directly
6. Use struct_size to calculate total_size
7. Fix partition remap of copy destination
8. Remove mcl,mssrl,msrc from nvme_ns
9. Initialize copy queue limits to 0 in nvme_config_copy
10. Remove return in QUEUE_FLAG_COPY check
11. Remove unused OCFS


Nitesh Shetty (4):
  block: Introduce queue limits for copy-offload support
  block: copy offload support infrastructure
  block: Introduce a new ioctl for simple copy
  block: add emulation for simple copy

SelvaKumar S (3):
  block: make bio_map_kern() non static
  nvme: add simple copy support
  dm kcopyd: add simple copy offload support

 block/blk-core.c          |  84 ++++++++-
 block/blk-lib.c           | 352 ++++++++++++++++++++++++++++++++++++++
 block/blk-map.c           |   2 +-
 block/blk-settings.c      |   4 +
 block/blk-sysfs.c         |  51 ++++++
 block/blk-zoned.c         |   1 +
 block/bounce.c            |   1 +
 block/ioctl.c             |  33 ++++
 drivers/md/dm-kcopyd.c    |  56 +++++-
 drivers/nvme/host/core.c  |  83 +++++++++
 drivers/nvme/host/trace.c |  19 ++
 include/linux/bio.h       |   1 +
 include/linux/blk_types.h |  20 +++
 include/linux/blkdev.h    |  21 +++
 include/linux/nvme.h      |  43 ++++-
 include/uapi/linux/fs.h   |  20 +++
 16 files changed, 775 insertions(+), 16 deletions(-)