From patchwork Mon Apr 21 02:14:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 885308 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D982518DB3D; Mon, 21 Apr 2025 02:25:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202334; cv=none; b=a5zyugJDx8FK6giEINLYbYcqIwLNyqbipLufs/KoCcjKhx6ZAjr9vMLbJYYi47hvljs6qa9mkpHDzH+rfSc0BVr8NDubuF5fV9slTMCI4+waegryyntDmeYsA22XYHd04tGT+YSOrFwQq0yQEuaUKNXIJ55OJu6KSuoM2tuYp/g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202334; c=relaxed/simple; bh=d8NKu11c0L/6hqcAgEPl8hg1PgWhSyhd6iLgSLVkYnU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QY+M6q9f2PxqBNVwJ0Tj33XrYeXuiG+qG9p0E7nL+rJkM9ijbwb+2tXsBK5zkTY7xab7FYsQwYzfULbG0ZLFUl41UA5mk6s7eZaRrwa+1rctDFdqMF8M/bZGfn5yuImWfsVBeWymmbZgX3qfVOD3UDV72tKNxRGfIEgWs3noFk8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2l12ypz4f3mHR; Mon, 21 Apr 2025 10:25:03 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 839BD1A058E; Mon, 21 Apr 2025 10:25:28 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S5; Mon, 21 Apr 2025 10:25:28 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 01/11] block: introduce BLK_FEAT_WRITE_ZEROES_UNMAP to queue limits features Date: Mon, 21 Apr 2025 10:14:59 +0800 Message-ID: <20250421021509.2366003-2-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S5 X-Coremail-Antispam: 1UD129KBjvJXoWxKF45JFW3uw4rCFWDAw1rXrb_yoWxWr1kpF y8KFyUtry0gF17u3Z7A3W7uF1a939Ykry3G39Fkw1ruw42gr1xWF1vqFyYq348J3s3Way0 qa1UKr98uayUCrJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUm014x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWU JVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67 kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY 6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0x vEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVj vjDU0xZFpf9x0pRlJPiUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi Currently, disks primarily implement the write zeroes command (aka REQ_OP_WRITE_ZEROES) through two mechanisms: the first involves physically writing zeros to the disk media (e.g., HDDs), while the second performs an unmap operation on the logical blocks, effectively putting them into a deallocated state (e.g., SSDs). The first method is generally slow, while the second method is typically very fast. For example, on certain NVMe SSDs that support NVME_NS_DEAC, submitting REQ_OP_WRITE_ZEROES requests with the NVME_WZ_DEAC bit can accelerate the write zeros operation by placing disk blocks into a deallocated state (this is a best effort, not a mandatory requirement, some devices may partially fall back to writing physical zeroes due to factors such as receiving unaligned commands). However, it is difficult to determine whether the storage device supports unmap write zeroes. We cannot determine this by querying bdev_limits(bdev)->max_write_zeroes_sectors. Therefore, add a new queue limit feature, BLK_FEAT_WRITE_ZEROES_UNMAP and the corresponding sysfs entry, to indicate whether the block device explicitly supports the unmapped write zeroes command. Each device driver should set this bit if it is certain that the attached disk supports this command. If the bit is not set, the disk either does not support it, or its support status is unknown. For the stacked devices cases, the BLK_FEAT_WRITE_ZEROES_UNMAP should be supported both by the stacking driver and all underlying devices. Signed-off-by: Zhang Yi Reviewed-by: Christoph Hellwig --- Documentation/ABI/stable/sysfs-block | 18 ++++++++++++++++++ block/blk-settings.c | 6 ++++++ block/blk-sysfs.c | 3 +++ include/linux/blkdev.h | 8 ++++++++ 4 files changed, 35 insertions(+) diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block index 3879963f0f01..6531cdfcaacf 100644 --- a/Documentation/ABI/stable/sysfs-block +++ b/Documentation/ABI/stable/sysfs-block @@ -763,6 +763,24 @@ Description: 0, write zeroes is not supported by the device. +What: /sys/block//queue/write_zeroes_unmap +Date: January 2025 +Contact: Zhang Yi +Description: + [RO] Devices that explicitly support the unmap write zeroes + operation in which a single write zeroes request with the unmap + bit set to zero out the range of contiguous blocks on storage + by freeing blocks, rather than writing physical zeroes to the + media. If the write_zeroes_unmap is set to 1, this indicates + that the device explicitly supports the write zero command. + However, this may be a best-effort optimization rather than a + mandatory requirement, some devices may partially fall back to + writing physical zeroes due to factors such as receiving + unaligned commands. If the parameter is set to 0, the device + either does not support this operation, or its support status is + unknown. + + What: /sys/block//queue/zone_append_max_bytes Date: May 2020 Contact: linux-block@vger.kernel.org diff --git a/block/blk-settings.c b/block/blk-settings.c index 6b2dbe645d23..3331d07bd5d9 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -697,6 +697,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, t->features &= ~BLK_FEAT_NOWAIT; if (!(b->features & BLK_FEAT_POLL)) t->features &= ~BLK_FEAT_POLL; + if (!(b->features & BLK_FEAT_WRITE_ZEROES_UNMAP)) + t->features &= ~BLK_FEAT_WRITE_ZEROES_UNMAP; t->flags |= (b->flags & BLK_FLAG_MISALIGNED); @@ -819,6 +821,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, t->zone_write_granularity = 0; t->max_zone_append_sectors = 0; } + + if (!t->max_write_zeroes_sectors) + t->features &= ~BLK_FEAT_WRITE_ZEROES_UNMAP; + blk_stack_atomic_writes_limits(t, b, start); return ret; diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index a2882751f0d2..7a9c20bd3779 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -261,6 +261,7 @@ static ssize_t queue_##_name##_show(struct gendisk *disk, char *page) \ QUEUE_SYSFS_FEATURE_SHOW(fua, BLK_FEAT_FUA); QUEUE_SYSFS_FEATURE_SHOW(dax, BLK_FEAT_DAX); +QUEUE_SYSFS_FEATURE_SHOW(write_zeroes_unmap, BLK_FEAT_WRITE_ZEROES_UNMAP); static ssize_t queue_poll_show(struct gendisk *disk, char *page) { @@ -510,6 +511,7 @@ QUEUE_LIM_RO_ENTRY(queue_atomic_write_unit_min, "atomic_write_unit_min_bytes"); QUEUE_RO_ENTRY(queue_write_same_max, "write_same_max_bytes"); QUEUE_LIM_RO_ENTRY(queue_max_write_zeroes_sectors, "write_zeroes_max_bytes"); +QUEUE_LIM_RO_ENTRY(queue_write_zeroes_unmap, "write_zeroes_unmap"); QUEUE_LIM_RO_ENTRY(queue_max_zone_append_sectors, "zone_append_max_bytes"); QUEUE_LIM_RO_ENTRY(queue_zone_write_granularity, "zone_write_granularity"); @@ -656,6 +658,7 @@ static struct attribute *queue_attrs[] = { &queue_atomic_write_unit_min_entry.attr, &queue_atomic_write_unit_max_entry.attr, &queue_max_write_zeroes_sectors_entry.attr, + &queue_write_zeroes_unmap_entry.attr, &queue_max_zone_append_sectors_entry.attr, &queue_zone_write_granularity_entry.attr, &queue_rotational_entry.attr, diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index e39c45bc0a97..7c8752578e36 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -342,6 +342,9 @@ typedef unsigned int __bitwise blk_features_t; #define BLK_FEAT_ATOMIC_WRITES \ ((__force blk_features_t)(1u << 16)) +/* supports unmap write zeroes command */ +#define BLK_FEAT_WRITE_ZEROES_UNMAP ((__force blk_features_t)(1u << 17)) + /* * Flags automatically inherited when stacking limits. */ @@ -1341,6 +1344,11 @@ static inline unsigned int bdev_write_zeroes_sectors(struct block_device *bdev) return bdev_limits(bdev)->max_write_zeroes_sectors; } +static inline bool bdev_write_zeroes_unmap(struct block_device *bdev) +{ + return bdev_limits(bdev)->features & BLK_FEAT_WRITE_ZEROES_UNMAP; +} + static inline bool bdev_nonrot(struct block_device *bdev) { return blk_queue_nonrot(bdev_get_queue(bdev)); From patchwork Mon Apr 21 02:15:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 882969 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0C6A31EB3E; Mon, 21 Apr 2025 02:44:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745203456; cv=none; b=uLEtcou5IVfdf6IVkIPbSYdj+eGrwG3nQceEx0v0JnUwhcUmNsSCA5bfV+lyjqLniFSOFgpD5GsEVmZ8Hq8U0kJihddPVgT3YHZ/cl/64NQ/Jvhk0L3PFCRXxoIBSxwI6anb5VDFLeA7uVPTR7Sh0YeuFOq+Z+7hVYPxsKyzu5A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745203456; c=relaxed/simple; bh=pB7viRc1r2JGTE6pXLj5K8q7cEpzje4VepltOd+V8BI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tBLDN4z/1HxE9K4F/5v3Oi3qW5fqQ4Mg17ToAS15oiyYuK1ZrJEDo81MnU7Px7DBMuKrPD6A3Ejdaea9iFrpmM9NmaIYxKpGdGk+cvMPoamXcdcyOzYM63RqN+5EeOmrcqbqS93QRGLEvxx6AnEfM5fylcF3Q/n9SMXJfhDhqCs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2n35y2z4f3jY1; Mon, 21 Apr 2025 10:25:05 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 47A121A018D; Mon, 21 Apr 2025 10:25:29 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S6; Mon, 21 Apr 2025 10:25:28 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 02/11] nvme: set BLK_FEAT_WRITE_ZEROES_UNMAP if device supports DEAC bit Date: Mon, 21 Apr 2025 10:15:00 +0800 Message-ID: <20250421021509.2366003-3-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S6 X-Coremail-Antispam: 1UD129KBjvJXoW7ury8Ar4xJF4kuF15CrWUCFg_yoW8Cw4kpF W3Xa42k348WF47C3sxXw47AFy3Gw1kGa4jgFn7K345Wr13A34Fgr1rKa43XFWkX3s3uayY yFZ8G3s7Ca15XwUanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmY14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWU JVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67 kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY 6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42 IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIev Ja73UjIFyTuYvjTRNiSHDUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi When the device supports the Write Zeroes command and the DEAC bit, it indicates that the deallocate bit in the Write Zeroes command is supported, and the bytes read from a deallocated logical block are zeroes. This means the device supports unmap Write Zeroes, so set the BLK_FEAT_WRITE_ZEROES_UNMAP feature to the device's queue limit. Signed-off-by: Zhang Yi Reviewed-by: Christoph Hellwig --- drivers/nvme/host/core.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index b502ac07483b..b2cece376f30 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2223,22 +2223,25 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns, if (!nvme_init_integrity(ns->head, &lim, info)) capacity = 0; - ret = queue_limits_commit_update(ns->disk->queue, &lim); - if (ret) { - blk_mq_unfreeze_queue(ns->disk->queue, memflags); - goto out; - } - - set_capacity_and_notify(ns->disk, capacity); - /* * Only set the DEAC bit if the device guarantees that reads from * deallocated data return zeroes. While the DEAC bit does not * require that, it must be a no-op if reads from deallocated data * do not return zeroes. */ - if ((id->dlfeat & 0x7) == 0x1 && (id->dlfeat & (1 << 3))) + if ((id->dlfeat & 0x7) == 0x1 && (id->dlfeat & (1 << 3))) { ns->head->features |= NVME_NS_DEAC; + if (lim.max_write_zeroes_sectors) + lim.features |= BLK_FEAT_WRITE_ZEROES_UNMAP; + } + + ret = queue_limits_commit_update(ns->disk->queue, &lim); + if (ret) { + blk_mq_unfreeze_queue(ns->disk->queue, memflags); + goto out; + } + + set_capacity_and_notify(ns->disk, capacity); set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info)); set_bit(NVME_NS_READY, &ns->flags); blk_mq_unfreeze_queue(ns->disk->queue, memflags); From patchwork Mon Apr 21 02:15:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 885304 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 87319259C86; Mon, 21 Apr 2025 02:25:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202341; cv=none; b=Pr67VmpUG66GIqxy1CgYMfq/EVL4d7LonEdwPTZ8JkvwrZ+VNPdDqvIengS30UpIGC+q9BMVl9Qo3HYI++AHgQlIcwXwcwCJCaiKHXR79LuEFXcWlVn+QXlExXMw76fmUvrpO/D+QKydhfn+JpuPjlnqM4M7iISzvDmYkX/TdHw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202341; c=relaxed/simple; bh=fy/slsfDl3KPmahbZ6hLrrX4W0yEveTB/lL82mqAX0A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AHGaKSgZ7pl3M3jKmiT12VV5sjmIVQF1yGIZ5OMQjsdTLn5u6pGhQHPqlMQyF+46QOUU5RhQIJK850hbycjAP9ggCJ2c708KkUUZ174qF4S/j8XkWAN2E5o+fNsYnNbqUiBhXqUvuHFMc2PBCTyurqBDP+oOIfTdKNC4aIUYK/c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2p1ZCqz4f3jY6; Mon, 21 Apr 2025 10:25:06 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 122321A1B61; Mon, 21 Apr 2025 10:25:30 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S7; Mon, 21 Apr 2025 10:25:29 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 03/11] nvme-multipath: add BLK_FEAT_WRITE_ZEROES_UNMAP support Date: Mon, 21 Apr 2025 10:15:01 +0800 Message-ID: <20250421021509.2366003-4-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S7 X-Coremail-Antispam: 1UD129KBjvdXoWrKFy5Kr4Uuw4Uur4ktrW3GFg_yoWDArcEv3 W5Xr47ZFs3CF1vvrn5ur13CryvkrnxZr1xu3WSgry3Jry8Xr13Jas2vrnIyw1Yk3WY9rs2 9w1kXF15GrWUtjkaLaAFLSUrUUUUjb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUbkkFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k26cxKx2IYs7xG 6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUWwA2048vs2IY02 0Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28EF7xv wVC0I7IYx2IY67AKxVWDJVCq3wA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxkF7I0En4kS14v26r4a6rW5MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j 6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7 AF67AKxVW8ZVWrXwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE 2Ix0cI8IcVCY1x0267AKxVWxJVW8Jr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0x vEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVj vjDU0xZFpf9x0pRl_MsUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi Set the BLK_FEAT_WRITE_ZEROES_UNMAP feature while creating multipath stacking queue limits by default. This feature shall be disabled if any attached namespace does not support it. Signed-off-by: Zhang Yi Reviewed-by: Christoph Hellwig --- drivers/nvme/host/multipath.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 05eccd96d34a..1880ad09559a 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -638,7 +638,8 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head) blk_set_stacking_limits(&lim); lim.dma_alignment = 3; - lim.features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL; + lim.features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL | + BLK_FEAT_WRITE_ZEROES_UNMAP; if (head->ids.csi == NVME_CSI_ZNS) lim.features |= BLK_FEAT_ZONED; From patchwork Mon Apr 21 02:15:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 882974 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C7A6E2561D7; Mon, 21 Apr 2025 02:25:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202335; cv=none; b=gRD3me+IE/U+PC5x4h/v+A5HPHALObz0Smm2roGYpge01397mPIXSEbEYxWcalpTaOmqR00Y5lkDo3mgAJYZ2w0INi9AdfF7WN8o01QaIj0opSWpJo8izRkrWN8rnTK7dzoePJz1UXX88SrB9ZHxoKGcxOVzO46W3oz9KLXgMxE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202335; c=relaxed/simple; bh=hfKFNPbETFGs/Xuenk3gewjpcnxeJb9ZL7s2EGkEUak=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tcaknngMMbaJ/gDEyPkJT6rD3YS+45wNwLYvCWF3xDK2x7DulO4pylMwbE3zgThPl0aPVrBGlGskEvjBTNI8HxF1mImVBSRefxv24yd9YPjMLndrsct3onbY7nqk/5mI2lJFOZF7pDzofgU5k0AANf0xnc7NVSteGj9SOzXJL3c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2w2TpHz4f3jt8; Mon, 21 Apr 2025 10:25:12 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id D3C4D1A1B4D; Mon, 21 Apr 2025 10:25:30 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S8; Mon, 21 Apr 2025 10:25:30 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 04/11] nvmet: set WZDS and DRB if device supports BLK_FEAT_WRITE_ZEROES_UNMAP Date: Mon, 21 Apr 2025 10:15:02 +0800 Message-ID: <20250421021509.2366003-5-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S8 X-Coremail-Antispam: 1UD129KBjvdXoWrZFy7WFyUCryxtFW7GF15urg_yoWDCwcEv3 W8ur47Wa1kur1kKFWa9F15ZryUKw48ZF1IqFs2qa4S9r9rWrs5uw1vvF9Iya4UAF4rXrsx ur17XFsagFWxKjkaLaAFLSUrUUUUjb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUbvAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k26cxKx2IYs7xG 6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAVCq3wA2048vs2 IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28E F7xvwVC0I7IYx2IY67AKxVWDJVCq3wA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr 1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0D M2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjx v20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1l F7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2 IY04v7MxkF7I0En4kS14v26r4a6rW5MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY 6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17 CEb7AF67AKxVW8ZVWrXwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF 0xvE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsG vfC2KfnxnUUI43ZEXa7sRiHUDtUUUUU== X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi Set WZDS and DRB bit to the namespace dlfeat if the underlying block device supports BLK_FEAT_WRITE_ZEROES_UNMAP, make the nvme target device supports unmaped write zeroes command. Signed-off-by: Zhang Yi Reviewed-by: Christoph Hellwig --- drivers/nvme/target/io-cmd-bdev.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c index 83be0657e6df..052da6174548 100644 --- a/drivers/nvme/target/io-cmd-bdev.c +++ b/drivers/nvme/target/io-cmd-bdev.c @@ -46,6 +46,10 @@ void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id) id->npda = id->npdg; /* NOWS = Namespace Optimal Write Size */ id->nows = to0based(bdev_io_opt(bdev) / bdev_logical_block_size(bdev)); + + /* Set WZDS and DRB if device supports unmapped write zeroes */ + if (bdev_write_zeroes_unmap(bdev)) + id->dlfeat = (1 << 3) | 0x1; } void nvmet_bdev_ns_disable(struct nvmet_ns *ns) From patchwork Mon Apr 21 02:15:03 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 885307 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A1C0B256C7D; Mon, 21 Apr 2025 02:25:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202336; cv=none; b=KK3HrWfhoDGQlb0zClfJPb4LId4M9syVDBGjNndWMnX+jYPlCtjEOd+HzI5xgHwG9fK4QymDcCGhOHEqTcXhP7Hg9nFfrR75npM37rwoppBUxhduVlS/lf9UHYkH01ylwfBMBTqkhQakLDzaZesvOQChSLoXRm7cxSYGI69T/uk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202336; c=relaxed/simple; bh=X4pMOYTSWhX0IJM2gqJZG979HxvaHLl2kbgCXf1BEYM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=d6GUQwqVbB8deMiiDb65jtuxHHg59y0wvo0034B8CcwKBCmnMk7ffP0bByANohvz3TtaN/2O13simmSSV9WmGVmY1jQY6X2ad7u6H9hmOtYZfSlzehDyFD3IURLIRGm3aAw/fAdpxDurhNgXnmAeL2FDCzqDvAE3DjBvGG75v/Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2x0sr7z4f3jv6; Mon, 21 Apr 2025 10:25:13 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 9E4141A06DC; Mon, 21 Apr 2025 10:25:31 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S9; Mon, 21 Apr 2025 10:25:31 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 05/11] scsi: sd: set BLK_FEAT_WRITE_ZEROES_UNMAP if device supports unmap zeroing mode Date: Mon, 21 Apr 2025 10:15:03 +0800 Message-ID: <20250421021509.2366003-6-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S9 X-Coremail-Antispam: 1UD129KBjvdXoWrZFW3tF18AF48ur4kZFW5Wrg_yoWDZwc_Ca 1fur9rJr4vyFyIkrWfXF43Cryvvan7uFyjgrnFqrySyrZ7Zrs5J3Wvvr1av3WUJayUK343 Aa9rXr1Syr1DJjkaLaAFLSUrUUUUjb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUbvAFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k26cxKx2IYs7xG 6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUAVCq3wA2048vs2 IY020Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28E F7xvwVC0I7IYx2IY67AKxVWDJVCq3wA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr 1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0D M2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjx v20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1l F7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2 IY04v7MxkF7I0En4kS14v26r4a6rW5MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY 6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17 CEb7AF67AKxVW8ZVWrXwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1I6r4UMIIF 0xvE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsG vfC2KfnxnUUI43ZEXa7sRiHUDtUUUUU== X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi When the device supports the Write Zeroes command and the zeroing mode is set to SD_ZERO_WS16_UNMAP or SD_ZERO_WS10_UNMAP, this means that the device supports unmap Write Zeroes, so set the corresponding BLK_FEAT_WRITE_ZEROES_UNMAP feature to the device's queue limit. Signed-off-by: Zhang Yi Reviewed-by: Christoph Hellwig --- drivers/scsi/sd.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 950d8c9fb884..652630b410de 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1118,6 +1118,11 @@ static void sd_config_write_same(struct scsi_disk *sdkp, else sdkp->zeroing_mode = SD_ZERO_WRITE; + if (sdkp->max_ws_blocks && + (sdkp->zeroing_mode == SD_ZERO_WS16_UNMAP || + sdkp->zeroing_mode == SD_ZERO_WS10_UNMAP)) + lim->features |= BLK_FEAT_WRITE_ZEROES_UNMAP; + if (sdkp->max_ws_blocks && sdkp->physical_block_size > logical_block_size) { /* From patchwork Mon Apr 21 02:15:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 882970 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7DD0525A659; Mon, 21 Apr 2025 02:25:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202342; cv=none; b=MVucAtCEfc2KDKQtCO9i2xKVhWdOCr1ebLuC9f7X1mRBI5LRE4UBuZVv4bh5VXnDNsEC3rCKYspqCjMXyMibg8codMDVtagWed883GOifM/Qy/2DdPhxWCTHge4wlcAz3yadHxIaMHMJqMcRfj8qNKgRpDo9imh40LqfLyej/zk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202342; c=relaxed/simple; bh=1eah6+l3LHoQXYxuCvzywyHjC7gWndEgK9lEiAcaiQg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=O9pTfPiqn3WhtgZFP+tgCd4FQm3fUgmqP7MBfkM5BPm/j0CbtLE8WcJjCWke7zqI+jM05pk0IotQdAMZtnQFhdEELKCJHBpMu46XSdL5HBBl/n9GucXYmgkbQo3cxHx5UflHWsGkqbACDaXMfyaG0WHuyJYW0QDwwwhUQUX5Mlo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2r46bqz4f3jXd; Mon, 21 Apr 2025 10:25:08 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 6A7F41A018D; Mon, 21 Apr 2025 10:25:32 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S10; Mon, 21 Apr 2025 10:25:32 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 06/11] dm: add BLK_FEAT_WRITE_ZEROES_UNMAP support Date: Mon, 21 Apr 2025 10:15:04 +0800 Message-ID: <20250421021509.2366003-7-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S10 X-Coremail-Antispam: 1UD129KBjvJXoW7Ww4rJr4DAw1fAryDJw4rAFb_yoW8CFyUp3 ZrWa4ayry5tF47C3Z5uFyI9FyYqa1YkFy7KrZrCws5u3W5GryUWF47ta47X3yDJFy7Xay3 Ka4jkr9ruF4rCwUanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0pRQJ5wUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi Set the BLK_FEAT_WRITE_ZEROES_UNMAP feature on stacking queue limits by default. This feature shall be disabled if any underlying device does not support it. Signed-off-by: Zhang Yi Reviewed-by: Benjamin Marzinski --- drivers/md/dm-table.c | 7 +++++-- drivers/md/dm.c | 1 + 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 35100a435c88..56141d585ef8 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -598,7 +598,8 @@ int dm_split_args(int *argc, char ***argvp, char *input) static void dm_set_stacking_limits(struct queue_limits *limits) { blk_set_stacking_limits(limits); - limits->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL; + limits->features |= BLK_FEAT_IO_STAT | BLK_FEAT_NOWAIT | BLK_FEAT_POLL | + BLK_FEAT_WRITE_ZEROES_UNMAP; } /* @@ -1852,8 +1853,10 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, limits->discard_alignment = 0; } - if (!dm_table_supports_write_zeroes(t)) + if (!dm_table_supports_write_zeroes(t)) { limits->max_write_zeroes_sectors = 0; + limits->features &= ~BLK_FEAT_WRITE_ZEROES_UNMAP; + } if (!dm_table_supports_secure_erase(t)) limits->max_secure_erase_sectors = 0; diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 5ab7574c0c76..b59c3dbeaaf1 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1096,6 +1096,7 @@ void disable_write_zeroes(struct mapped_device *md) /* device doesn't really support WRITE ZEROES, disable it */ limits->max_write_zeroes_sectors = 0; + limits->features &= ~BLK_FEAT_WRITE_ZEROES_UNMAP; } static bool swap_bios_limit(struct dm_target *ti, struct bio *bio) From patchwork Mon Apr 21 02:15:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 882973 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D051C2571B0; Mon, 21 Apr 2025 02:25:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202337; cv=none; b=l6Fer50o/YFo4YKfsn+8DFsOlUdXo4Qp0uYh+4BFhxx/JG43Kq/oRrFM9C31VxoY9Mwppa79x0LPS/fzsPiiYukdzvLKWDclCcQS5Ws21zWFsPapirskGEzU/Wf5Gi3PLXBNf2fVze+Jx9hrUIcdB8C0ofV2U7Gkx7inrtNinzg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202337; c=relaxed/simple; bh=RJ5rsc6/wjBPB905LyP4tfh4gFfFqZBmsP4Wy4Fovc8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CRkdnQiwhvmf54i3n+ge2XnCWGpLqc3D61vW5wYbEZULGkDVo4g3MB+4RyD2SpQ+Z1ASUisHEe8h201oBJoNJ2ICjaXoGMgeBDO8wU/ixhDze7KIYbjfr0Am4frGrydHcGrMnVH8Hkk6U4JeBRDIIMugeAFHNU8pphPvNTUIkwk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2q5qndz4f3lgS; Mon, 21 Apr 2025 10:25:07 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 332EA1A018D; Mon, 21 Apr 2025 10:25:33 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S11; Mon, 21 Apr 2025 10:25:32 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 07/11] fs: statx add write zeroes unmap attribute Date: Mon, 21 Apr 2025 10:15:05 +0800 Message-ID: <20250421021509.2366003-8-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S11 X-Coremail-Antispam: 1UD129KBjvJXoWxJw45Zr4fWry3uw4DAw45GFg_yoW5Aw45pF WkGFy8GF4FgFWUuw4vk3W7Zw15t3WYkr4UA3yUKw10kFWDtr1IgF93CFyjyrnYqrZ7C3y3 XF9rK347uay2k3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0pRQJ5wUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi Add a new attribute flag to statx to determine whether a bdev or a file supports the unmap write zeroes command. Signed-off-by: Zhang Yi --- block/bdev.c | 4 ++++ fs/ext4/inode.c | 9 ++++++--- include/uapi/linux/stat.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/block/bdev.c b/block/bdev.c index 4844d1e27b6f..29b0e5feb138 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -1304,6 +1304,10 @@ void bdev_statx(struct path *path, struct kstat *stat, queue_atomic_write_unit_max_bytes(bd_queue)); } + if (bdev_write_zeroes_unmap(bdev)) + stat->attributes |= STATX_ATTR_WRITE_ZEROES_UNMAP; + stat->attributes_mask |= STATX_ATTR_WRITE_ZEROES_UNMAP; + stat->blksize = bdev_io_min(bdev); blkdev_put_no_open(bdev); diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 94c7d2d828a6..38caf2f39c6d 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5653,6 +5653,7 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path, struct inode *inode = d_inode(path->dentry); struct ext4_inode *raw_inode; struct ext4_inode_info *ei = EXT4_I(inode); + struct block_device *bdev = inode->i_sb->s_bdev; unsigned int flags; if ((request_mask & STATX_BTIME) && @@ -5672,8 +5673,6 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path, stat->result_mask |= STATX_DIOALIGN; if (dio_align == 1) { - struct block_device *bdev = inode->i_sb->s_bdev; - /* iomap defaults */ stat->dio_mem_align = bdev_dma_alignment(bdev) + 1; stat->dio_offset_align = bdev_logical_block_size(bdev); @@ -5695,6 +5694,9 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path, generic_fill_statx_atomic_writes(stat, awu_min, awu_max); } + if (S_ISREG(inode->i_mode) && bdev_write_zeroes_unmap(bdev)) + stat->attributes |= STATX_ATTR_WRITE_ZEROES_UNMAP; + flags = ei->i_flags & EXT4_FL_USER_VISIBLE; if (flags & EXT4_APPEND_FL) stat->attributes |= STATX_ATTR_APPEND; @@ -5714,7 +5716,8 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path, STATX_ATTR_ENCRYPTED | STATX_ATTR_IMMUTABLE | STATX_ATTR_NODUMP | - STATX_ATTR_VERITY); + STATX_ATTR_VERITY | + STATX_ATTR_WRITE_ZEROES_UNMAP); generic_fillattr(idmap, request_mask, inode, stat); return 0; diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h index f78ee3670dd5..279ce7e34df7 100644 --- a/include/uapi/linux/stat.h +++ b/include/uapi/linux/stat.h @@ -251,6 +251,7 @@ struct statx { #define STATX_ATTR_VERITY 0x00100000 /* [I] Verity protected file */ #define STATX_ATTR_DAX 0x00200000 /* File is currently in DAX state */ #define STATX_ATTR_WRITE_ATOMIC 0x00400000 /* File supports atomic write operations */ +#define STATX_ATTR_WRITE_ZEROES_UNMAP 0x00800000 /* File supports unmap write zeroes */ #endif /* _UAPI_LINUX_STAT_H */ From patchwork Mon Apr 21 02:15:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 885306 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E85CB2580C0; Mon, 21 Apr 2025 02:25:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202339; cv=none; b=d+6yhXU7gxg7b38L9yAJtBu8Di5w5S1rrJgnz4GNcTpOuF2bjrGeZRani9Xn7uXJflz69hqktkv4E0S6qz4XVBvNp8A6mP+jmQjl066h18WJXxdmqc+x9JW/UTFX8IsUcsNBbbhXisXsQQsXCEfR6mC5N/6UtOHq+HjHflKU1EE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202339; c=relaxed/simple; bh=EEjshBMhyitJOEtKMoK/y0e86zqye9ckHaVTqcZYQdY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kGrotiz8lZzH7rjdADk/NiLxGPuYGhHDCocxRwk8QUUr9UJeDf8/6TjP0Cx+o27d0p58mmeSWqMMiHTQDhwPTjJhRHekX9qZQxGmK0ynjXFE4cw4yr20NcAP8OMwfhDGWZ0UffpWxG+/54JBZjyT3lJYY/4r52wKplh1QstSiJ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2r4HP1z4f3mHS; Mon, 21 Apr 2025 10:25:08 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id F1CDD1A058E; Mon, 21 Apr 2025 10:25:33 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S12; Mon, 21 Apr 2025 10:25:33 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 08/11] fs: introduce FALLOC_FL_WRITE_ZEROES to fallocate Date: Mon, 21 Apr 2025 10:15:06 +0800 Message-ID: <20250421021509.2366003-9-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S12 X-Coremail-Antispam: 1UD129KBjvJXoWxAryrGFW7Jr18Aw1kArW3GFg_yoWrCr1fpF W3GF1rGrW0ga4rG3s3Can7ur98Zw4kWr43ZrW2gr4UZr45Xr1IgFsFgFyYva4xJrZrAF4Y qrnIgr98ua47A3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0pRQJ5wUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi With the development of flash-based storage devices, we can quickly write zeros to SSDs using the WRITE_ZERO command if the devices do not actually write physical zeroes to the media. Therefore, we can use this command to quickly preallocate a real all-zero file with written extents. This approach should be beneficial for subsequent pure overwriting within this file, as it can save on block allocation and, consequently, significant metadata changes, which should greatly improve overwrite performance on certain filesystems. Therefore, introduce a new operation FALLOC_FL_WRITE_ZEROES to fallocate. This flag is used to convert a specified range of a file to zeros by issuing a zeroing operation. Blocks should be allocated for the regions that span holes in the file, and the entire range is converted to written extents. If the underlying device supports the actual offload write zeroes command, the process of zeroing out operation can be accelerated. If it does not, we currently don't prevent the file system from writing actual zeros to the device. This provides users with a new method to quickly generate a zeroed file, users no longer need to write zero data to create a file with written extents. Users can determine whether a file or a bdev supports the unmap write zeroes command by using the statx(2) and checking if the STATX_ATTR_WRITE_ZEROES_UNMAP flag is set. Users can also check whether a disk supports the unmap write zeroes command through querying this sysfs interface: /sys/block//queue/write_zeroes_unmap Finally, this flag cannot be specified in conjunction with the FALLOC_FL_KEEP_SIZE since allocating written extents beyond file EOF is not permitted. In addition, filesystems that always require out-of-place writes should not support this flag since they still need to allocated new blocks during subsequent overwrites. Signed-off-by: Zhang Yi Reviewed-by: Christoph Hellwig --- fs/open.c | 1 + include/linux/falloc.h | 3 ++- include/uapi/linux/falloc.h | 18 ++++++++++++++++++ 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/fs/open.c b/fs/open.c index a9063cca9911..08b5daaf4df5 100644 --- a/fs/open.c +++ b/fs/open.c @@ -278,6 +278,7 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len) break; case FALLOC_FL_COLLAPSE_RANGE: case FALLOC_FL_INSERT_RANGE: + case FALLOC_FL_WRITE_ZEROES: if (mode & FALLOC_FL_KEEP_SIZE) return -EOPNOTSUPP; break; diff --git a/include/linux/falloc.h b/include/linux/falloc.h index 3f49f3df6af5..7c38c6b76b60 100644 --- a/include/linux/falloc.h +++ b/include/linux/falloc.h @@ -36,7 +36,8 @@ struct space_resv { FALLOC_FL_COLLAPSE_RANGE | \ FALLOC_FL_ZERO_RANGE | \ FALLOC_FL_INSERT_RANGE | \ - FALLOC_FL_UNSHARE_RANGE) + FALLOC_FL_UNSHARE_RANGE | \ + FALLOC_FL_WRITE_ZEROES) /* on ia32 l_start is on a 32-bit boundary */ #if defined(CONFIG_X86_64) diff --git a/include/uapi/linux/falloc.h b/include/uapi/linux/falloc.h index 5810371ed72b..265aae7ff8c1 100644 --- a/include/uapi/linux/falloc.h +++ b/include/uapi/linux/falloc.h @@ -78,4 +78,22 @@ */ #define FALLOC_FL_UNSHARE_RANGE 0x40 +/* + * FALLOC_FL_WRITE_ZEROES is used to convert a specified range of a file to + * zeros by issuing a zeroing operation. Blocks should be allocated for the + * regions that span holes in the file, and the entire range is converted to + * written extents. This flag is beneficial for subsequent pure overwriting + * within this range, as it can save on block allocation and, consequently, + * significant metadata changes. Therefore, filesystems that always require + * out-of-place writes should not support this flag. + * + * Different filesystems may implement different limitations on the + * granularity of the zeroing operation. Most will preferably be accelerated + * by submitting write zeroes command if the backing storage supports, which + * may not physically write zeros to the media. + * + * This flag cannot be specified in conjunction with the FALLOC_FL_KEEP_SIZE. + */ +#define FALLOC_FL_WRITE_ZEROES 0x80 + #endif /* _UAPI_FALLOC_H_ */ From patchwork Mon Apr 21 02:15:07 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 885305 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C06B12586F6; Mon, 21 Apr 2025 02:25:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202339; cv=none; b=Akdwuxo5OhixgxVT4HxxYRVh6RhToFEHf8OinyYN90y+NbI9gWpB58FRl7ehouoT1uY2jwzssxRt6L/DEM3Zov4I3UgoQ1iz/5+wJUsZPZEDH1pyGRjZKE1/YPkGxgd6MEMgZuCUjyuRV+3xc9a8eRdXy41EFFni7GPFfM3imm8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202339; c=relaxed/simple; bh=MHsp3GImk+QB3EU2nX95rhG8V3LyTRlpS1cXQ4twGsk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=R+0tqowy9993tvZ2UkkWU0fDdmBnLAFXp9D8uJze9qPOWRzx2wk8HLMCuGHX1zvg68CCHMiKJ6B4mZYSUf9mmIClL+Qa1+egx0HPlQR3gebof/ggv6iy0fQav8/SNeK5JUmOI//vwAveml1KoGzk//dQ29FH/QgvEYFySsM/rhA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zgq301vqVz4f3jZd; Mon, 21 Apr 2025 10:25:16 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id BFD671A058E; Mon, 21 Apr 2025 10:25:34 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S13; Mon, 21 Apr 2025 10:25:34 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 09/11] block: factor out common part in blkdev_fallocate() Date: Mon, 21 Apr 2025 10:15:07 +0800 Message-ID: <20250421021509.2366003-10-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S13 X-Coremail-Antispam: 1UD129KBjvJXoW7tF4fAr45Cw48CFWUWFyDJrb_yoW8tF15pr W3W3Z8GFs5W34qqF1fWF4xu345ta1Dtr4ruFW2qwn3u3yIyr97Kr4qkr1rWrWxKFy8Aw45 GFWa9ry29r17C3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0pRQJ5wUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi Only the flags passed to blkdev_issue_zeroout() differ among the two zeroing branches in blkdev_fallocate(). Therefore, do cleanup by factoring them out. Signed-off-by: Zhang Yi Reviewed-by: Christoph Hellwig --- block/fops.c | 32 ++++++++++++++------------------ 1 file changed, 14 insertions(+), 18 deletions(-) diff --git a/block/fops.c b/block/fops.c index be9f1dbea9ce..77a5465309e7 100644 --- a/block/fops.c +++ b/block/fops.c @@ -812,6 +812,7 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, struct block_device *bdev = I_BDEV(inode); loff_t end = start + len - 1; loff_t isize; + unsigned int flags; int error; /* Fail if we don't recognize the flags. */ @@ -838,34 +839,29 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, filemap_invalidate_lock(inode->i_mapping); - /* - * Invalidate the page cache, including dirty pages, for valid - * de-allocate mode calls to fallocate(). - */ switch (mode) { case FALLOC_FL_ZERO_RANGE: case FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE: - error = truncate_bdev_range(bdev, file_to_blk_mode(file), start, end); - if (error) - goto fail; - - error = blkdev_issue_zeroout(bdev, start >> SECTOR_SHIFT, - len >> SECTOR_SHIFT, GFP_KERNEL, - BLKDEV_ZERO_NOUNMAP); + flags = BLKDEV_ZERO_NOUNMAP; break; case FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE: - error = truncate_bdev_range(bdev, file_to_blk_mode(file), start, end); - if (error) - goto fail; - - error = blkdev_issue_zeroout(bdev, start >> SECTOR_SHIFT, - len >> SECTOR_SHIFT, GFP_KERNEL, - BLKDEV_ZERO_NOFALLBACK); + flags = BLKDEV_ZERO_NOFALLBACK; break; default: error = -EOPNOTSUPP; + goto fail; } + /* + * Invalidate the page cache, including dirty pages, for valid + * de-allocate mode calls to fallocate(). + */ + error = truncate_bdev_range(bdev, file_to_blk_mode(file), start, end); + if (error) + goto fail; + + error = blkdev_issue_zeroout(bdev, start >> SECTOR_SHIFT, + len >> SECTOR_SHIFT, GFP_KERNEL, flags); fail: filemap_invalidate_unlock(inode->i_mapping); return error; From patchwork Mon Apr 21 02:15:08 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 882971 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7F7252594B7; Mon, 21 Apr 2025 02:25:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202340; cv=none; b=KUNOOlLE2YX1BY4tHvRdJJ//H628CSJiYSnPTvTC/OFMN0pukGq5us21xwqoSiKKK/wfXGNueqDSRTkH/qsmL5YVVZWcfoWzSqv0p3wPk916KAhrZPwFiAAlZXIZBB1CfuUa6O7PZ6kmYVn8BKXVcQnkdb9tUtjyKN+3QGj51VY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202340; c=relaxed/simple; bh=dbx1O5LRvizIxHVuFcCjz8LdYe/dQeErtAxlf0hID7k=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=flVQBppNXuyvFDHtaVgYVnsSgHz0tJUruW4PA/LTGNifKAgzFooxxL+Npzp2NRHGZN7Y4V4FS9pq+hfAA+0Oiwl4gWM+sl+PtEjPiWMwsKP6+yV1PwP1+VeWjEnLbj1gytkJSFOr32QcoAGJhlnMmrebWadGBuGI4PkU7iqg0NQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2t1Yxkz4f3mHR; Mon, 21 Apr 2025 10:25:10 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 8ACD81A1B81; Mon, 21 Apr 2025 10:25:35 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S14; Mon, 21 Apr 2025 10:25:35 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 10/11] block: add FALLOC_FL_WRITE_ZEROES support Date: Mon, 21 Apr 2025 10:15:08 +0800 Message-ID: <20250421021509.2366003-11-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S14 X-Coremail-Antispam: 1UD129KBjvJXoW7Zry7Wr4xCr18Cw13ury3Arb_yoW8Xry7pF WDXF15JFZYgas3GF1fCF4q9rn5Za1kJryrGa1xK3WF9wsxJF97Kr1DWF45ta4UWFy7Aw4k Zr9Igr98ua17ArDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0pRQJ5wUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi Add support for FALLOC_FL_WRITE_ZEROES. It directly calls blkdev_issue_zeroout() with flags set to 0. The underlying process will attempt to use the fastest method for issuing zeroes. First, the block layer will try to issue a write zeroes command if the storage device supports it; if not, it will fall back to issuing zeroed data. Then, the storage device driver may attempt to submit an unmap write zero command if the device supports it; if not, the driver may fall back to submitting a no-unmap write zeroes command. Signed-off-by: Zhang Yi --- block/fops.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/block/fops.c b/block/fops.c index 77a5465309e7..e590c8997689 100644 --- a/block/fops.c +++ b/block/fops.c @@ -803,7 +803,7 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to) #define BLKDEV_FALLOC_FL_SUPPORTED \ (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | \ - FALLOC_FL_ZERO_RANGE) + FALLOC_FL_ZERO_RANGE | FALLOC_FL_WRITE_ZEROES) static long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len) @@ -847,6 +847,9 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, case FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE: flags = BLKDEV_ZERO_NOFALLBACK; break; + case FALLOC_FL_WRITE_ZEROES: + flags = 0; + break; default: error = -EOPNOTSUPP; goto fail; From patchwork Mon Apr 21 02:15:09 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 885303 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5217525D218; Mon, 21 Apr 2025 02:25:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202344; cv=none; b=hvkS8lvGv3oxryMVVvzwKFX4lD61EU1jMzrsvWub0gGjQ77hmm/CbzZEicizTNo/gYuKQ3ZwRtpvfkURBlAGijiOIWJPfYj0H942Gd0/Aep0v/XCmP1Cjl01NPdrR+vWva1HAbspZ43+DkomFoobN2EdxAIbfQoZxKfjfPsQAdA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745202344; c=relaxed/simple; bh=9f2Eq2hpn0twdx0WmSd0JnnIWSra7GGCc/Vkdu8g6Ds=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=oYYJt5YQ8r5aK4LzVFc8huCrRH8+VvvsymUTtjGIkXtCoWqor1w0hTZX0/qIIFaI2XsEw79LIxpM7RonDZ3pXDNEINsDk17hSBnr2Sc1SyxXNE9oG2fudbclKGxFqqLMJyzX6siAfioIBl3ELSFrRGJfJfpjF3IL5f0bbj6PJyE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4Zgq2w3T4Yz4f3jY9; Mon, 21 Apr 2025 10:25:12 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 51DB91A06D7; Mon, 21 Apr 2025 10:25:36 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP4 (Coremail) with SMTP id gCh0CgA3m1+MrAVoFxZkKA--.3102S15; Mon, 21 Apr 2025 10:25:35 +0800 (CST) From: Zhang Yi To: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-block@vger.kernel.org, dm-devel@lists.linux.dev, linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de, tytso@mit.edu, djwong@kernel.org, john.g.garry@oracle.com, bmarzins@redhat.com, chaitanyak@nvidia.com, shinichiro.kawasaki@wdc.com, brauner@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [RFC PATCH v4 11/11] ext4: add FALLOC_FL_WRITE_ZEROES support Date: Mon, 21 Apr 2025 10:15:09 +0800 Message-ID: <20250421021509.2366003-12-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> References: <20250421021509.2366003-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: gCh0CgA3m1+MrAVoFxZkKA--.3102S15 X-Coremail-Antispam: 1UD129KBjvJXoW3Xry7Ar13Xw13XFy8XFW3GFg_yoW7GF4UpF Z8XF1rKFWIq3429r4fCw4kurn0q3WkKry5WrWSgry093yUJr1fKFn09Fy8uas0gFW8AF45 Xa1Y9ryDK3W7A37anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVW8ZVWrXwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_GFv_WrylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0pRQJ5wUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ From: Zhang Yi Add support for FALLOC_FL_WRITE_ZEROES. This first allocates blocks as unwritten, then issues a zero command outside of the running journal handle, and finally converts them to a written state. Signed-off-by: Zhang Yi --- fs/ext4/extents.c | 59 ++++++++++++++++++++++++++++++------- include/trace/events/ext4.h | 3 +- 2 files changed, 50 insertions(+), 12 deletions(-) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index c616a16a9f36..a147714403af 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4483,6 +4483,8 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, struct ext4_map_blocks map; unsigned int credits; loff_t epos, old_size = i_size_read(inode); + unsigned int blkbits = inode->i_blkbits; + bool alloc_zero = false; BUG_ON(!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)); map.m_lblk = offset; @@ -4495,6 +4497,17 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, if (len <= EXT_UNWRITTEN_MAX_LEN) flags |= EXT4_GET_BLOCKS_NO_NORMALIZE; + /* + * Do the actual write zero during a running journal transaction + * costs a lot. First allocate an unwritten extent and then + * convert it to written after zeroing it out. + */ + if (flags & EXT4_GET_BLOCKS_ZERO) { + flags &= ~EXT4_GET_BLOCKS_ZERO; + flags |= EXT4_GET_BLOCKS_UNWRIT_EXT; + alloc_zero = true; + } + /* * credits to insert 1 extent into extent tree */ @@ -4531,9 +4544,7 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, * allow a full retry cycle for any remaining allocations */ retries = 0; - map.m_lblk += ret; - map.m_len = len = len - ret; - epos = (loff_t)map.m_lblk << inode->i_blkbits; + epos = (loff_t)(map.m_lblk + ret) << blkbits; inode_set_ctime_current(inode); if (new_size) { if (epos > new_size) @@ -4553,6 +4564,21 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, ret2 = ret3 ? ret3 : ret2; if (unlikely(ret2)) break; + + if (alloc_zero && + (map.m_flags & (EXT4_MAP_MAPPED | EXT4_MAP_UNWRITTEN))) { + ret2 = ext4_issue_zeroout(inode, map.m_lblk, map.m_pblk, + map.m_len); + if (likely(!ret2)) + ret2 = ext4_convert_unwritten_extents(NULL, + inode, (loff_t)map.m_lblk << blkbits, + (loff_t)map.m_len << blkbits); + if (ret2) + break; + } + + map.m_lblk += ret; + map.m_len = len = len - ret; } if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) goto retry; @@ -4618,7 +4644,11 @@ static long ext4_zero_range(struct file *file, loff_t offset, if (end_lblk > start_lblk) { ext4_lblk_t zero_blks = end_lblk - start_lblk; - flags |= (EXT4_GET_BLOCKS_CONVERT_UNWRITTEN | EXT4_EX_NOCACHE); + if (mode & FALLOC_FL_WRITE_ZEROES) + flags = EXT4_GET_BLOCKS_CREATE_ZERO | EXT4_EX_NOCACHE; + else + flags |= (EXT4_GET_BLOCKS_CONVERT_UNWRITTEN | + EXT4_EX_NOCACHE); ret = ext4_alloc_file_blocks(file, start_lblk, zero_blks, new_size, flags); if (ret) @@ -4730,8 +4760,8 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) /* Return error if mode is not supported */ if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | - FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE | - FALLOC_FL_INSERT_RANGE)) + FALLOC_FL_ZERO_RANGE | FALLOC_FL_COLLAPSE_RANGE | + FALLOC_FL_INSERT_RANGE | FALLOC_FL_WRITE_ZEROES)) return -EOPNOTSUPP; inode_lock(inode); @@ -4762,16 +4792,23 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) if (ret) goto out_invalidate_lock; - if (mode & FALLOC_FL_PUNCH_HOLE) + switch (mode & FALLOC_FL_MODE_MASK) { + case FALLOC_FL_PUNCH_HOLE: ret = ext4_punch_hole(file, offset, len); - else if (mode & FALLOC_FL_COLLAPSE_RANGE) + break; + case FALLOC_FL_COLLAPSE_RANGE: ret = ext4_collapse_range(file, offset, len); - else if (mode & FALLOC_FL_INSERT_RANGE) + break; + case FALLOC_FL_INSERT_RANGE: ret = ext4_insert_range(file, offset, len); - else if (mode & FALLOC_FL_ZERO_RANGE) + break; + case FALLOC_FL_ZERO_RANGE: + case FALLOC_FL_WRITE_ZEROES: ret = ext4_zero_range(file, offset, len, mode); - else + break; + default: ret = -EOPNOTSUPP; + } out_invalidate_lock: filemap_invalidate_unlock(mapping); diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h index 156908641e68..6f9cf2811733 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -92,7 +92,8 @@ TRACE_DEFINE_ENUM(ES_REFERENCED_B); { FALLOC_FL_KEEP_SIZE, "KEEP_SIZE"}, \ { FALLOC_FL_PUNCH_HOLE, "PUNCH_HOLE"}, \ { FALLOC_FL_COLLAPSE_RANGE, "COLLAPSE_RANGE"}, \ - { FALLOC_FL_ZERO_RANGE, "ZERO_RANGE"}) + { FALLOC_FL_ZERO_RANGE, "ZERO_RANGE"}, \ + { FALLOC_FL_WRITE_ZEROES, "WRITE_ZEROES"}) TRACE_DEFINE_ENUM(EXT4_FC_REASON_XATTR); TRACE_DEFINE_ENUM(EXT4_FC_REASON_CROSS_RENAME);