scsi: sd_zbc: Limit the report zones buffer size to UIO_MAXIOV

Message ID 20250411203600.84477-1-ssiwinski@atto.com
State New
Series scsi: sd_zbc: Limit the report zones buffer size to UIO_MAXIOV

Commit Message

Steve Siwinski April 11, 2025, 8:36 p.m. UTC
The report zones buffer size is currently limited by the HBA's
maximum segment count to ensure the buffer can be mapped. However,
the user-space SG_IO interface further limits the number of iovec
entries to UIO_MAXIOV when allocating a bio.

To avoid allocation of buffers too large to be mapped, further
restrict the maximum buffer size to UIO_MAXIOV * PAGE_SIZE.

This ensures that the buffer size complies with both kernel
and user-space constraints.

Signed-off-by: Steve Siwinski <ssiwinski@atto.com>
---
 drivers/scsi/sd_zbc.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Christoph Hellwig April 14, 2025, 5:52 a.m. UTC | #1
On Fri, Apr 11, 2025 at 04:36:00PM -0400, Steve Siwinski wrote:
> The report zones buffer size is currently limited by the HBA's
> maximum segment count to ensure the buffer can be mapped. However,
> the user-space SG_IO interface further limits the number of iovec
> entries to UIO_MAXIOV when allocating a bio.

Why does the userspace SG_IO interface matter here?
sd_zbc_alloc_report_buffer is only used for the in-kernel
->report_zones call.

> 
> To avoid allocation of buffers too large to be mapped, further
> restrict the maximum buffer size to UIO_MAXIOV * PAGE_SIZE.
> 
> This ensures that the buffer size complies with both kernel
> and user-space constraints.
> 
> Signed-off-by: Steve Siwinski <ssiwinski@atto.com>
> ---
>  drivers/scsi/sd_zbc.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> index 7a447ff600d2..a19e76ec8fb6 100644
> --- a/drivers/scsi/sd_zbc.c
> +++ b/drivers/scsi/sd_zbc.c
> @@ -180,12 +180,15 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
>  	 * Furthermore, since the report zone command cannot be split, make
>  	 * sure that the allocated buffer can always be mapped by limiting the
>  	 * number of pages allocated to the HBA max segments limit.
> +	 * Since max segments can be larger than the max sgio entries, further
> +	 * limit the allocated buffer to the UIO_MAXIOV.
>  	 */
>  	nr_zones = min(nr_zones, sdkp->zone_info.nr_zones);
>  	bufsize = roundup((nr_zones + 1) * 64, SECTOR_SIZE);
>  	bufsize = min_t(size_t, bufsize,
>  			queue_max_hw_sectors(q) << SECTOR_SHIFT);
>  	bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);
> +	bufsize = min_t(size_t, bufsize, UIO_MAXIOV * PAGE_SIZE);
>  
>  	while (bufsize >= SECTOR_SIZE) {
>  		buf = kvzalloc(bufsize, GFP_KERNEL | __GFP_NORETRY);
> -- 
> 2.43.5
> 
> 
---end quoted text---
Damien Le Moal April 18, 2025, 9:29 p.m. UTC | #2
On 4/19/25 05:46, SSiwinski@atto.com wrote:
> 
> "Christoph Hellwig" <hch@infradead.org> wrote on 04/14/2025 01:52:31 AM:
> 
>> On Fri, Apr 11, 2025 at 04:36:00PM -0400, Steve Siwinski wrote:
>>> The report zones buffer size is currently limited by the HBA's
>>> maximum segment count to ensure the buffer can be mapped. However,
>>> the user-space SG_IO interface further limits the number of iovec
>>> entries to UIO_MAXIOV when allocating a bio.
>>
>> Why does the userspace SG_IO interface matter here?
>> sd_zbc_alloc_report_buffer is only used for the in-kernel
>> ->report_zones call.
> 
> I was referring to the userspace SG_IO limitation (UIO_MAXIOV) in
> bio_kmalloc(), which gets called when the report zones command is
> executed and the buffer mapped in bio_map_kern().
> 
> Perhaps my wording here was poor and this is really a limitation of bio?

sd_zbc_alloc_report_buffer() is called only from sd_zbc_report_zones(), which is
the disk ->report_zones() operation, which is NOT called for passthrough
commands. So modifying sd_zbc_alloc_report_buffer() will not help in any way in
solving your issue with an SG_IO passthrough report zones command issued by the
user.

For reference, libzbc uses ioctl(SG_GET_SG_TABLESIZE) * sysconf(_SC_PAGESIZE) as
the max buffer size and allocates page aligned buffers to avoid these SG_IO
buffer mapping limitations.
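
As an illustration of the sizing rule described above, here is a minimal user-space sketch, not libzbc's actual code. The helper names (sg_max_bufsize, sg_query_max_bufsize, sg_alloc_buf) are hypothetical, and fd is assumed to refer to a device node that supports the SG_GET_SG_TABLESIZE ioctl (for example an sg character device).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <scsi/sg.h>

/* Cap = max scatter-gather entries * page size; 0 on invalid input. */
size_t sg_max_bufsize(int tablesize, long pagesize)
{
	if (tablesize <= 0 || pagesize <= 0)
		return 0;
	return (size_t)tablesize * (size_t)pagesize;
}

/* Query the cap for an open fd; returns 0 if the ioctl fails. */
size_t sg_query_max_bufsize(int fd)
{
	int tablesize = 0;

	if (ioctl(fd, SG_GET_SG_TABLESIZE, &tablesize) < 0)
		return 0;
	return sg_max_bufsize(tablesize, sysconf(_SC_PAGESIZE));
}

/* Allocate a zeroed, page-aligned buffer of min(want, cap) bytes. */
void *sg_alloc_buf(size_t want, size_t cap, size_t *got)
{
	void *buf = NULL;
	size_t len = want < cap ? want : cap;

	if (len == 0 || posix_memalign(&buf, (size_t)sysconf(_SC_PAGESIZE), len))
		return NULL;
	memset(buf, 0, len);
	*got = len;
	return buf;
}
```

A caller would typically pass the result of sg_query_max_bufsize() as cap and issue several smaller REPORT ZONES commands if the wanted size exceeds it.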
Siwinski, Steve April 24, 2025, 3:33 p.m. UTC | #3
On 4/18/2025 5:29 PM, Damien Le Moal wrote:
> On 4/19/25 05:46, SSiwinski@atto.com wrote:
>>
>> "Christoph Hellwig" <hch@infradead.org> wrote on 04/14/2025 01:52:31 AM:
>>
>>> On Fri, Apr 11, 2025 at 04:36:00PM -0400, Steve Siwinski wrote:
>>>> The report zones buffer size is currently limited by the HBA's
>>>> maximum segment count to ensure the buffer can be mapped. However,
>>>> the user-space SG_IO interface further limits the number of iovec
>>>> entries to UIO_MAXIOV when allocating a bio.
>>>
>>> Why does the userspace SG_IO interface matter here?
>>> sd_zbc_alloc_report_buffer is only used for the in-kernel
>>> ->report_zones call.
>>
>> I was referring to the userspace SG_IO limitation (UIO_MAXIOV) in
>> bio_kmalloc(), which gets called when the report zones command is
>> executed and the buffer mapped in bio_map_kern().
>>
>> Perhaps my wording here was poor and this is really a limitation of bio?
> 
> sd_zbc_alloc_report_buffer() is called only from sd_zbc_report_zones() which is
> the disk ->report_zones() operations, which is NOT called for passthrough
> commands. So modifying sd_zbc_alloc_report_buffer() will not help in any way
> solving your issue with an SG_IO passthrough report zones command issued by the
> user.
> 
> For reference, libzbc uses ioctl(SG_GET_SG_TABLESIZE) * sysconf(_SC_PAGESIZE) as
> the max buffer size and allocates page aligned buffers to avoid these SG_IO
> buffer mapping limitations.
> 

My issue is not with passthrough report zones.

The report zones command is failing on driver load and causing the drive 
to fail to appear as a block device. If queue_max_segments is set to a 
value over 1024, then nr_vecs in bio_alloc() will be greater than 
UIO_MAXIOV and bio_alloc() will return NULL.

This causes the error.
```
sd 8:0:0:0: [sdb] REPORT ZONES start lba 0 failed
sd 8:0:0:0: [sdb] REPORT ZONES: Result: hostbyte=0xff driverbyte=DRIVER_OK
sdb: failed to revalidate zones
```

You can reproduce this by setting the max_sgl_entries parameter to 2k or 
greater in the mpt3sas driver. Other drivers can also reproduce this 
behavior.
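
The arithmetic behind this failure can be modeled in stand-alone C. This is only a sketch under stated assumptions: 4 KiB pages, a page-aligned buffer mapped one page per segment, and max_hw_sectors not imposing a lower cap. The MODEL_* constants and function names are hypothetical; only the value 1024 for UIO_MAXIOV comes from the kernel's uapi header.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE  4096UL /* assumption: 4 KiB pages */
#define MODEL_UIO_MAXIOV 1024UL /* value of UIO_MAXIOV in <linux/uio.h> */

/* Page-sized segments needed to map a page-aligned buffer of len bytes. */
size_t model_nr_vecs(size_t len)
{
	return (len + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* Models the limit discussed here: more than UIO_MAXIOV vecs is refused. */
bool model_bio_alloc_ok(size_t len)
{
	return model_nr_vecs(len) <= MODEL_UIO_MAXIOV;
}
```

With max segments at 2048, the report buffer can reach 2048 * 4096 bytes (8 MiB), which needs 2048 vecs and is refused; a 1024 * 4096 byte (4 MiB) buffer still fits.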


Damien Le Moal April 25, 2025, 1:42 a.m. UTC | #4
On 4/25/25 00:33, Siwinski, Steve wrote:
> My issue is not with passthrough report zones.
> 
> The report zones command is failing on driver load and causing the drive 
> to fail to appear as a block device. If queue_max_segments is set to a 
> value over 1024, then nr_vecs in bio_alloc() will be greater than 
> UIO_MAXIOV and bio_alloc() will return NULL.

OK... A reminder about the path:

sd_zbc_do_report_zones() -> scsi_execute_cmd() -> blk_rq_map_kern() ->
bio_map_kern() -> bio_kmalloc()

and the fact that bio_kmalloc() does not allow more than UIO_MAXIOV segments
would have made things clear from the beginning. I had to look it up again to
understand why UIO_MAXIOV matters.

> This causes the error.
> ```
> sd 8:0:0:0: [sdb] REPORT ZONES start lba 0 failed
> sd 8:0:0:0: [sdb] REPORT ZONES: Result: hostbyte=0xff driverbyte=DRIVER_OK
> sdb: failed to revalidate zones
> ```
> 
> You can reproduce this by setting the max_sgl_entries parameter to 2k or 
> greater in the mpt3sas driver. Other drivers can also reproduce this 
> behavior.

Well, I think that the problem you uncovered here is a lot more fundamental than
just ZBC report zones. If the drive has a queue_max_segments() value larger than
UIO_MAXIOV, any attempt to map a large buffer for any command (e.g. a read) will
also fail. So this limit inconsistency seems wrong...

Christoph? Since you were touching the vmalloc-ed BIO mapping code, do you have
any idea about this? The quick and dirty fix would be to do:

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 7a447ff600d2..3cb897b25878 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -169,6 +169,7 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
                                        unsigned int nr_zones, size_t *buflen)
 {
        struct request_queue *q = sdkp->disk->queue;
+       size_t max_segs;
        size_t bufsize;
        void *buf;

@@ -185,7 +186,8 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
        bufsize = roundup((nr_zones + 1) * 64, SECTOR_SIZE);
        bufsize = min_t(size_t, bufsize,
                        queue_max_hw_sectors(q) << SECTOR_SHIFT);
-       bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);
+       max_segs = min(queue_max_segments(q), UIO_MAXIOV);
+       bufsize = min_t(size_t, bufsize, max_segs << PAGE_SHIFT);

        while (bufsize >= SECTOR_SIZE) {
                buf = kvzalloc(bufsize, GFP_KERNEL | __GFP_NORETRY);


But that feels wrong...
Christoph Hellwig April 30, 2025, 2:06 p.m. UTC | #5
On Fri, Apr 25, 2025 at 10:42:46AM +0900, Damien Le Moal wrote:
> and the fact that bio_kmalloc() does not allow more than UIO_MAXIOV segments
> would have made things clear from the beginning. I had to look it up again to
> understand why UIO_MAXIOV matters.

We shouldn't really allow mapping an unlimited number of segments.
So the original patch looks mostly fine, except that we should
really replace that UIO_MAXIOV symbolic name with a more descriptive
one in the block layer, keeping the same value for it.
Patch

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 7a447ff600d2..a19e76ec8fb6 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -180,12 +180,15 @@  static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
 	 * Furthermore, since the report zone command cannot be split, make
 	 * sure that the allocated buffer can always be mapped by limiting the
 	 * number of pages allocated to the HBA max segments limit.
+	 * Since max segments can be larger than the max sgio entries, further
+	 * limit the allocated buffer to the UIO_MAXIOV.
 	 */
 	nr_zones = min(nr_zones, sdkp->zone_info.nr_zones);
 	bufsize = roundup((nr_zones + 1) * 64, SECTOR_SIZE);
 	bufsize = min_t(size_t, bufsize,
 			queue_max_hw_sectors(q) << SECTOR_SHIFT);
 	bufsize = min_t(size_t, bufsize, queue_max_segments(q) << PAGE_SHIFT);
+	bufsize = min_t(size_t, bufsize, UIO_MAXIOV * PAGE_SIZE);
 
 	while (bufsize >= SECTOR_SIZE) {
 		buf = kvzalloc(bufsize, GFP_KERNEL | __GFP_NORETRY);
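
For a feel of what the added clamp does to the numbers, the sequence of limits can be replayed in user-space C. This is a simplified model, not kernel code: the MODEL_* constants and model_report_bufsize() are hypothetical names, 4 KiB pages are assumed, and the earlier nr_zones clamp against the disk's zone count is left out.

```c
#include <stddef.h>

#define MODEL_SECTOR_SIZE 512UL
#define MODEL_PAGE_SIZE   4096UL /* assumption: 4 KiB pages */
#define MODEL_UIO_MAXIOV  1024UL /* value of UIO_MAXIOV in <linux/uio.h> */

static size_t min_sz(size_t a, size_t b)
{
	return a < b ? a : b;
}

/* Replays the bufsize computation with the extra UIO_MAXIOV clamp. */
size_t model_report_bufsize(size_t nr_zones, size_t max_hw_sectors,
			    size_t max_segments)
{
	/* 64-byte header plus 64 bytes per zone, rounded up to a sector. */
	size_t bufsize = (nr_zones + 1) * 64;

	bufsize = (bufsize + MODEL_SECTOR_SIZE - 1) / MODEL_SECTOR_SIZE *
		  MODEL_SECTOR_SIZE;
	bufsize = min_sz(bufsize, max_hw_sectors * MODEL_SECTOR_SIZE);
	bufsize = min_sz(bufsize, max_segments * MODEL_PAGE_SIZE);
	/* The clamp added by this patch. */
	bufsize = min_sz(bufsize, MODEL_UIO_MAXIOV * MODEL_PAGE_SIZE);
	return bufsize;
}
```

For example, with 200000 zones, max_hw_sectors of 65536 and max segments of 2048, the first three limits leave 8388608 bytes (8 MiB) and the last clamp reduces that to 4194304 bytes (4 MiB), which maps to exactly 1024 page-sized segments.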