diff mbox series

scsi: ufs: Increase the maximum data buffer size

Message ID 20220720170323.1599006-1-bvanassche@acm.org
State Superseded
Headers show
Series scsi: ufs: Increase the maximum data buffer size | expand

Commit Message

Bart Van Assche July 20, 2022, 5:03 p.m. UTC
Measurements have shown that for some UFS devices the maximum sequential
I/O throughput is achieved with a transfer size above 512 KiB. Hence
increase the maximum size of the data buffer associated with a single
request from SCSI_DEFAULT_MAX_SECTORS (1024) * 512 bytes = 512 KiB into
1 GiB.

Note: the maximum data buffer size supported by the UFSHCI specification
is 65535 * 256 MiB or about 16 TiB.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/ufs/core/ufshcd.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Avri Altman July 22, 2022, 8:30 a.m. UTC | #1
> Measurements have shown that for some UFS devices the maximum
> sequential
> I/O throughput is achieved with a transfer size above 512 KiB. Hence
> increase the maximum size of the data buffer associated with a single
> request from SCSI_DEFAULT_MAX_SECTORS (1024) * 512 bytes = 512 KiB
> into
> 1 GiB.
Did you choose 1GB to align with BLK_DEF_MAX_SECTORS?
If so why not just use it?
Can you share those performance measurements?
For some reason, I always thought that SR performance is saturated somewhere around 1MB.

Thanks,
Avri 
> 
> Note: the maximum data buffer size supported by the UFSHCI specification
> is 65535 * 256 MiB or about 16 TiB.
Avri Altman July 22, 2022, 10:19 a.m. UTC | #2
> 
> > Measurements have shown that for some UFS devices the maximum
> > sequential
> > I/O throughput is achieved with a transfer size above 512 KiB. Hence
> > increase the maximum size of the data buffer associated with a single
> > request from SCSI_DEFAULT_MAX_SECTORS (1024) * 512 bytes = 512 KiB
> > into
> > 1 GiB.
> Did you choose 1GB to align with BLK_DEF_MAX_SECTORS?
> If so why not just use it?
> Can you share those performance measurements?
> For some reason, I always thought that SR performance is saturated
> somewhere around 1MB.
1GB seems excessive, so I tried to look it up in the UFS spec.
There can be at most 255 RTTs per command upiu (bDeviceRTTCap is a single byte - in UFS4.0 as well),
Each RTT correspond to a data-out/data-in UPIU which can hold up to 64KB.
So at most 255x64KB, Or did I get it wrong?

> > Note: the maximum data buffer size supported by the UFSHCI specification
> > is 65535 * 256 MiB or about 16 TiB.
Can you help me find this limit in UFSHCI?


Thanks,
Avri
Bart Van Assche July 22, 2022, 5:25 p.m. UTC | #3
On 7/22/22 01:30, Avri Altman wrote:
>> Measurements have shown that for some UFS devices the maximum
>> sequential
>> I/O throughput is achieved with a transfer size above 512 KiB. Hence
>> increase the maximum size of the data buffer associated with a single
>> request from SCSI_DEFAULT_MAX_SECTORS (1024) * 512 bytes = 512 KiB
>> into
>> 1 GiB.
 >
> Did you choose 1GB to align with BLK_DEF_MAX_SECTORS?

No particular reason.

> Can you share those performance measurements?
> For some reason, I always thought that SR performance is saturated somewhere around 1MB.

That's also what I see on my test setup (I only tried one UFS brand - 
results may differ for other UFS device brands):

$ i=12; while ((i<=30)); do ./fio --rw=read --ioengine=psync --direct=1 
--ioscheduler=none --size=100% --time_based=1 --runtime=30 
--filename=/dev/block/sda --name=ufs --gtod_reduce=1 --bs=$((1<<i)); 
((i++)); done 2>&1 | grep read:
   read: IOPS=3714, BW=14.5MiB/s (15.2MB/s)(435MiB/30017msec)
   read: IOPS=2659, BW=20.8MiB/s (21.8MB/s)(623MiB/30003msec)
   read: IOPS=2488, BW=38.9MiB/s (40.8MB/s)(1167MiB/30016msec)
   read: IOPS=2102, BW=65.7MiB/s (68.9MB/s)(1972MiB/30006msec)
   read: IOPS=1635, BW=102MiB/s (107MB/s)(3068MiB/30019msec)
   read: IOPS=1630, BW=204MiB/s (214MB/s)(6120MiB/30035msec)
   read: IOPS=1228, BW=307MiB/s (322MB/s)(9232MiB/30061msec)
   read: IOPS=752, BW=376MiB/s (395MB/s)(11.0GiB/30008msec)
   read: IOPS=472, BW=473MiB/s (496MB/s)(13.9GiB/30043msec)
   read: IOPS=107, BW=216MiB/s (226MB/s)(6524MiB/30249msec)
   read: IOPS=66, BW=267MiB/s (280MB/s)(8184MiB/30666msec)
   read: IOPS=38, BW=305MiB/s (319MB/s)(9200MiB/30210msec)
   read: IOPS=18, BW=292MiB/s (306MB/s)(9184MiB/31454msec)
   read: IOPS=9, BW=302MiB/s (316MB/s)(9.94GiB/33731msec)
   read: IOPS=5, BW=326MiB/s (342MB/s)(11.9GiB/37278msec)
   read: IOPS=2, BW=277MiB/s (290MB/s)(15.8GiB/58277msec)
   read: IOPS=1, BW=302MiB/s (316MB/s)(15.5GiB/52626msec)
   read: IOPS=0, BW=284MiB/s (298MB/s)(31.0GiB/111736msec)

Bart.
Bart Van Assche July 22, 2022, 5:33 p.m. UTC | #4
On 7/22/22 03:19, Avri Altman wrote:
>>> Note: the maximum data buffer size supported by the UFSHCI specification
>>> is 65535 * 256 MiB or about 16 TiB.
 >
> Can you help me find this limit in UFSHCI?

 From the UFSHCI 3.0 specification:
* PRDT length is a sixteen bit field so the maximum value is 65535 
(entries).
* The maximum length of a single descriptor is 256 KiB. See also the DBC 
(Data Byte Count) field.

So the maximum amount of data that can be transferred at once is 65535 * 
256 KiB or about 16 GiB (and not what I wrote in my previous message).

Thanks,

Bart.
Avri Altman July 24, 2022, 7:27 a.m. UTC | #5
> 
> On 7/22/22 03:19, Avri Altman wrote:
> >>> Note: the maximum data buffer size supported by the UFSHCI
> >>> specification is 65535 * 256 MiB or about 16 TiB.
>  >
> > Can you help me find this limit in UFSHCI?
> 
>  From the UFSHCI 3.0 specification:
> * PRDT length is a sixteen bit field so the maximum value is 65535 (entries).
> * The maximum length of a single descriptor is 256 KiB. See also the DBC (Data
> Byte Count) field.
> 
> So the maximum amount of data that can be transferred at once is 65535 *
> 256 KiB or about 16 GiB (and not what I wrote in my previous message).
The transfer-length of rw10 is 2 bytes, and I am not even sure that rw16 support is mandatory.
Maybe it would be more practical, instead of 16GB to use 255MiB instead?

Thanks,
Avri
diff mbox series

Patch

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 36b7212e9cb5..ea7e8f329d0f 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8365,6 +8365,7 @@  static struct scsi_host_template ufshcd_driver_template = {
 	.cmd_per_lun		= UFSHCD_CMD_PER_LUN,
 	.can_queue		= UFSHCD_CAN_QUEUE,
 	.max_segment_size	= PRDT_DATA_BYTE_COUNT_MAX,
+	.max_sectors		= (1 << 30) / SECTOR_SIZE, /* 1 GiB */
 	.max_host_blocked	= 1,
 	.track_queue_depth	= 1,
 	.sdev_groups		= ufshcd_driver_groups,