Message ID | 1c4d66ca-fe1a-3d1a-d7f9-4981d2fc9eb1@huaweicloud.com |
---|---|
State | New |
Headers | show |
Series | Why is MEGASAS_SAS_QD set to 256? | expand |
On 24/11/2022 03:45, Yu Kuai wrote: > Hi, > > While upgrading kernel from 4.19 to 5.10, I found that fio 1 thread 4k > sequential io performance is dropped(160Mib -> 100 Mib), root cause is > that queue_depth is changed from 64 to 256. > > commit 6e73550670ed1c07779706bb6cf61b99c871fc42 > scsi: megaraid_sas: Update optimal queue depth for SAS and NVMe devices > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h > b/drivers/scsi/megaraid/megaraid_sas.h > index bd8184072bed..ddfbe6f6667a 100644 > --- a/drivers/scsi/megaraid/megaraid_sas.h > +++ b/drivers/scsi/megaraid/megaraid_sas.h > @@ -2233,9 +2233,9 @@ enum MR_PD_TYPE { > > /* JBOD Queue depth definitions */ > #define MEGASAS_SATA_QD 32 > -#define MEGASAS_SAS_QD 64 > +#define MEGASAS_SAS_QD 256 > #define MEGASAS_DEFAULT_PD_QD 64 > -#define MEGASAS_NVME_QD 32 > +#define MEGASAS_NVME_QD 64 > > > And with the default nr_requests 256, 256 queue_depth will make the > elevator has no effect, specifically io can't be merged in this test > case. Hence it doesn't make sense to me to set default queue_depth to > 256. > > Is there any reason why MEGASAS_SAS_QD is changed to 64? > > Thanks, > Kuai > Which type of drive do you use? JFYI, in case missed, there was this discussion on SCSI queue depth a while ago: https://lore.kernel.org/linux-scsi/4b50f067-a368-2197-c331-a8c981f5cd02@huawei.com/ Thanks, John
Hi, 在 2022/11/25 20:33, John Garry 写道: > On 24/11/2022 03:45, Yu Kuai wrote: >> Hi, >> >> While upgrading kernel from 4.19 to 5.10, I found that fio 1 thread 4k >> sequential io performance is dropped(160Mib -> 100 Mib), root cause is >> that queue_depth is changed from 64 to 256. >> >> commit 6e73550670ed1c07779706bb6cf61b99c871fc42 >> scsi: megaraid_sas: Update optimal queue depth for SAS and NVMe devices >> >> diff --git a/drivers/scsi/megaraid/megaraid_sas.h >> b/drivers/scsi/megaraid/megaraid_sas.h >> index bd8184072bed..ddfbe6f6667a 100644 >> --- a/drivers/scsi/megaraid/megaraid_sas.h >> +++ b/drivers/scsi/megaraid/megaraid_sas.h >> @@ -2233,9 +2233,9 @@ enum MR_PD_TYPE { >> >> /* JBOD Queue depth definitions */ >> #define MEGASAS_SATA_QD 32 >> -#define MEGASAS_SAS_QD 64 >> +#define MEGASAS_SAS_QD 256 >> #define MEGASAS_DEFAULT_PD_QD 64 >> -#define MEGASAS_NVME_QD 32 >> +#define MEGASAS_NVME_QD 64 >> >> >> And with the default nr_requests 256, 256 queue_depth will make the >> elevator has no effect, specifically io can't be merged in this test >> case. Hence it doesn't make sense to me to set default queue_depth to >> 256. >> >> Is there any reason why MEGASAS_SAS_QD is changed to 64? >> >> Thanks, >> Kuai >> > > Which type of drive do you use? SAS SSDs BTW, I also test with nvme as well, the default elevator is deadline and queue_depth seems too small, and performance is far from optimal. Current default values don't seem good to me... 😒 Thanks, Kuai > > JFYI, in case missed, there was this discussion on SCSI queue depth a > while ago: > https://lore.kernel.org/linux-scsi/4b50f067-a368-2197-c331-a8c981f5cd02@huawei.com/ > > > Thanks, > John > > > . >
On Sat, Nov 26, 2022 at 09:15:53AM +0800, Yu Kuai wrote: > Hi, > > 在 2022/11/25 20:33, John Garry 写道: > > On 24/11/2022 03:45, Yu Kuai wrote: > > > Hi, > > > > > > While upgrading kernel from 4.19 to 5.10, I found that fio 1 thread 4k > > > sequential io performance is dropped(160Mib -> 100 Mib), root cause is > > > that queue_depth is changed from 64 to 256. > > > > > > commit 6e73550670ed1c07779706bb6cf61b99c871fc42 > > > scsi: megaraid_sas: Update optimal queue depth for SAS and NVMe devices > > > > > > diff --git a/drivers/scsi/megaraid/megaraid_sas.h > > > b/drivers/scsi/megaraid/megaraid_sas.h > > > index bd8184072bed..ddfbe6f6667a 100644 > > > --- a/drivers/scsi/megaraid/megaraid_sas.h > > > +++ b/drivers/scsi/megaraid/megaraid_sas.h > > > @@ -2233,9 +2233,9 @@ enum MR_PD_TYPE { > > > > > > /* JBOD Queue depth definitions */ > > > #define MEGASAS_SATA_QD 32 > > > -#define MEGASAS_SAS_QD 64 > > > +#define MEGASAS_SAS_QD 256 > > > #define MEGASAS_DEFAULT_PD_QD 64 > > > -#define MEGASAS_NVME_QD 32 > > > +#define MEGASAS_NVME_QD 64 > > > > > > > > > And with the default nr_requests 256, 256 queue_depth will make the > > > elevator has no effect, specifically io can't be merged in this test > > > case. Hence it doesn't make sense to me to set default queue_depth to > > > 256. > > > > > > Is there any reason why MEGASAS_SAS_QD is changed to 64? > > > > > > Thanks, > > > Kuai > > > > > > > Which type of drive do you use? > > SAS SSDs > > BTW, I also test with nvme as well, the default elevator is deadline and > queue_depth seems too small, and performance is far from optimal. > > Current default values don't seem good to me... 😒 If you want aggressive merge on sequential IO workload, the queue depth need to be a bit less, then more requests can be staggered into scheduler queue, and merge chance is increased. If you want good perf on random IO perf, the queue depth needs to be deep enough to have enough parallelism for saturating SSD internal. But we don't recognize sequential/random IO pattern, and usually fixed queue depth is used. Thanks, Ming
Hi, Ming 在 2022/11/26 10:18, Ming Lei 写道: > > If you want aggressive merge on sequential IO workload, the queue depth need > to be a bit less, then more requests can be staggered into scheduler queue, > and merge chance is increased. But if nr_requests >= queue_depth, it seems to me elevator will have no effect, no request can be merged or sorted by scheduler, right? > > If you want good perf on random IO perf, the queue depth needs to > be deep enough to have enough parallelism for saturating SSD internal. > > But we don't recognize sequential/random IO pattern, and usually fixed > queue depth is used. Is it possible to use none elevator and set large queue_depth if nvme is used in this case? Thansk, Kuai > > Thanks, > Ming > > . >
On Sat, Nov 26, 2022 at 02:08:02PM +0800, Yu Kuai wrote: > Hi, Ming > > 在 2022/11/26 10:18, Ming Lei 写道: > > > > If you want aggressive merge on sequential IO workload, the queue depth need > > to be a bit less, then more requests can be staggered into scheduler queue, > > and merge chance is increased. > > But if nr_requests >= queue_depth, it seems to me elevator will have no > effect, no request can be merged or sorted by scheduler, right? Yeah. If nr_requests <= queue_depth, every request can be queued to driver/device, so requests won't be merged by scheduler. But plug merge still works if IOs are submitted as batch. > > > > If you want good perf on random IO perf, the queue depth needs to > > be deep enough to have enough parallelism for saturating SSD internal. > > > > But we don't recognize sequential/random IO pattern, and usually fixed > > queue depth is used. > > Is it possible to use none elevator and set large queue_depth if nvme is > used in this case? Yeah, if the storage is SSD, usually none with bigger queue_depth should help, and usually 256 should be enough to saturate one single SSD for one well implemented driver. Thanks Ming
Hi, 在 2022/11/27 17:42, Ming Lei 写道: > On Sat, Nov 26, 2022 at 02:08:02PM +0800, Yu Kuai wrote: >> Hi, Ming >> >> 在 2022/11/26 10:18, Ming Lei 写道: >>> >>> If you want aggressive merge on sequential IO workload, the queue depth need >>> to be a bit less, then more requests can be staggered into scheduler queue, >>> and merge chance is increased. >> >> But if nr_requests >= queue_depth, it seems to me elevator will have no >> effect, no request can be merged or sorted by scheduler, right? > > Yeah. > > If nr_requests <= queue_depth, every request can be queued to > driver/device, so requests won't be merged by scheduler. > > But plug merge still works if IOs are submitted as batch. Yes, io can still be merged by plug. I just find it a little werid to set default elevator as deadline, and default queue_depth to 256. Which means deadline here is useless. > >>> >>> If you want good perf on random IO perf, the queue depth needs to >>> be deep enough to have enough parallelism for saturating SSD internal. >>> >>> But we don't recognize sequential/random IO pattern, and usually fixed >>> queue depth is used. >> >> Is it possible to use none elevator and set large queue_depth if nvme is >> used in this case? > > Yeah, if the storage is SSD, usually none with bigger queue_depth should > help, and usually 256 should be enough to saturate one single SSD for > one well implemented driver. Yes, I'm testing with multiple SSDs / NVMEs, and I can't get optimal performance by default. Thanks, Kuai > > > Thanks > Ming > > . >
diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index bd8184072bed..ddfbe6f6667a 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -2233,9 +2233,9 @@ enum MR_PD_TYPE { /* JBOD Queue depth definitions */ #define MEGASAS_SATA_QD 32 -#define MEGASAS_SAS_QD 64 +#define MEGASAS_SAS_QD 256 #define MEGASAS_DEFAULT_PD_QD 64 -#define MEGASAS_NVME_QD 32 +#define MEGASAS_NVME_QD 64 And with the default nr_requests 256, 256 queue_depth will make the