Message ID: 20210827075045.642269-1-damien.lemoal@wdc.com
Series: Initial support for multi-actuator HDDs
On Fri, Aug 27, 2021 at 02:28:58PM +0000, Tim Walker wrote:
> There is nothing in the spec that requires the ranges to be contiguous
> or non-overlapping.

Yikes, that is a pretty stupid standard. Almost as bad as allowing
non-uniform sized non-power of two sized zones :)

> It's easy to imagine a HDD architecture that allows multiple heads to
> access the same sectors on the disk. It's also easy to imagine a
> workload scenario where parallel access to the same disk could be
> useful. (Think of a typical storage design that sequentially writes
> new user data gradually filling the disk, while simultaneously
> supporting random user reads over the written data.)

But for those drives you do not actually need this scheme at all.
Storage devices that support higher concurrency are bog standard with
SSDs and, going back further, storage arrays. The only interesting
case is when these ranges are separate so that the access can be carved
up based on the boundary. Now I don't want to give people ideas with
overlapping but not identical ranges, which would be just horrible.
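As an illustration of carving up access at the range boundary, here is a
minimal user-space sketch (hypothetical struct and function names, not
part of this series) that maps a sector to the actuator serving it,
assuming the reported ranges are sorted by start sector and
non-overlapping:

#include <stdint.h>
#include <stddef.h>

/* Hypothetical representation of one independent access range. */
struct access_range {
	uint64_t sector;	/* first sector of the range */
	uint64_t nr_sectors;	/* number of sectors in the range */
};

/*
 * Return the index of the range containing @sector, or -1 if the
 * ranges leave a hole. Assumes the ranges are sorted and do not
 * overlap, i.e. the "interesting" layout discussed above.
 */
static int sector_to_actuator(const struct access_range *ranges,
			      size_t nr_ranges, uint64_t sector)
{
	for (size_t i = 0; i < nr_ranges; i++) {
		if (sector >= ranges[i].sector &&
		    sector < ranges[i].sector + ranges[i].nr_sectors)
			return (int)i;
	}
	return -1;
}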
On 2021/08/28 2:38, Phillip Susi wrote:
>
> Tim Walker <tim.t.walker@seagate.com> writes:
>
>> The IO Scheduler is a useful place to implement per-actuator load
>> management, but with the LBA-to-actuator mapping available to user
>> space (via sysfs) it could also be done at the user level. Or pretty
>> much anywhere else where we have knowledge and control of the various
>> streams.
>
> I suppose there may be some things user space could do with the
> information, but mainly doesn't it have to be done in the IO scheduler?

Correct. If the user does not use a file system, then optimizations will
depend on the user application and the IO scheduler.

> As it stands now, it is going to try to avoid seeking between the two
> regions even though the drive can service a contiguous stream from both
> just fine, right?

Correct. But any IO scheduler optimization will kick in if and only if
the user is accessing the drive at a queue depth beyond the drive's max
QD (32 for SATA). If the drive is exercised at a QD lower than its
maximum, the scheduler does not hold on to requests (at least
mq-deadline does not; I am not sure about bfq). So even with only this
patch set (no optimizations at the kernel level), the user can still
make things work as expected, that is, get multiple streams of IOs to
execute in parallel.
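For reference, here is a minimal user-space sketch of reading the
per-range boundaries, assuming a sysfs layout of the form
/sys/block/<disk>/queue/independent_access_ranges/<n>/{sector,nr_sectors}
as proposed by this series (the helper name is made up; adjust the paths
to whatever the final version exposes):

#include <stdio.h>

/*
 * Read one attribute of an independent access range, with <attr>
 * being "sector" or "nr_sectors". Returns -1 on error.
 */
static long long read_range_attr(const char *disk, int range,
				 const char *attr)
{
	char path[256];
	long long val;
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/block/%s/queue/independent_access_ranges/%d/%s",
		 disk, range, attr);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%lld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

With the start sector and size of each range, an application can
partition its accesses per actuator and, as noted above, keep each
stream's queue depth below the drive maximum so that mq-deadline passes
the requests straight through and the streams execute in parallel.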
On 2021/08/28 1:43, Christoph Hellwig wrote:
> On Fri, Aug 27, 2021 at 02:28:58PM +0000, Tim Walker wrote:
>> There is nothing in the spec that requires the ranges to be contiguous
>> or non-overlapping.
>
> Yikes, that is a pretty stupid standard. Almost as bad as allowing
> non-uniform sized non-power of two sized zones :)
>
>> It's easy to imagine a HDD architecture that allows multiple heads to
>> access the same sectors on the disk. It's also easy to imagine a
>> workload scenario where parallel access to the same disk could be
>> useful. (Think of a typical storage design that sequentially writes
>> new user data gradually filling the disk, while simultaneously
>> supporting random user reads over the written data.)
>
> But for those drives you do not actually need this scheme at all.

Agree.

> Storage devices that support higher concurrency are bog standard with
> SSDs and, going back further, storage arrays. The only interesting
> case is when these ranges are separate so that the access can be carved
> up based on the boundary. Now I don't want to give people ideas with
> overlapping but not identical ranges, which would be just horrible.

Agree too. And looking at my patch again, the function
disk_check_iaranges() in patch 1 only checks that the overall sector
range of all access ranges spans from 0 to capacity - 1, but it does not
check for holes or overlaps. I need to change that and ignore any disk
that reports overlapping ranges or ranges with holes in the LBA space.
Holes would be horrible, and if ranges overlap, then the drive can
optimize by itself. Will resend a V7 with corrections for that.
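Roughly, the stricter check would look like the sketch below
(illustration only, with a made-up struct, not the actual
disk_check_iaranges() code): with the ranges sorted by start sector,
each one must begin exactly where the previous one ends, starting at
sector 0 and ending at the disk capacity; any hole or overlap
disqualifies the ranges.

#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

struct range {
	uint64_t sector;	/* first sector */
	uint64_t nr_sectors;	/* length in sectors */
};

/* @r must be sorted by start sector. */
static bool ranges_are_contiguous(const struct range *r, size_t nr,
				  uint64_t capacity)
{
	uint64_t expected = 0;

	for (size_t i = 0; i < nr; i++) {
		if (r[i].sector != expected)
			return false;	/* hole (>) or overlap (<) */
		expected += r[i].nr_sectors;
	}
	return expected == capacity;
}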