Message ID | 20230709202859.138387-1-michael.christie@oracle.com |
---|---|
Headers | show |
Series | vhost-scsi: Fix IO hangs when using windows | expand |
On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote: > The following patches were made over Linus's tree and fix an issue > where windows guests will send iovecs with offset/lengths that result > in IOs that are not aligned to 512. The LIO layer will then send them > to Linux's FS/block layer but it requires 512 byte alignment, so > depending on the FS/block driver being used we will get IO errors or > hung IO. > > The following patches have vhost-scsi detect when windows sends these > IOs and copy them to a bounce buffer. It then does some cleanup in > the related code. Thanks, tagged! Mike, you are the main developer on vhost/scsi recently. Do you want to be listed in MAINTAINERS too? This implies you will be expected to review patches/bug reports though. Thanks,
On 7/10/23 12:52 AM, Michael S. Tsirkin wrote: > On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote: >> The following patches were made over Linus's tree and fix an issue >> where windows guests will send iovecs with offset/lengths that result >> in IOs that are not aligned to 512. The LIO layer will then send them >> to Linux's FS/block layer but it requires 512 byte alignment, so >> depending on the FS/block driver being used we will get IO errors or >> hung IO. >> >> The following patches have vhost-scsi detect when windows sends these >> IOs and copy them to a bounce buffer. It then does some cleanup in >> the related code. > > > Thanks, tagged! > Mike, you are the main developer on vhost/scsi recently. > Do you want to be listed in MAINTAINERS too? > This implies you will be expected to review patches/bug reports > though. That sounds good. Thanks.
On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote: > The following patches were made over Linus's tree and fix an issue > where windows guests will send iovecs with offset/lengths that result > in IOs that are not aligned to 512. The LIO layer will then send them > to Linux's FS/block layer but it requires 512 byte alignment, so > depending on the FS/block driver being used we will get IO errors or > hung IO. > > The following patches have vhost-scsi detect when windows sends these > IOs and copy them to a bounce buffer. It then does some cleanup in > the related code. Hang on, virtio-scsi is a SCSI HBA and READs/WRITEs submitted must follow the usual constraints on SCSI block limits. Would Windows send mis-aligned I/O to a non-virtio-scsi SCSI HBA? Are you sure this is not a bug in the Windows guest driver where block limits are being misconfigured? Stefan
On 7/11/23 1:34 PM, Stefan Hajnoczi wrote: > On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote: >> The following patches were made over Linus's tree and fix an issue >> where windows guests will send iovecs with offset/lengths that result >> in IOs that are not aligned to 512. The LIO layer will then send them >> to Linux's FS/block layer but it requires 512 byte alignment, so >> depending on the FS/block driver being used we will get IO errors or >> hung IO. >> >> The following patches have vhost-scsi detect when windows sends these >> IOs and copy them to a bounce buffer. It then does some cleanup in >> the related code. > > Hang on, virtio-scsi is a SCSI HBA and READs/WRITEs submitted must > follow the usual constraints on SCSI block limits. Would Windows send > mis-aligned I/O to a non-virtio-scsi SCSI HBA? It's like linux where you can config settings like that. > > Are you sure this is not a bug in the Windows guest driver where block > limits are being misconfigured?
On Tue, Jul 11, 2023 at 04:01:22PM -0500, Mike Christie wrote: > On 7/11/23 1:34 PM, Stefan Hajnoczi wrote: > > On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote: > >> The following patches were made over Linus's tree and fix an issue > >> where windows guests will send iovecs with offset/lengths that result > >> in IOs that are not aligned to 512. The LIO layer will then send them > >> to Linux's FS/block layer but it requires 512 byte alignment, so > >> depending on the FS/block driver being used we will get IO errors or > >> hung IO. > >> > >> The following patches have vhost-scsi detect when windows sends these > >> IOs and copy them to a bounce buffer. It then does some cleanup in > >> the related code. > > > > Hang on, virtio-scsi is a SCSI HBA and READs/WRITEs submitted must > > follow the usual constraints on SCSI block limits. Would Windows send > > mis-aligned I/O to a non-virtio-scsi SCSI HBA? > > It's like linux where you can config settings like that. > > > > Are you sure this is not a bug in the Windows guest driver where block > > limits are being misconfigured? > > From what our windows dev told us the guest drivers like here: > > https://github.com/virtio-win > > don't set the windows AlignmentMask to 512. They tried that and it > resulted in windows crash dump crashing because it doesn't like the > hard alignment requirement. > > We thought other apps would have trouble as well, so we tried to add > bounce buffer support to the windows driver, but I think people thought > it was going to be uglier than this patch and in the normal alignment > case might also affect performance. There was some windows driver/layering > and buffer/cmd details that I don't fully understand and took their word > for because I don't know a lot about windows. > > In the end we still have to add checks to vhost-scsi to protect against > bad drivers, so we thought we might as well just add bounce buffer support > to vhost-scsi. CCing virtio-win developers so they can confirm how the vioscsi driver is supposed to handle request alignment. My expectation is that the virtio-scsi device will fail mis-aligned I/O requests. Stefan
On 7/12/23 9:26 AM, Stefan Hajnoczi wrote: > On Tue, Jul 11, 2023 at 04:01:22PM -0500, Mike Christie wrote: >> On 7/11/23 1:34 PM, Stefan Hajnoczi wrote: >>> On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote: >>>> The following patches were made over Linus's tree and fix an issue >>>> where windows guests will send iovecs with offset/lengths that result >>>> in IOs that are not aligned to 512. The LIO layer will then send them >>>> to Linux's FS/block layer but it requires 512 byte alignment, so >>>> depending on the FS/block driver being used we will get IO errors or >>>> hung IO. >>>> >>>> The following patches have vhost-scsi detect when windows sends these >>>> IOs and copy them to a bounce buffer. It then does some cleanup in >>>> the related code. >>> >>> Hang on, virtio-scsi is a SCSI HBA and READs/WRITEs submitted must >>> follow the usual constraints on SCSI block limits. Would Windows send >>> mis-aligned I/O to a non-virtio-scsi SCSI HBA? >> >> It's like linux where you can config settings like that. >> >>>> Are you sure this is not a bug in the Windows guest driver where block >>> limits are being misconfigured? >> >> From what our windows dev told us the guest drivers like here: >> >> https://github.com/virtio-win >> >> don't set the windows AlignmentMask to 512. They tried that and it >> resulted in windows crash dump crashing because it doesn't like the >> hard alignment requirement. >> >> We thought other apps would have trouble as well, so we tried to add >> bounce buffer support to the windows driver, but I think people thought >> it was going to be uglier than this patch and in the normal alignment >> case might also affect performance. There was some windows driver/layering >> and buffer/cmd details that I don't fully understand and took their word >> for because I don't know a lot about windows. >> >> In the end we still have to add checks to vhost-scsi to protect against >> bad drivers, so we thought we might as well just add bounce buffer support >> to vhost-scsi. > > CCing virtio-win developers so they can confirm how the vioscsi driver > is supposed to handle request alignment. > > My expectation is that the virtio-scsi device will fail mis-aligned I/O > requests. I don't think you can just change the driver's behavior to fail now, because apps send mis-aligned IO and its working as long as they have less than 256 bio vecs. We see mis-aligned IOs during boot and also from random non window's apps. If we just start to fail then it would be a regression when the app no longer works or the OS fails to start up.
On Wed, Jul 12, 2023 at 11:05:11AM -0500, Mike Christie wrote: > On 7/12/23 9:26 AM, Stefan Hajnoczi wrote: > > On Tue, Jul 11, 2023 at 04:01:22PM -0500, Mike Christie wrote: > >> On 7/11/23 1:34 PM, Stefan Hajnoczi wrote: > >>> On Sun, Jul 09, 2023 at 03:28:57PM -0500, Mike Christie wrote: > >>>> The following patches were made over Linus's tree and fix an issue > >>>> where windows guests will send iovecs with offset/lengths that result > >>>> in IOs that are not aligned to 512. The LIO layer will then send them > >>>> to Linux's FS/block layer but it requires 512 byte alignment, so > >>>> depending on the FS/block driver being used we will get IO errors or > >>>> hung IO. > >>>> > >>>> The following patches have vhost-scsi detect when windows sends these > >>>> IOs and copy them to a bounce buffer. It then does some cleanup in > >>>> the related code. > >>> > >>> Hang on, virtio-scsi is a SCSI HBA and READs/WRITEs submitted must > >>> follow the usual constraints on SCSI block limits. Would Windows send > >>> mis-aligned I/O to a non-virtio-scsi SCSI HBA? > >> > >> It's like linux where you can config settings like that. > >> > >>>> Are you sure this is not a bug in the Windows guest driver where block > >>> limits are being misconfigured? > >> > >> From what our windows dev told us the guest drivers like here: > >> > >> https://github.com/virtio-win > >> > >> don't set the windows AlignmentMask to 512. They tried that and it > >> resulted in windows crash dump crashing because it doesn't like the > >> hard alignment requirement. > >> > >> We thought other apps would have trouble as well, so we tried to add > >> bounce buffer support to the windows driver, but I think people thought > >> it was going to be uglier than this patch and in the normal alignment > >> case might also affect performance. There was some windows driver/layering > >> and buffer/cmd details that I don't fully understand and took their word > >> for because I don't know a lot about windows. > >> > >> In the end we still have to add checks to vhost-scsi to protect against > >> bad drivers, so we thought we might as well just add bounce buffer support > >> to vhost-scsi. > > > > CCing virtio-win developers so they can confirm how the vioscsi driver > > is supposed to handle request alignment. > > > > My expectation is that the virtio-scsi device will fail mis-aligned I/O > > requests. > > I don't think you can just change the driver's behavior to fail now, > because apps send mis-aligned IO and its working as long as they have less > than 256 bio vecs. > > We see mis-aligned IOs during boot and also from random non window's apps. > If we just start to fail then it would be a regression when the app no > longer works or the OS fails to start up. I was wrong: The virtio-scsi specification contains no alignment requirements for I/O buffers. It is fine for the driver to submit iovecs with any memory alignment. The QEMU code allocates a bounce buffer if the iovecs submitted by the driver do not match the minimum alignment requirements on the host (e.g. O_DIRECT requirements). It makes sense that vhost_scsi needs to use a bounce buffer in cases where the underlying storage has stricter memory alignment requirements. Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
On Thu, Jul 13, 2023 at 03:55:45PM +1000, Vadim Rozenfeld wrote: > Currently we use 4-byte alignmed (FILE_LONG_ALIGNMENT) in both Windows > virtio blk and scsi miniport drivers. > It shouldn't be a problem to change it to 512 by setting AlignmentMask > field of PORT_CONFIGURATION_INFORMATION structure > ( > https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/storport/ns-storport-_port_configuration_information > ) > to FILE_512_BYTE_ALIGNMENT. > I don't see any problem with changing the alignment parameter in our > drivers. But it will take us some time to test it properly. Hi Vadim, After looking at this again, I confused memory address/size alignment with request offset/length alignment. The virtio-scsi device does not have memory alignment constraints, so FILE_LONG_ALIGNMENT is okay and there's no need to change it. Thanks, Stefan