Message ID | 20210422222429.183108-2-mst@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [PULL,1/2] amd_iommu: Fix pte_override_page_mask() | expand |
On Thu, 22 Apr 2021 at 23:24, Michael S. Tsirkin <mst@redhat.com> wrote: > > From: Jean-Philippe Brucker <jean-philippe@linaro.org> > > AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page > size. Quoting the AMD IOMMU specification: "When the Next Level bits [of > a pte] are 7h, the size of the page is determined by the first zero bit > in the page address, starting from bit 12." > > So if the lowest bits of the page address is 0, the page is 8kB. If the > lowest bits are 011, the page is 32kB. Currently pte_override_page_mask() > doesn't compute the right value for this page size and amdvi_translate() > can return the wrong guest-physical address. With a Linux guest, DMA > from SATA devices accesses the wrong memory and causes probe failure: > > qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \ > -device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0 > [ 6.613093] ata1.00: qc timeout (cmd 0xec) > [ 6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4) > > Fix the page mask. > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org> > Reviewed-by: Michael S. Tsirkin <mst@redhat.com> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Jean-Philippe, do you know if this is a regression since 5.2? I'm guessing not given that the function in question has been that way since the amd_iommu was introduced in 2016. thanks -- PMM
On Fri, Apr 23, 2021 at 02:01:19PM +0100, Peter Maydell wrote: > On Thu, 22 Apr 2021 at 23:24, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > From: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page > > size. Quoting the AMD IOMMU specification: "When the Next Level bits [of > > a pte] are 7h, the size of the page is determined by the first zero bit > > in the page address, starting from bit 12." > > > > So if the lowest bits of the page address is 0, the page is 8kB. If the > > lowest bits are 011, the page is 32kB. Currently pte_override_page_mask() > > doesn't compute the right value for this page size and amdvi_translate() > > can return the wrong guest-physical address. With a Linux guest, DMA > > from SATA devices accesses the wrong memory and causes probe failure: > > > > qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \ > > -device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0 > > [ 6.613093] ata1.00: qc timeout (cmd 0xec) > > [ 6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4) > > > > Fix the page mask. > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org> > > Reviewed-by: Michael S. Tsirkin <mst@redhat.com> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> > > Jean-Philippe, do you know if this is a regression since 5.2? I don't think so, I can reproduce it with v5.2.0. > I'm guessing not given that the function in question has been that > way since the amd_iommu was introduced in 2016. There has been a lot of work on the AMD IOMMU driver in Linux recently. Maybe that exacerbated the problem but I can't find a relevant change. It's also possible that this path hasn't been exercised before - I just happened to run a SATA device under AMD IOMMU this week to debug an unrelated Linux issue. The other devices in the VM don't seem to have a problem doing DMA. Thanks, Jean
On Fri, 23 Apr 2021 at 14:35, Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > On Fri, Apr 23, 2021 at 02:01:19PM +0100, Peter Maydell wrote: > > On Thu, 22 Apr 2021 at 23:24, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > From: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > > > AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page > > > size. Quoting the AMD IOMMU specification: "When the Next Level bits [of > > > a pte] are 7h, the size of the page is determined by the first zero bit > > > in the page address, starting from bit 12." > > > > > > So if the lowest bits of the page address is 0, the page is 8kB. If the > > > lowest bits are 011, the page is 32kB. Currently pte_override_page_mask() > > > doesn't compute the right value for this page size and amdvi_translate() > > > can return the wrong guest-physical address. With a Linux guest, DMA > > > from SATA devices accesses the wrong memory and causes probe failure: > > > > > > qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \ > > > -device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0 > > > [ 6.613093] ata1.00: qc timeout (cmd 0xec) > > > [ 6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4) > > > > > > Fix the page mask. > > > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org> > > > Reviewed-by: Michael S. Tsirkin <mst@redhat.com> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> > > > > Jean-Philippe, do you know if this is a regression since 5.2? > > I don't think so, I can reproduce it with v5.2.0. OK, thanks; I think I favour not putting this into rc5, then. -- PMM
On Fri, Apr 23, 2021 at 05:11:33PM +0100, Peter Maydell wrote: > > > Jean-Philippe, do you know if this is a regression since 5.2? > > > > I don't think so, I can reproduce it with v5.2.0. > > OK, thanks; I think I favour not putting this into rc5, then. No problem, please let me know if I should resend after the next release Thanks, Jean
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c index 74a93a5d93..43b6e9bf51 100644 --- a/hw/i386/amd_iommu.c +++ b/hw/i386/amd_iommu.c @@ -860,8 +860,8 @@ static inline uint8_t get_pte_translation_mode(uint64_t pte) static inline uint64_t pte_override_page_mask(uint64_t pte) { - uint8_t page_mask = 12; - uint64_t addr = (pte & AMDVI_DEV_PT_ROOT_MASK) ^ AMDVI_DEV_PT_ROOT_MASK; + uint8_t page_mask = 13; + uint64_t addr = (pte & AMDVI_DEV_PT_ROOT_MASK) >> 12; /* find the first zero bit */ while (addr & 1) { page_mask++;