Message ID | 20200827114917.1851111-1-ppandit@redhat.com |
---|---|
State | New |
Headers | show |
Series | pci: check bus pointer before dereference | expand |
+-- On Thu, 27 Aug 2020, P J P wrote --+ | While mapping IRQ level in pci_change_irq_level() routine, | it does not check if pci_get_bus() returned a valid pointer. | It may lead to a NULL pointer dereference issue. Add check to | avoid it. | | -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 | ==1183858==Hint: address points to the zero page. | #0 pci_change_irq_level hw/pci/pci.c:259 | #1 pci_irq_handler hw/pci/pci.c:1445 | #2 pci_set_irq hw/pci/pci.c:1463 | #3 lsi_set_irq hw/scsi/lsi53c895a.c:488 | #4 lsi_update_irq hw/scsi/lsi53c895a.c:523 | #5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554 | #6 lsi_execute_script hw/scsi/lsi53c895a.c:1149 | #7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984 | #8 lsi_io_write hw/scsi/lsi53c895a.c:2146 | ... | | Reported-by: Ruhr-University <bugs-syssec@rub.de> | Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> | --- | hw/pci/pci.c | 3 +++ | 1 file changed, 3 insertions(+) | | diff --git a/hw/pci/pci.c b/hw/pci/pci.c | index de0fae10ab..df5a2c3294 100644 | --- a/hw/pci/pci.c | +++ b/hw/pci/pci.c | @@ -253,6 +253,9 @@ static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change) | PCIBus *bus; | for (;;) { | bus = pci_get_bus(pci_dev); | + if (!bus) { | + return; | + } | irq_num = bus->map_irq(pci_dev, irq_num); | if (bus->set_irq) | break; | Ping...! -- Prasad J Pandit / Red Hat Product Security Team 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
P J P <ppandit@redhat.com> 于2020年8月27日周四 下午7:52写道: > > From: Prasad J Pandit <pjp@fedoraproject.org> > > While mapping IRQ level in pci_change_irq_level() routine, > it does not check if pci_get_bus() returned a valid pointer. > It may lead to a NULL pointer dereference issue. Add check to > avoid it. > > -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 > ==1183858==Hint: address points to the zero page. > #0 pci_change_irq_level hw/pci/pci.c:259 > #1 pci_irq_handler hw/pci/pci.c:1445 > #2 pci_set_irq hw/pci/pci.c:1463 > #3 lsi_set_irq hw/scsi/lsi53c895a.c:488 > #4 lsi_update_irq hw/scsi/lsi53c895a.c:523 > #5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554 > #6 lsi_execute_script hw/scsi/lsi53c895a.c:1149 > #7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984 > #8 lsi_io_write hw/scsi/lsi53c895a.c:2146 > ... > > Reported-by: Ruhr-University <bugs-syssec@rub.de> > Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> > --- > hw/pci/pci.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index de0fae10ab..df5a2c3294 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -253,6 +253,9 @@ static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change) > PCIBus *bus; > for (;;) { > bus = pci_get_bus(pci_dev); > + if (!bus) { Hi Prasad, I think in normal this 'bus' will be not NULL. I have look at the link in the commit msg. I find it is another DMA to MMIO issue which we have discussed a lot but didn't come up with an satisfying solution. Maybe we can try to the DMA to MMIO issue direction. CC: Peter, Jason and Alex Thanks, Li Qiang > + return; > + } > irq_num = bus->map_irq(pci_dev, irq_num); > if (bus->set_irq) > break; > -- > 2.26.2 > >
+Igor On 9/15/20 3:51 PM, Li Qiang wrote: > P J P <ppandit@redhat.com> 于2020年8月27日周四 下午7:52写道: >> >> From: Prasad J Pandit <pjp@fedoraproject.org> >> >> While mapping IRQ level in pci_change_irq_level() routine, >> it does not check if pci_get_bus() returned a valid pointer. >> It may lead to a NULL pointer dereference issue. Add check to >> avoid it. >> >> -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 >> ==1183858==Hint: address points to the zero page. >> #0 pci_change_irq_level hw/pci/pci.c:259 >> #1 pci_irq_handler hw/pci/pci.c:1445 >> #2 pci_set_irq hw/pci/pci.c:1463 >> #3 lsi_set_irq hw/scsi/lsi53c895a.c:488 >> #4 lsi_update_irq hw/scsi/lsi53c895a.c:523 >> #5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554 >> #6 lsi_execute_script hw/scsi/lsi53c895a.c:1149 >> #7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984 >> #8 lsi_io_write hw/scsi/lsi53c895a.c:2146 >> ... >> >> Reported-by: Ruhr-University <bugs-syssec@rub.de> >> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> >> --- >> hw/pci/pci.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >> index de0fae10ab..df5a2c3294 100644 >> --- a/hw/pci/pci.c >> +++ b/hw/pci/pci.c >> @@ -253,6 +253,9 @@ static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change) >> PCIBus *bus; >> for (;;) { >> bus = pci_get_bus(pci_dev); >> + if (!bus) { > > Hi Prasad, > > I think in normal this 'bus' will be not NULL. > I have look at the link in the commit msg. > I find it is another DMA to MMIO issue which we have discussed a lot > but didn't come up with an > satisfying solution. > > Maybe we can try to the DMA to MMIO issue direction. > CC: Peter, Jason and Alex > > Thanks, > Li Qiang > > >> + return; Nack, this should be an abort(). As usual, question is how we got here. As Li said, it is another DMA to MMIO bug class. lsi_execute_script -> address_space_write -> acpi_pcihp_eject_slot -> bus_remove_child So at this point the PCI device is still MMIO-mapped but eject from the bus... ??? Then IRQ is triggered, which the device wants to propagate via its PCI bus but it doesn't have any more and b00m. If a device is hotpluggable, who is responsible to unmap its regions? >> + } >> irq_num = bus->map_irq(pci_dev, irq_num); >> if (bus->set_irq) >> break; >> -- >> 2.26.2 >> >> >
+-- On Tue, 15 Sep 2020, Philippe Mathieu-Daudé wrote --+ | > I think in normal this 'bus' will be not NULL. I have look at the link in | > the commit msg. I find it is another DMA to MMIO issue which we have | > discussed a lot but didn't come up with an satisfying solution. If 'bus' is unlikely to be NULL, should this be a regular non-CVE bug? | As usual, question is how we got here. | As Li said, it is another DMA to MMIO bug class. | | lsi_execute_script | -> address_space_write | -> acpi_pcihp_eject_slot | -> bus_remove_child | | So at this point the PCI device is still MMIO-mapped but eject from the | bus... ??? Then IRQ is triggered, which the device wants to propagate via | its PCI bus but it doesn't have any more and b00m. | | If a device is hotpluggable, who is responsible to unmap its regions? Not sure, I guess I'll leave it for the upstream maintainers to device a better solution. | Nack, this should be an abort(). === diff --git a/hw/pci/pci.c b/hw/pci/pci.c index de0fae10ab..0ccb991410 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -253,6 +253,7 @@ static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change) PCIBus *bus; for (;;) { bus = pci_get_bus(pci_dev); + assert(bus); irq_num = bus->map_irq(pci_dev, irq_num); if (bus->set_irq) break; === This should be okay for now? Thank you. -- Prasad J Pandit / Red Hat Product Security Team 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
On Wed, 16 Sep 2020 at 07:28, P J P <ppandit@redhat.com> wrote: > === > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index de0fae10ab..0ccb991410 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -253,6 +253,7 @@ static void pci_change_irq_level(PCIDevice *pci_dev, int > irq_num, int change) > PCIBus *bus; > for (;;) { > bus = pci_get_bus(pci_dev); > + assert(bus); > irq_num = bus->map_irq(pci_dev, irq_num); > if (bus->set_irq) > break; > === > > This should be okay for now? Generally we don't bother to assert() that pointers that shouldn't be NULL really are NULL immediately before dereferencing them, because the dereference provides an equally easy-to-debug crash to the assert, and so the assert doesn't provide anything extra. assert()ing that a pointer is non-NULL is more useful if it is done in a place that identifies the problem at an earlier and easier-to-debug point in execution rather than at a later point which is distantly removed from the place where the bogus pointer was introduced. thanks -- PMM
Hello, +-- On Wed, 16 Sep 2020, Peter Maydell wrote --+ | On Wed, 16 Sep 2020 at 07:28, P J P <ppandit@redhat.com> wrote: | > -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 | > ==1183858==Hint: address points to the zero page. | > #0 pci_change_irq_level hw/pci/pci.c:259 | > #1 pci_irq_handler hw/pci/pci.c:1445 | > #2 pci_set_irq hw/pci/pci.c:1463 | > #3 lsi_set_irq hw/scsi/lsi53c895a.c:488 | > #4 lsi_update_irq hw/scsi/lsi53c895a.c:523 | > #5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554 | > #6 lsi_execute_script hw/scsi/lsi53c895a.c:1149 | > #7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984 | > #8 lsi_io_write hw/scsi/lsi53c895a.c:2146 ... | Generally we don't bother to assert() that pointers that shouldn't be NULL | really are NULL immediately before dereferencing them, because the | dereference provides an equally easy-to-debug crash to the assert, and so | the assert doesn't provide anything extra. assert()ing that a pointer is | non-NULL is more useful if it is done in a place that identifies the problem | at an earlier and easier-to-debug point in execution rather than at a later | point which is distantly removed from the place where the bogus pointer was | introduced. * The NULL dereference above occurs because the 'pci_dev->qdev->parent_bus' address gets overwritten (with 0x0) during scsi 'Memory Move' operation in ../hw/scsi/lsi53c895a.c #define LSI_BUF_SIZE 4096 lsi_mmio_write lsi_reg_writeb lsi_execute_script static void lsi_memcpy(LSIState *s, ... int count=12MB) { int n; uint8_t buf[LSI_BUF_SIZE]; while (count) { n = (count > LSI_BUF_SIZE) ? LSI_BUF_SIZE : count; lsi_mem_read(s, src, buf, n); <== read from DMA memory lsi_mem_write(s, dest, buf, n); <== write to I/O memory src += n; dest += n; count -= n; } } -> https://www.manualslib.com/manual/1407578/Lsi-Lsi53c895a.html?page=254#manual * Above loop moves data between DMA memory to i/o address space. * Going through the manual above, it seems 'Memory Move' can move upto 16MB of data between memory spaces. * I tried to see a suitable fix, but couldn't get one. - Limiting 'count' value does not seem right, as allowed value is upto 16MB. - Manual above talks about moving data via 'dma_buf'. But it doesn't seem to be used here. * During above loop, 'dest' address moves past its 'MemoryRegion mr' and overwrites the adjacent 'mr' memory area, overwritting 'parent_bus' value. Any thoughts/hints please...? Thank you. -- Prasad J Pandit / Red Hat Product Security Team 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
[+Paolo, +Fam Zheng - for scsi] +-- On Mon, 28 Sep 2020, P J P wrote --+ | +-- On Wed, 16 Sep 2020, Peter Maydell wrote --+ | | On Wed, 16 Sep 2020 at 07:28, P J P <ppandit@redhat.com> wrote: | | > -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 | | > ==1183858==Hint: address points to the zero page. | | > #0 pci_change_irq_level hw/pci/pci.c:259 | | > #1 pci_irq_handler hw/pci/pci.c:1445 | | > #2 pci_set_irq hw/pci/pci.c:1463 | | > #3 lsi_set_irq hw/scsi/lsi53c895a.c:488 | | > #4 lsi_update_irq hw/scsi/lsi53c895a.c:523 | | > #5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554 | | > #6 lsi_execute_script hw/scsi/lsi53c895a.c:1149 | | > #7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984 | | > #8 lsi_io_write hw/scsi/lsi53c895a.c:2146 | ... | | Generally we don't bother to assert() that pointers that shouldn't be NULL | | really are NULL immediately before dereferencing them, because the | | dereference provides an equally easy-to-debug crash to the assert, and so | | the assert doesn't provide anything extra. assert()ing that a pointer is | | non-NULL is more useful if it is done in a place that identifies the problem | | at an earlier and easier-to-debug point in execution rather than at a later | | point which is distantly removed from the place where the bogus pointer was | | introduced. | | * The NULL dereference above occurs because the 'pci_dev->qdev->parent_bus' | address gets overwritten (with 0x0) during scsi 'Memory Move' operation in | | ../hw/scsi/lsi53c895a.c | #define LSI_BUF_SIZE 4096 | | lsi_mmio_write | lsi_reg_writeb | lsi_execute_script | static void lsi_memcpy(LSIState *s, ... int count=12MB) | { | int n; | uint8_t buf[LSI_BUF_SIZE]; | | while (count) { | n = (count > LSI_BUF_SIZE) ? LSI_BUF_SIZE : count; | lsi_mem_read(s, src, buf, n); <== read from DMA memory | lsi_mem_write(s, dest, buf, n); <== write to I/O memory | src += n; | dest += n; | count -= n; | } | } | -> https://www.manualslib.com/manual/1407578/Lsi-Lsi53c895a.html?page=254#manual | | * Above loop moves data between DMA memory to i/o address space. | | * Going through the manual above, it seems 'Memory Move' can move upto 16MB of | data between memory spaces. | | * I tried to see a suitable fix, but couldn't get one. | | - Limiting 'count' value does not seem right, as allowed value is upto 16MB. | | - Manual above talks about moving data via 'dma_buf'. But it doesn't seem to | be used here. | | * During above loop, 'dest' address moves past its 'MemoryRegion mr' and | overwrites the adjacent 'mr' memory area, overwritting 'parent_bus' value. | | Any thoughts/hints please...? @Paolo, @Fam...wdyt? Thank you. -- Prasad J Pandit / Red Hat Product Security Team 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
On Wed, 30 Sep 2020 10:32:42 +0530 (IST) P J P <ppandit@redhat.com> wrote: > [+Paolo, +Fam Zheng - for scsi] > > +-- On Mon, 28 Sep 2020, P J P wrote --+ > | +-- On Wed, 16 Sep 2020, Peter Maydell wrote --+ > | | On Wed, 16 Sep 2020 at 07:28, P J P <ppandit@redhat.com> wrote: > | | > -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 > | | > ==1183858==Hint: address points to the zero page. > | | > #0 pci_change_irq_level hw/pci/pci.c:259 > | | > #1 pci_irq_handler hw/pci/pci.c:1445 > | | > #2 pci_set_irq hw/pci/pci.c:1463 > | | > #3 lsi_set_irq hw/scsi/lsi53c895a.c:488 > | | > #4 lsi_update_irq hw/scsi/lsi53c895a.c:523 > | | > #5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554 > | | > #6 lsi_execute_script hw/scsi/lsi53c895a.c:1149 > | | > #7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984 > | | > #8 lsi_io_write hw/scsi/lsi53c895a.c:2146 > | ... > | | Generally we don't bother to assert() that pointers that shouldn't be NULL > | | really are NULL immediately before dereferencing them, because the > | | dereference provides an equally easy-to-debug crash to the assert, and so > | | the assert doesn't provide anything extra. assert()ing that a pointer is > | | non-NULL is more useful if it is done in a place that identifies the problem > | | at an earlier and easier-to-debug point in execution rather than at a later > | | point which is distantly removed from the place where the bogus pointer was > | | introduced. > | > | * The NULL dereference above occurs because the 'pci_dev->qdev->parent_bus' > | address gets overwritten (with 0x0) during scsi 'Memory Move' operation in > | > | ../hw/scsi/lsi53c895a.c > | #define LSI_BUF_SIZE 4096 > | > | lsi_mmio_write > | lsi_reg_writeb > | lsi_execute_script > | static void lsi_memcpy(LSIState *s, ... int count=12MB) > | { > | int n; > | uint8_t buf[LSI_BUF_SIZE]; > | > | while (count) { > | n = (count > LSI_BUF_SIZE) ? LSI_BUF_SIZE : count; > | lsi_mem_read(s, src, buf, n); <== read from DMA memory > | lsi_mem_write(s, dest, buf, n); <== write to I/O memory > | src += n; > | dest += n; > | count -= n; > | } > | } > | -> https://www.manualslib.com/manual/1407578/Lsi-Lsi53c895a.html?page=254#manual > | > | * Above loop moves data between DMA memory to i/o address space. > | > | * Going through the manual above, it seems 'Memory Move' can move upto 16MB of > | data between memory spaces. > | > | * I tried to see a suitable fix, but couldn't get one. > | > | - Limiting 'count' value does not seem right, as allowed value is upto 16MB. > | > | - Manual above talks about moving data via 'dma_buf'. But it doesn't seem to > | be used here. > | > | * During above loop, 'dest' address moves past its 'MemoryRegion mr' and > | overwrites the adjacent 'mr' memory area, overwritting 'parent_bus' value. > | > | Any thoughts/hints please...? 'dest' is offset into MemoryRegion, so far I don't see how it could break into QEMU stack. Do you have a simple reproducer? > @Paolo, @Fam...wdyt? > > Thank you. > -- > Prasad J Pandit / Red Hat Product Security Team > 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
+-- On Wed, 30 Sep 2020, Igor Mammedov wrote --+ | 'dest' is offset into MemoryRegion, so far I don't see how it could break | into QEMU stack. Do you have a simple reproducer? Please see: -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 Thank you. -- Prasad J Pandit / Red Hat Product Security Team 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
On Wed, Sep 30, 2020 at 10:32:42AM +0530, P J P wrote: > > [+Paolo, +Fam Zheng - for scsi] > > +-- On Mon, 28 Sep 2020, P J P wrote --+ > | +-- On Wed, 16 Sep 2020, Peter Maydell wrote --+ > | | On Wed, 16 Sep 2020 at 07:28, P J P <ppandit@redhat.com> wrote: > | | > -> https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 > | | > ==1183858==Hint: address points to the zero page. > | | > #0 pci_change_irq_level hw/pci/pci.c:259 > | | > #1 pci_irq_handler hw/pci/pci.c:1445 > | | > #2 pci_set_irq hw/pci/pci.c:1463 > | | > #3 lsi_set_irq hw/scsi/lsi53c895a.c:488 > | | > #4 lsi_update_irq hw/scsi/lsi53c895a.c:523 > | | > #5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554 > | | > #6 lsi_execute_script hw/scsi/lsi53c895a.c:1149 > | | > #7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984 > | | > #8 lsi_io_write hw/scsi/lsi53c895a.c:2146 > | ... > | | Generally we don't bother to assert() that pointers that shouldn't be NULL > | | really are NULL immediately before dereferencing them, because the > | | dereference provides an equally easy-to-debug crash to the assert, and so > | | the assert doesn't provide anything extra. assert()ing that a pointer is > | | non-NULL is more useful if it is done in a place that identifies the problem > | | at an earlier and easier-to-debug point in execution rather than at a later > | | point which is distantly removed from the place where the bogus pointer was > | | introduced. > | > | * The NULL dereference above occurs because the 'pci_dev->qdev->parent_bus' > | address gets overwritten (with 0x0) during scsi 'Memory Move' operation in > | > | ../hw/scsi/lsi53c895a.c > | #define LSI_BUF_SIZE 4096 > | > | lsi_mmio_write > | lsi_reg_writeb > | lsi_execute_script > | static void lsi_memcpy(LSIState *s, ... int count=12MB) > | { > | int n; > | uint8_t buf[LSI_BUF_SIZE]; > | > | while (count) { > | n = (count > LSI_BUF_SIZE) ? LSI_BUF_SIZE : count; > | lsi_mem_read(s, src, buf, n); <== read from DMA memory > | lsi_mem_write(s, dest, buf, n); <== write to I/O memory > | src += n; > | dest += n; > | count -= n; > | } > | } > | -> https://www.manualslib.com/manual/1407578/Lsi-Lsi53c895a.html?page=254#manual > | > | * Above loop moves data between DMA memory to i/o address space. > | > | * Going through the manual above, it seems 'Memory Move' can move upto 16MB of > | data between memory spaces. > | > | * I tried to see a suitable fix, but couldn't get one. > | > | - Limiting 'count' value does not seem right, as allowed value is upto 16MB. > | > | - Manual above talks about moving data via 'dma_buf'. But it doesn't seem to > | be used here. > | > | * During above loop, 'dest' address moves past its 'MemoryRegion mr' and > | overwrites the adjacent 'mr' memory area, overwritting 'parent_bus' value. > | > | Any thoughts/hints please...? > > @Paolo, @Fam...wdyt? > > Thank you. Guys are we going anywhere with this patch? > -- > Prasad J Pandit / Red Hat Product Security Team > 8685 545E B54C 486B C6EB 271E E285 8B5A F050 DE8D
On Fri, 2020-10-30 at 05:01 -0400, Michael S. Tsirkin wrote: > On Wed, Sep 30, 2020 at 10:32:42AM +0530, P J P wrote: > > > > [+Paolo, +Fam Zheng - for scsi] > > > > +-- On Mon, 28 Sep 2020, P J P wrote --+ > > > +-- On Wed, 16 Sep 2020, Peter Maydell wrote --+ > > > > On Wed, 16 Sep 2020 at 07:28, P J P <ppandit@redhat.com> wrote: > > > > > -> > > > > > https://ruhr-uni-bochum.sciebo.de/s/NNWP2GfwzYKeKwE?path=%2Flsi_nullptr1 > > > > > ==1183858==Hint: address points to the zero page. > > > > > #0 pci_change_irq_level hw/pci/pci.c:259 > > > > > #1 pci_irq_handler hw/pci/pci.c:1445 > > > > > #2 pci_set_irq hw/pci/pci.c:1463 > > > > > #3 lsi_set_irq hw/scsi/lsi53c895a.c:488 > > > > > #4 lsi_update_irq hw/scsi/lsi53c895a.c:523 > > > > > #5 lsi_script_scsi_interrupt hw/scsi/lsi53c895a.c:554 > > > > > #6 lsi_execute_script hw/scsi/lsi53c895a.c:1149 > > > > > #7 lsi_reg_writeb hw/scsi/lsi53c895a.c:1984 > > > > > #8 lsi_io_write hw/scsi/lsi53c895a.c:2146 > > > > > > ... > > > > Generally we don't bother to assert() that pointers that > > > > shouldn't be NULL > > > > really are NULL immediately before dereferencing them, because > > > > the > > > > dereference provides an equally easy-to-debug crash to the > > > > assert, and so > > > > the assert doesn't provide anything extra. assert()ing that a > > > > pointer is > > > > non-NULL is more useful if it is done in a place that > > > > identifies the problem > > > > at an earlier and easier-to-debug point in execution rather > > > > than at a later > > > > point which is distantly removed from the place where the bogus > > > > pointer was > > > > introduced. > > > > > > * The NULL dereference above occurs because the 'pci_dev->qdev- > > > >parent_bus' > > > address gets overwritten (with 0x0) during scsi 'Memory Move' > > > operation in > > > > > > ../hw/scsi/lsi53c895a.c > > > #define LSI_BUF_SIZE 4096 > > > > > > lsi_mmio_write > > > lsi_reg_writeb > > > lsi_execute_script > > > static void lsi_memcpy(LSIState *s, ... int count=12MB) > > > { > > > int n; > > > uint8_t buf[LSI_BUF_SIZE]; > > > > > > while (count) { > > > n = (count > LSI_BUF_SIZE) ? LSI_BUF_SIZE : count; > > > lsi_mem_read(s, src, buf, n); <== read from DMA > > > memory > > > lsi_mem_write(s, dest, buf, n); <== write to I/O > > > memory > > > src += n; > > > dest += n; > > > count -= n; > > > } > > > } > > > -> > > > https://www.manualslib.com/manual/1407578/Lsi-Lsi53c895a.html?page=254#manual > > > > > > * Above loop moves data between DMA memory to i/o address space. > > > > > > * Going through the manual above, it seems 'Memory Move' can move > > > upto 16MB of > > > data between memory spaces. > > > > > > * I tried to see a suitable fix, but couldn't get one. > > > > > > - Limiting 'count' value does not seem right, as allowed value > > > is upto 16MB. > > > > > > - Manual above talks about moving data via 'dma_buf'. But it > > > doesn't seem to > > > be used here. > > > > > > * During above loop, 'dest' address moves past its 'MemoryRegion > > > mr' and > > > overwrites the adjacent 'mr' memory area, overwritting > > > 'parent_bus' value. I agree with Igor, I don't understand how pci_dma_rw writing into the next mr can cause the NULL pointer. parent_bus is in the QEMU heap, but mr is backed by different ram areas, protected with boundary check. Is there a backtrace at the corruption time? Fam
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index de0fae10ab..df5a2c3294 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -253,6 +253,9 @@ static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change) PCIBus *bus; for (;;) { bus = pci_get_bus(pci_dev); + if (!bus) { + return; + } irq_num = bus->map_irq(pci_dev, irq_num); if (bus->set_irq) break;