Message ID | 20250521164250.135776-13-alex.bennee@linaro.org |
---|---|
State | New |
Headers | show |
Series | Maintainer updates for May (testing, plugins, virtio-gpu) - pre-PR | expand |
On 2025/05/22 1:42, Alex Bennée wrote: > From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> > > This commit fixes an indefinite hang when using VIRTIO GPU blob objects > under TCG in certain conditions. > > The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a > MemoryRegion and attaches it to an offset on a PCI BAR of the > VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps > it. > > Because virglrenderer commands are not thread-safe they are only > called on the main context and QEMU performs the cleanup in three steps > to prevent a use-after-free scenario where the guest can access the > region after it’s unmapped: > > 1. From the main context, the region’s field finish_unmapping is false > by default, so it sets a variable cmd_suspended, increases the > renderer_blocked variable, deletes the blob subregion, and unparents > the blob subregion causing its reference count to decrement. > > 2. From an RCU context, the MemoryView gets freed, the FlatView gets > recalculated, the free callback of the blob region > virtio_gpu_virgl_hostmem_region_free is called which sets the > region’s field finish_unmapping to true, allowing the main thread > context to finish replying to the command > > 3. From the main context, the command is processed again, but this time > finish_unmapping is true, so virgl_renderer_resource_unmap can be > called and a response is sent to the guest. > > It happens so that under TCG, if the guest has no timers configured (and > thus no interrupt will cause the CPU to exit), the RCU thread does not > have enough time to grab the locks and recalculate the FlatView. > > That’s not a big problem in practice since most guests will assume a > response will happen later in time and go on to do different things, > potentially triggering interrupts and allowing the RCU context to run. > If the guest waits for the unmap command to complete though, it blocks > indefinitely. Attaching to the QEMU monitor and force quitting the guest > allows the cleanup to continue. > > There's no reason why the FlatView recalculation can't occur right away > when we delete the blob subregion, however. It does not, because when we > create the subregion we set the object as its own parent: > > memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data); > > The extra reference is what prevents freeing the memory region object in > the memory transaction of deleting the subregion. > > This commit changes the owner object to the device, which removes the > extra owner reference in the memory region and causes the MR to be > freed right away in the main context. > > Acked-by: Michael S. Tsirkin <mst@redhat.com> > Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> > Reviewed-by: Alex Bennée <alex.bennee@linaro.org> > Tested-by: Alex Bennée <alex.bennee@linaro.org> > Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org> > Cc: qemu-stable@nongnu.org > --- > hw/display/virtio-gpu-virgl.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c > index 71a7500de9..8fbe4e70cc 100644 > --- a/hw/display/virtio-gpu-virgl.c > +++ b/hw/display/virtio-gpu-virgl.c > @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g, > vmr->g = g; > mr = g_new0(MemoryRegion, 1); > > - memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data); > + memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data); > memory_region_add_subregion(&b->hostmem, offset, mr); > memory_region_set_enabled(mr, true); > I suggest dropping this patch for now due to the reason I pointed out for the first version of this series.
Akihiko Odaki <akihiko.odaki@daynix.com> writes: > On 2025/05/22 1:42, Alex Bennée wrote: >> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> >> This commit fixes an indefinite hang when using VIRTIO GPU blob >> objects >> under TCG in certain conditions. >> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a >> MemoryRegion and attaches it to an offset on a PCI BAR of the >> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps >> it. >> Because virglrenderer commands are not thread-safe they are only >> called on the main context and QEMU performs the cleanup in three steps >> to prevent a use-after-free scenario where the guest can access the >> region after it’s unmapped: >> 1. From the main context, the region’s field finish_unmapping is >> false >> by default, so it sets a variable cmd_suspended, increases the >> renderer_blocked variable, deletes the blob subregion, and unparents >> the blob subregion causing its reference count to decrement. >> 2. From an RCU context, the MemoryView gets freed, the FlatView gets >> recalculated, the free callback of the blob region >> virtio_gpu_virgl_hostmem_region_free is called which sets the >> region’s field finish_unmapping to true, allowing the main thread >> context to finish replying to the command >> 3. From the main context, the command is processed again, but this >> time >> finish_unmapping is true, so virgl_renderer_resource_unmap can be >> called and a response is sent to the guest. >> It happens so that under TCG, if the guest has no timers configured >> (and >> thus no interrupt will cause the CPU to exit), the RCU thread does not >> have enough time to grab the locks and recalculate the FlatView. >> That’s not a big problem in practice since most guests will assume a >> response will happen later in time and go on to do different things, >> potentially triggering interrupts and allowing the RCU context to run. >> If the guest waits for the unmap command to complete though, it blocks >> indefinitely. Attaching to the QEMU monitor and force quitting the guest >> allows the cleanup to continue. >> There's no reason why the FlatView recalculation can't occur right >> away >> when we delete the blob subregion, however. It does not, because when we >> create the subregion we set the object as its own parent: >> memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data); >> The extra reference is what prevents freeing the memory region >> object in >> the memory transaction of deleting the subregion. >> This commit changes the owner object to the device, which removes >> the >> extra owner reference in the memory region and causes the MR to be >> freed right away in the main context. >> Acked-by: Michael S. Tsirkin <mst@redhat.com> >> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> >> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> >> Tested-by: Alex Bennée <alex.bennee@linaro.org> >> Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org> >> Cc: qemu-stable@nongnu.org >> --- >> hw/display/virtio-gpu-virgl.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> diff --git a/hw/display/virtio-gpu-virgl.c >> b/hw/display/virtio-gpu-virgl.c >> index 71a7500de9..8fbe4e70cc 100644 >> --- a/hw/display/virtio-gpu-virgl.c >> +++ b/hw/display/virtio-gpu-virgl.c >> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g, >> vmr->g = g; >> mr = g_new0(MemoryRegion, 1); >> - memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, >> data); >> + memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data); >> memory_region_add_subregion(&b->hostmem, offset, mr); >> memory_region_set_enabled(mr, true); >> > > I suggest dropping this patch for now due to the reason I pointed out > for the first version of this series. This fixes an actual bug - without it we get a hang.
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c index 71a7500de9..8fbe4e70cc 100644 --- a/hw/display/virtio-gpu-virgl.c +++ b/hw/display/virtio-gpu-virgl.c @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g, vmr->g = g; mr = g_new0(MemoryRegion, 1); - memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data); + memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data); memory_region_add_subregion(&b->hostmem, offset, mr); memory_region_set_enabled(mr, true);