Message ID | 20240716180941.40211-12-philmd@linaro.org |
---|---|
State | New |
Headers | show |
Series | [PULL,01/13] hw/core/loader: allow loading larger ROMs | expand |
16.07.2024 21:09, Philippe Mathieu-Daudé wrote: > From: Akihiko Odaki <akihiko.odaki@daynix.com> > > Remove dpy_cursor_define_supported() as it brings no benefit today and > it has a few inherent problems. > > All graphical displays except egl-headless support cursor composition > without DMA-BUF, and egl-headless is meant to be used in conjunction > with another graphical display, so dpy_cursor_define_supported() > always returns true and meaningless. > > Even if we add a new display without cursor composition in the future, > dpy_cursor_define_supported() will be problematic as a cursor display > fix for it because some display devices like virtio-gpu cannot tell the > lack of cursor composition capability to the guest and are unable to > utilize the value the function returns. Therefore, all non-headless > graphical displays must actually implement cursor composition for > correct cursor display. > > Another problem with dpy_cursor_define_supported() is that it returns > true even if only some of the display listeners support cursor > composition, which is wrong unless all display listeners that lack > cursor composition is headless. Apparently this commit made windows10 guest to freeze. There's a rather hairy bugreport at https://bugs.debian.org/1084199 . Also there's an interesting bug filed against qemu, https://gitlab.com/qemu-project/qemu/-/issues/1628 , which seems to be relevant. Can you take a look please? This user did a great job bisecting qemu with no experience whatsoever.. Thanks, /mjt
On Tue, 29 Oct 2024 at 13:30, Michael Tokarev <mjt@tls.msk.ru> wrote: > 16.07.2024 21:09, Philippe Mathieu-Daudé wrote: > > From: Akihiko Odaki <akihiko.odaki@daynix.com> > > > > Remove dpy_cursor_define_supported() as it brings no benefit today and > > it has a few inherent problems. > > > > All graphical displays except egl-headless support cursor composition > > without DMA-BUF, and egl-headless is meant to be used in conjunction > > with another graphical display, so dpy_cursor_define_supported() > > always returns true and meaningless. > > > > Even if we add a new display without cursor composition in the future, > > dpy_cursor_define_supported() will be problematic as a cursor display > > fix for it because some display devices like virtio-gpu cannot tell the > > lack of cursor composition capability to the guest and are unable to > > utilize the value the function returns. Therefore, all non-headless > > graphical displays must actually implement cursor composition for > > correct cursor display. > > > > Another problem with dpy_cursor_define_supported() is that it returns > > true even if only some of the display listeners support cursor > > composition, which is wrong unless all display listeners that lack > > cursor composition is headless. > > Apparently this commit made windows10 guest to freeze. There's a rather > hairy bugreport at https://bugs.debian.org/1084199 . Also there's an > interesting bug filed against qemu, > https://gitlab.com/qemu-project/qemu/-/issues/1628 , > which seems to be relevant. > > Can you take a look please? > > This user did a great job bisecting qemu with no experience whatsoever.. > > It seems an odd commit to be causing this, but you never know… Some random thoughts on this and a few questions: As far as I can tell, in both cases the guest display adapter is QXL. In the case of the Debian user, I think the UI backend is SPICE? Presumably the it's the same for the user reporting the bug directly against Qemu, as they also seem to be using libvirt. (No full command line available as far as I can see?) In any case, the SPICE display specifies: .dpy_cursor_define = display_mouse_define, Which means the old dpy_cursor_define_supported() would have been returning true prior to this commit, with the modified QXL and VMSvga code paths theoretically unaffected. A bit odd, to say the least. But I suppose it's possible that my understanding is incorrect, and dpy_cursor_define_supported() might have returned false. The code in qxl_render_cursor that follows this check and which will now always run, vs would not have done previously includes: if (qxl->debug > 1 && cmd->type != QXL_CURSOR_MOVE) { fprintf(stderr, "%s", __func__); qxl_log_cmd_cursor(qxl, cmd, ext->group_id); fprintf(stderr, "\n"); } Can we get the user to set qxl->debug to a value > 1 and see if the freeze coincides with logging from here? (Possibly tricky to intercept the fprintf output from Qemu run via libvirt though.) I also notice that the subsequent code in the QXL_CURSOR_SET case looks a lot like the code being replaced by Michael's patch from the Qemu bug tracker. (qxl_phys2virt calls) It looks OK to me, but I'm not familiar with QXL at all, so perhaps a chunk of code that could use another review? Or could it be a timing-dependent bug? Although the removed check should have been rather cheap and shouldn't affect timing much: - if (!dpy_cursor_define_supported(qxl->vga.con)) { - return 0; - } Given that "The time before the freeze seems to be random, from a few seconds to a couple of minutes." there is a possibility of a false negative during the bisect. (i.e. commit marked GOOD that should be BAD because it happened to not hit the freeze in the usual time) If that was the case, we would expect the regression to be caused by a nearby *earlier* commit. The only candidate I can spot that touches SPICE or QXL or the display subsystem in general is a418e7a (ui/console: Convert mouse visibility parameter into bool) from the same patchset which looks even more benign, however. We could ask the user to check whether there's any connection with mouse cursor changes, e.g. whether he can more easily provoke the issue by perform actions that rapidly change the mouse cursor. (For example by visiting https://developer.mozilla.org/en-US/docs/Web/CSS/cursor in the guest and moving back and forth over the test area.) Is there an easy way to take a sampling profile on Linux that will show us stack traces of all the threads in the frozen Qemu process? On macOS this is easy with the Activity Monitor GUI or iprofiler on the command line. That ought to confirm whether it's a deadlock or indefinite wait in some Qemu subsystem. Michael, what's the situation with the patch you suggested in your comment on the Qemu bug: https://gitlab.com/qemu-project/qemu/-/issues/1628#note_2144606625 ? Is there any chance we can get the Debian user to try that? Hope that helps! Phil
diff --git a/include/ui/console.h b/include/ui/console.h index 82b573e680..fa986ab97e 100644 --- a/include/ui/console.h +++ b/include/ui/console.h @@ -324,7 +324,6 @@ void dpy_text_update(QemuConsole *con, int x, int y, int w, int h); void dpy_text_resize(QemuConsole *con, int w, int h); void dpy_mouse_set(QemuConsole *con, int x, int y, bool on); void dpy_cursor_define(QemuConsole *con, QEMUCursor *cursor); -bool dpy_cursor_define_supported(QemuConsole *con); bool dpy_gfx_check_format(QemuConsole *con, pixman_format_code_t format); diff --git a/hw/display/qxl-render.c b/hw/display/qxl-render.c index 8daae72c8d..335d01edde 100644 --- a/hw/display/qxl-render.c +++ b/hw/display/qxl-render.c @@ -307,10 +307,6 @@ int qxl_render_cursor(PCIQXLDevice *qxl, QXLCommandExt *ext) return 1; } - if (!dpy_cursor_define_supported(qxl->vga.con)) { - return 0; - } - if (qxl->debug > 1 && cmd->type != QXL_CURSOR_MOVE) { fprintf(stderr, "%s", __func__); qxl_log_cmd_cursor(qxl, cmd, ext->group_id); diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c index 512f224b9f..3db3ff98f7 100644 --- a/hw/display/vmware_vga.c +++ b/hw/display/vmware_vga.c @@ -904,10 +904,8 @@ static uint32_t vmsvga_value_read(void *opaque, uint32_t address) caps |= SVGA_CAP_RECT_FILL; #endif #ifdef HW_MOUSE_ACCEL - if (dpy_cursor_define_supported(s->vga.con)) { - caps |= SVGA_CAP_CURSOR | SVGA_CAP_CURSOR_BYPASS_2 | - SVGA_CAP_CURSOR_BYPASS; - } + caps |= SVGA_CAP_CURSOR | SVGA_CAP_CURSOR_BYPASS_2 | + SVGA_CAP_CURSOR_BYPASS; #endif ret = caps; break; diff --git a/ui/console.c b/ui/console.c index 8e9cc1628a..e8f0083af7 100644 --- a/ui/console.c +++ b/ui/console.c @@ -1001,19 +1001,6 @@ void dpy_cursor_define(QemuConsole *c, QEMUCursor *cursor) } } -bool dpy_cursor_define_supported(QemuConsole *con) -{ - DisplayState *s = con->ds; - DisplayChangeListener *dcl; - - QLIST_FOREACH(dcl, &s->listeners, next) { - if (dcl->ops->dpy_cursor_define) { - return true; - } - } - return false; -} - QEMUGLContext dpy_gl_ctx_create(QemuConsole *con, struct QEMUGLParams *qparams) {