[for 7.2-rc? v2 0/5] continuing efforts to fix vhost-user issues

Message ID 20221125173043.1998075-1-alex.bennee@linaro.org

Message

Alex Bennée Nov. 25, 2022, 5:30 p.m. UTC
Hi,

This series continues the effort to fix the various vhost-user issues
that are currently plaguing the release. One concrete bug I've come
across is that all qtest MMIO devices were being treated as legacy,
which caused the VIRTIO_F_VERSION_1 flag to be missed and s390x to
fall back to trying to set the endian value for the virtqueues.

I've patched it for the GPIO tests and raised a tracking bug (#1342)
for the general problem. This might explain why the only other VirtIO
vhost-user MMIO device exercised via qtest is the one in the
virtio-net tests. The vhost networking support is its own special
implementation, so it's hard to compare its code with the GPIO code.
It does make me wonder whether disabling the MMIO version of the test
for now would be worthwhile. FWIW I did try disabling force-legacy for
all machine types and that caused a bunch of the other tests to fail.

I made some progress in tracking down the memory leak that clang
complains about. It comes down to the line:

  gpio->vhost_dev.vqs = g_new0(struct vhost_virtqueue, gpio->vhost_dev.nvqs);

which is never freed because we never call
vu_gpio_device_unrealize() in the test. However, it's unclear why this
is the case. We don't seem to unrealize the device in the
vhost-user-net tests either, and clang doesn't complain about that.
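
For what it's worth, vhost-user-blk's teardown suggests the rough
shape of the fix: save the pointer before vhost_dev_cleanup() memsets
the struct, then free it. A sketch only (where exactly this lands in
vu_gpio_device_unrealize() is an assumption):

  /* sketch: save vqs first, vhost_dev_cleanup() zeroes the vhost_dev */
  struct vhost_virtqueue *vqs = gpio->vhost_dev.vqs;

  vhost_dev_cleanup(&gpio->vhost_dev);
  g_free(vqs);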

I can replicate some of the other failures I've been seeing in CI by
running:

  ../../meson/meson.py test --repeat 10 --print-errorlogs qtest-arm/qos-test

however this seems to run everything in parallel, which is perhaps
better at exposing race conditions. Perhaps the CI system makes those
races easier to hit? Unfortunately I've not been able to figure out
exactly how things go wrong in the failure case.
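
As an aside, meson can also force serial execution, which might help
separate parallelism effects from genuine races (assuming the meson in
use supports the --num-processes flag):

  ../../meson/meson.py test --num-processes 1 --repeat 10 --print-errorlogs qtest-arm/qos-test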

I've included Stefano's:

  vhost: enable vrings in vhost_dev_start() for vhost-user devices

in this series as it makes sense and improves the vring state errors.
However, it's up to you whether you want to include it in the eventual
PR. There are still CI errors I'm trying to track down, but I thought
it would be worth posting the current state of my tree.

Please review.


Alex Bennée (4):
  include/hw: attempt to document VirtIO feature variables
  include/hw: VM state takes precedence in virtio_device_should_start
  tests/qtests: override "force-legacy" for gpio virtio-mmio tests
  hw/virtio: ensure a valid host_feature set for virtio-user-gpio

Stefano Garzarella (1):
  vhost: enable vrings in vhost_dev_start() for vhost-user devices

 include/hw/virtio/vhost.h        | 31 ++++++++++++++++++----
 include/hw/virtio/virtio.h       | 43 ++++++++++++++++++++++++++-----
 backends/cryptodev-vhost.c       |  4 +--
 backends/vhost-user.c            |  4 +--
 hw/block/vhost-user-blk.c        |  4 +--
 hw/net/vhost_net.c               |  8 +++---
 hw/scsi/vhost-scsi-common.c      |  4 +--
 hw/virtio/vhost-user-fs.c        |  4 +--
 hw/virtio/vhost-user-gpio.c      | 10 ++++++--
 hw/virtio/vhost-user-i2c.c       |  4 +--
 hw/virtio/vhost-user-rng.c       |  4 +--
 hw/virtio/vhost-vsock-common.c   |  4 +--
 hw/virtio/vhost.c                | 44 ++++++++++++++++++++++++++++----
 tests/qtest/libqos/virtio-gpio.c |  3 ++-
 hw/virtio/trace-events           |  4 +--
 15 files changed, 134 insertions(+), 41 deletions(-)

Comments

Stefan Weil Nov. 25, 2022, 6:20 p.m. UTC | #1
Am 25.11.22 um 18:30 schrieb Alex Bennée:
> Hi,
> 
> This is continuing to attempt to fix the various vhost-user issues
> that are currently plaguing the release. One concrete bug I've come
> across is that all qtest MMIO devices were being treated as legacy
> which caused the VIRTIO_F_VERSION_1 flag to get missed causing s390x
> to fall back to trying to set the endian value for the virt-queues.

Do you want to add my 4 commits which fix format strings for 
libvhost-user to your series, or should they be handled separately?

Regards
Stefan
Alex Bennée Nov. 25, 2022, 7:58 p.m. UTC | #2
Alex Bennée <alex.bennee@linaro.org> writes:

> Hi,
>
<snip>
> I can replicate some of the other failures I've been seeing in CI by
> running:
>
>   ../../meson/meson.py test --repeat 10 --print-errorlogs qtest-arm/qos-test
>
> however this seems to run everything in parallel and maybe is better
> at exposing race conditions. Perhaps the CI system makes those races
> easier to hit? Unfortunately I've not been able to figure out exactly
> how things go wrong in the failure case. 
>
<snip>

There is a circular call: we are in vu_gpio_stop(), which triggers a
write to vhost-user, which allows us to catch a disconnect event:

  #0  vhost_dev_is_started (hdev=0x557adf80d878) at /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
  #1  0x0000557adbe0518a in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:138
  #2  0x0000557adbe04d56 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:255
  #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
  #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
  #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
  #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
  #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
  #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
  #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
  #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
  #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
  #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
  #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
  #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
  #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
  #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
  #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
  #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
  #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
  #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
  #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
  #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
  #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
  #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
  (rr) p hdev->started
  $9 = true
  (rr) info thread
    Id   Target Id                                Frame 
  * 1    Thread 2140414.2140414 (qemu-system-aar) vhost_dev_is_started (hdev=0x557adf80d878) at /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
    2    Thread 2140414.2140439 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
    3    Thread 2140414.2140442 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
    4    Thread 2140414.2140443 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()

During which we eliminate the vhost_dev with a memset:

  Thread 1 hit Hardware watchpoint 2: *(unsigned int *) 0x557adf80da30

  Old value = 2
  New value = 0
  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
  Download failed: Invalid argument.  Continuing without source file ./string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S.
  220     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
  (rr) bt
  #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
  #1  0x0000557adbdd67f8 in vhost_dev_cleanup (hdev=0x557adf80d878) at ../../hw/virtio/vhost.c:1501
  #2  0x0000557adbe04d68 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:256
  #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
  #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
  #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
  #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
  #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
  #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
  #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
  #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
  #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
  #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
  #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
  #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
  #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
  #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
  #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
  #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
  #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
  #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
  #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
  #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
  #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
  #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48

Before finally:

  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
  #1  0x00007f24dc269537 in __GI_abort () at abort.c:79
  #2  0x00007f24dc26940f in __assert_fail_base (fmt=0x7f24dc3e16a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=<optimized out>) at assert.c:92
  #3  0x00007f24dc278662 in __GI___assert_fail (assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=0x557adc28d922 "int virtio_pci_set_guest_notifiers(DeviceState *, int, _Bool)") at assert.c:101
  #4  0x0000557adb8e97f1 in virtio_pci_set_guest_notifiers (d=0x557adf805280, nvqs=0, assign=false) at ../../hw/virtio/virtio-pci.c:1029
  #5  0x0000557adbe051c7 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:144
  #6  0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
  #7  0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
  #8  0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
  #9  0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
  #10 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
  #11 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
  #12 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
  #13 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
  #14 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48

Because of course we've just done that on disconnect.

Not sure what the cleanest way to avoid that is yet. Maybe it will be
clearer on Monday morning.
Alex Bennée Nov. 25, 2022, 8:08 p.m. UTC | #3
Stefan Weil <sw@weilnetz.de> writes:

> Am 25.11.22 um 18:30 schrieb Alex Bennée:
>> Hi,
>> This is continuing to attempt to fix the various vhost-user issues
>> that are currently plaguing the release. One concrete bug I've come
>> across is that all qtest MMIO devices were being treated as legacy
>> which caused the VIRTIO_F_VERSION_1 flag to get missed causing s390x
>> to fall back to trying to set the endian value for the virt-queues.
>
> Do you want to add my 4 commits which fix format strings for
> libvhost-user to your series, or should they be handled separately?

I'm going to leave the choice of which VirtIO patches to take to MST,
given he is the maintainer and I've obviously broken things enough
this release :-/
Alex Bennée Nov. 26, 2022, 9:42 a.m. UTC | #4
Alex Bennée <alex.bennee@linaro.org> writes:

> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Hi,
>>
> <snip>
>> I can replicate some of the other failures I've been seeing in CI by
>> running:
>>
>>   ../../meson/meson.py test --repeat 10 --print-errorlogs qtest-arm/qos-test
>>
>> however this seems to run everything in parallel and maybe is better
>> at exposing race conditions. Perhaps the CI system makes those races
>> easier to hit? Unfortunately I've not been able to figure out exactly
>> how things go wrong in the failure case. 
>>
> <snip>
>
> There is a circular call - we are in vu_gpio_stop which triggers a write
> to vhost-user which allows us to catch a disconnect event:
>
>   #0  vhost_dev_is_started (hdev=0x557adf80d878) at /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
>   #1  0x0000557adbe0518a in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:138
>   #2  0x0000557adbe04d56 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:255
>   #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274

I suspect the best choice here is to schedule the cleanup for a later
date. Should I use the aio_bh one-shots for this, or maybe an RCU
cleanup event?

Paolo, any suggestions?
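
For illustration, the aio_bh variant would be something like the
following sketch (the helper name and hook point are assumptions,
modelled on what vhost-user-blk does, as quoted further down this
thread):

  static void vu_gpio_chr_closed_bh(void *opaque)
  {
      /* runs later from the main loop, outside the chardev callback */
      VHostUserGPIO *gpio = VHOST_USER_GPIO(opaque);

      vu_gpio_disconnect(DEVICE(gpio));
  }

  /* and in vu_gpio_event(), instead of calling vu_gpio_disconnect(): */
  case CHR_EVENT_CLOSED:
      aio_bh_schedule_oneshot(qemu_get_current_aio_context(),
                              vu_gpio_chr_closed_bh, opaque);
      break;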

>   #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
>   #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
>   #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
>   #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
>   #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
>   #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
>   #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
>   #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
>   #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
>   #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
>   #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
>   #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
>   #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
>   #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
>   #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
>   #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
>   #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
>   #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
>   #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
>   #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
>   #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
>   (rr) p hdev->started
>   $9 = true
>   (rr) info thread
>     Id   Target Id                                Frame 
>   * 1    Thread 2140414.2140414 (qemu-system-aar) vhost_dev_is_started (hdev=0x557adf80d878) at /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
>     2    Thread 2140414.2140439 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
>     3    Thread 2140414.2140442 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
>     4    Thread 2140414.2140443 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
>
> During which we eliminate the vhost_dev with a memset:
>
>   Thread 1 hit Hardware watchpoint 2: *(unsigned int *) 0x557adf80da30
>
>   Old value = 2
>   New value = 0
>   __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
>   Download failed: Invalid argument.  Continuing without source file ./string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S.
>   220     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
>   (rr) bt
>   #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
>   #1  0x0000557adbdd67f8 in vhost_dev_cleanup (hdev=0x557adf80d878) at ../../hw/virtio/vhost.c:1501
>   #2  0x0000557adbe04d68 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:256
>   #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
>   #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
>   #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
>   #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
>   #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
>   #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
>   #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
>   #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
>   #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
>   #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
>   #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
>   #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
>   #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
>   #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
>   #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
>   #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
>   #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
>   #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
>   #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
>   #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
>   #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
>   #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
>
> Before finally:
>
>   #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>   #1  0x00007f24dc269537 in __GI_abort () at abort.c:79
>   #2  0x00007f24dc26940f in __assert_fail_base (fmt=0x7f24dc3e16a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=<optimized out>) at assert.c:92
>   #3  0x00007f24dc278662 in __GI___assert_fail (assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=0x557adc28d922 "int virtio_pci_set_guest_notifiers(DeviceState *, int, _Bool)") at assert.c:101
>   #4  0x0000557adb8e97f1 in virtio_pci_set_guest_notifiers (d=0x557adf805280, nvqs=0, assign=false) at ../../hw/virtio/virtio-pci.c:1029
>   #5  0x0000557adbe051c7 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:144
>   #6  0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
>   #7  0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
>   #8  0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
>   #9  0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
>   #10 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
>   #11 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
>   #12 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
>   #13 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
>   #14 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
>
> Because of course we've just done that on disconnect.
>
> Not sure what the cleanest way to avoid that is yet. Maybe it will be
> clearer on Monday morning.
Stefan Hajnoczi Nov. 26, 2022, 11:24 a.m. UTC | #5
On Sat, 26 Nov 2022 at 04:45, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Alex Bennée <alex.bennee@linaro.org> writes:
>
> > Alex Bennée <alex.bennee@linaro.org> writes:
> >
> >> Hi,
> >>
> > <snip>
> >> I can replicate some of the other failures I've been seeing in CI by
> >> running:
> >>
> >>   ../../meson/meson.py test --repeat 10 --print-errorlogs qtest-arm/qos-test
> >>
> >> however this seems to run everything in parallel and maybe is better
> >> at exposing race conditions. Perhaps the CI system makes those races
> >> easier to hit? Unfortunately I've not been able to figure out exactly
> >> how things go wrong in the failure case.
> >>
> > <snip>
> >
> > There is a circular call - we are in vu_gpio_stop which triggers a write
> > to vhost-user which allows us to catch a disconnect event:
> >
> >   #0  vhost_dev_is_started (hdev=0x557adf80d878) at /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
> >   #1  0x0000557adbe0518a in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:138
> >   #2  0x0000557adbe04d56 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:255
> >   #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
>
> I suspect the best choice here is to schedule the cleanup for a later
> date. Should I use the aio_bh one shots for this or maybe an rcu cleanup
> event?
>
> Paolo, any suggestions?
>
> >   #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
> >   #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
> >   #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
> >   #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129

Does this mean the backend closed the connection before receiving all
the vhost-user protocol messages sent by QEMU?

This looks like a backend bug. It prevents QEMU's vhost-user client
from cleanly stopping the virtqueue (vhost_virtqueue_stop()).

QEMU is still broken if it cannot handle disconnect at any time. Maybe
a simple solution for that is to check for reentrancy (either by
checking an existing variable or adding a new one to prevent
vu_gpio_stop() from calling itself).
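
A sketch of that guard, with an assumed flag name:

  static void vu_gpio_stop(VirtIODevice *vdev)
  {
      VHostUserGPIO *gpio = VHOST_USER_GPIO(vdev);

      if (gpio->stopping) {
          return;            /* nested call via disconnect, bail out */
      }
      gpio->stopping = true;
      /* ... existing vhost_dev_stop()/notifier teardown ... */
      gpio->stopping = false;
  }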

> >   #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
> >   #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
> >   #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
> >   #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
> >   #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
> >   #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
> >   #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
> >   #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
> >   #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
> >   #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
> >   #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
> >   #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
> >   #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
> >   #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
> >   #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
> >   #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
> >   #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
> >   (rr) p hdev->started
> >   $9 = true
> >   (rr) info thread
> >     Id   Target Id                                Frame
> >   * 1    Thread 2140414.2140414 (qemu-system-aar) vhost_dev_is_started (hdev=0x557adf80d878) at /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
> >     2    Thread 2140414.2140439 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
> >     3    Thread 2140414.2140442 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
> >     4    Thread 2140414.2140443 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
> >
> > During which we eliminate the vhost_dev with a memset:
> >
> >   Thread 1 hit Hardware watchpoint 2: *(unsigned int *) 0x557adf80da30
> >
> >   Old value = 2
> >   New value = 0
> >   __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
> >   Download failed: Invalid argument.  Continuing without source file ./string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S.
> >   220     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
> >   (rr) bt
> >   #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
> >   #1  0x0000557adbdd67f8 in vhost_dev_cleanup (hdev=0x557adf80d878) at ../../hw/virtio/vhost.c:1501
> >   #2  0x0000557adbe04d68 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:256
> >   #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
> >   #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
> >   #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
> >   #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
> >   #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
> >   #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
> >   #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
> >   #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
> >   #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
> >   #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
> >   #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
> >   #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
> >   #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
> >   #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
> >   #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
> >   #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
> >   #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
> >   #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
> >   #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
> >   #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
> >   #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
> >   #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
> >
> > Before finally:
> >
> >   #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> >   #1  0x00007f24dc269537 in __GI_abort () at abort.c:79
> >   #2  0x00007f24dc26940f in __assert_fail_base (fmt=0x7f24dc3e16a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=<optimized out>) at assert.c:92
> >   #3  0x00007f24dc278662 in __GI___assert_fail (assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=0x557adc28d922 "int virtio_pci_set_guest_notifiers(DeviceState *, int, _Bool)") at assert.c:101
> >   #4  0x0000557adb8e97f1 in virtio_pci_set_guest_notifiers (d=0x557adf805280, nvqs=0, assign=false) at ../../hw/virtio/virtio-pci.c:1029
> >   #5  0x0000557adbe051c7 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:144
> >   #6  0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
> >   #7  0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
> >   #8  0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
> >   #9  0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
> >   #10 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
> >   #11 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
> >   #12 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
> >   #13 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
> >   #14 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
> >
> > Because of course we've just done that on disconnect.
> >
> > Not sure what the cleanest way to avoid that is yet. Maybe it will be
> > clearer on Monday morning.
>
>
> --
> Alex Bennée
>
Alex Bennée Nov. 26, 2022, 2:12 p.m. UTC | #6
Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Sat, 26 Nov 2022 at 04:45, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>>
>> Alex Bennée <alex.bennee@linaro.org> writes:
>>
>> > Alex Bennée <alex.bennee@linaro.org> writes:
>> >
>> >> Hi,
>> >>
>> > <snip>
>> >> I can replicate some of the other failures I've been seeing in CI by
>> >> running:
>> >>
>> >>   ../../meson/meson.py test --repeat 10 --print-errorlogs qtest-arm/qos-test
>> >>
>> >> however this seems to run everything in parallel and maybe is better
>> >> at exposing race conditions. Perhaps the CI system makes those races
>> >> easier to hit? Unfortunately I've not been able to figure out exactly
>> >> how things go wrong in the failure case.
>> >>
>> > <snip>
>> >
>> > There is a circular call - we are in vu_gpio_stop which triggers a write
>> > to vhost-user which allows us to catch a disconnect event:
>> >
>> >   #0 vhost_dev_is_started (hdev=0x557adf80d878) at
>> > /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
>> >   #1  0x0000557adbe0518a in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:138
>> >   #2 0x0000557adbe04d56 in vu_gpio_disconnect (dev=0x557adf80d640)
>> > at ../../hw/virtio/vhost-user-gpio.c:255
>> >   #3 0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640,
>> > event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
>>
>> I suspect the best choice here is to schedule the cleanup for a later
>> date. Should I use the aio_bh one shots for this or maybe an rcu cleanup
>> event?
>>
>> Paolo, any suggestions?
>>
>> >   #4 0x0000557adc0539ef in chr_be_event (s=0x557adea51f10,
>> > event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
>> >   #5 0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10,
>> > event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
>> >   #6 0x0000557adc04f666 in tcp_chr_disconnect_locked
>> > (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
>> >   #7 0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10,
>> > buf=0x7ffe8588cce0 "\v", len=20) at
>> > ../../chardev/char-socket.c:129
>
> Does this mean the backend closed the connection before receiving all
> the vhost-user protocol messages sent by QEMU?
>
> This looks like a backend bug. It prevents QEMU's vhost-user client
> from cleanly stopping the virtqueue (vhost_virtqueue_stop()).

Well, the backend in this case is the qtest framework, so not the
world's most complete implementation.

> QEMU is still broken if it cannot handle disconnect at any time. Maybe
> a simple solution for that is to check for reentrancy (either by
> checking an existing variable or adding a new one to prevent
> vu_gpio_stop() from calling itself).

vhost-user-blk introduced an additional flag:

    /*
     * There are at least two steps of initialization of the
     * vhost-user device. The first is a "connect" step and
     * second is a "start" step. Make a separation between
     * those initialization phases by using two fields.
     */
    /* vhost_user_blk_connect/vhost_user_blk_disconnect */
    bool connected;
    /* vhost_user_blk_start/vhost_user_blk_stop */
    bool started_vu;

but that in itself is not enough. If you look at the various cases of
handling CHR_EVENT_CLOSED you'll see some schedule the shutdown with
aio and some don't even bother (so they will probably break the same
way).

Rather than have a mish-mash of solutions, maybe we should introduce a
new vhost function, vhost_user_async_close(), which can take care of
the scheduling and wrap it with a check for a valid vhost structure in
case it gets shut down in the meantime?
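
A hypothetical shape for such a helper (a sketch of the idea, not a
finished design):

  typedef void (*vu_async_close_fn)(DeviceState *dev);

  /*
   * Unhook the chardev handlers, schedule a BH, and only invoke the
   * device's disconnect callback from the BH if the vhost structure
   * is still valid when it runs.
   */
  void vhost_user_async_close(DeviceState *d,
                              CharBackend *chardev,
                              struct vhost_dev *vhost,
                              vu_async_close_fn cb);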

>
>> >   #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
>> >   #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
>> >   #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
>> >   #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
>> >   #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
>> >   #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
>> >   #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
>> >   #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
>> >   #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
>> >   #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
>> >   #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
>> >   #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
>> >   #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
>> >   #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
>> >   #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
>> >   #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
>> >   #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
>> >   (rr) p hdev->started
>> >   $9 = true
>> >   (rr) info thread
>> >     Id   Target Id                                Frame
>> >   * 1    Thread 2140414.2140414 (qemu-system-aar) vhost_dev_is_started (hdev=0x557adf80d878) at /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
>> >     2    Thread 2140414.2140439 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
>> >     3    Thread 2140414.2140442 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
>> >     4    Thread 2140414.2140443 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
>> >
>> > During which we eliminate the vhost_dev with a memset:
>> >
>> >   Thread 1 hit Hardware watchpoint 2: *(unsigned int *) 0x557adf80da30
>> >
>> >   Old value = 2
>> >   New value = 0
>> >   __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
>> >   Download failed: Invalid argument.  Continuing without source file ./string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S.
>> >   220     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
>> >   (rr) bt
>> >   #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
>> >   #1  0x0000557adbdd67f8 in vhost_dev_cleanup (hdev=0x557adf80d878) at ../../hw/virtio/vhost.c:1501
>> >   #2  0x0000557adbe04d68 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:256
>> >   #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
>> >   #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
>> >   #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
>> >   #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
>> >   #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
>> >   #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
>> >   #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
>> >   #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
>> >   #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
>> >   #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
>> >   #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
>> >   #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
>> >   #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
>> >   #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
>> >   #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
>> >   #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
>> >   #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
>> >   #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
>> >   #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
>> >   #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
>> >   #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
>> >   #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
>> >
>> > Before finally:
>> >
>> >   #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>> >   #1  0x00007f24dc269537 in __GI_abort () at abort.c:79
>> >   #2  0x00007f24dc26940f in __assert_fail_base (fmt=0x7f24dc3e16a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=<optimized out>) at assert.c:92
>> >   #3  0x00007f24dc278662 in __GI___assert_fail (assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=0x557adc28d922 "int virtio_pci_set_guest_notifiers(DeviceState *, int, _Bool)") at assert.c:101
>> >   #4  0x0000557adb8e97f1 in virtio_pci_set_guest_notifiers (d=0x557adf805280, nvqs=0, assign=false) at ../../hw/virtio/virtio-pci.c:1029
>> >   #5  0x0000557adbe051c7 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:144
>> >   #6  0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
>> >   #7  0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
>> >   #8  0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
>> >   #9  0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
>> >   #10 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
>> >   #11 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
>> >   #12 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
>> >   #13 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
>> >   #14 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
>> >
>> > Because of course we've just done that on disconnect.
>> >
>> > Not sure what the cleanest way to avoid that is yet. Maybe it will be
>> > clearer on Monday morning.
>>
>>
>> --
>> Alex Bennée
>>
Stefan Hajnoczi Nov. 27, 2022, 6:04 p.m. UTC | #7
On Sat, 26 Nov 2022 at 09:18, Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Stefan Hajnoczi <stefanha@gmail.com> writes:
>
> > On Sat, 26 Nov 2022 at 04:45, Alex Bennée <alex.bennee@linaro.org> wrote:
> >>
> >>
> >> Alex Bennée <alex.bennee@linaro.org> writes:
> >>
> >> > Alex Bennée <alex.bennee@linaro.org> writes:
> >> >
> >> >> Hi,
> >> >>
> >> > <snip>
> >> >> I can replicate some of the other failures I've been seeing in CI by
> >> >> running:
> >> >>
> >> >>   ../../meson/meson.py test --repeat 10 --print-errorlogs qtest-arm/qos-test
> >> >>
> >> >> however this seems to run everything in parallel and maybe is better
> >> >> at exposing race conditions. Perhaps the CI system makes those races
> >> >> easier to hit? Unfortunately I've not been able to figure out exactly
> >> >> how things go wrong in the failure case.
> >> >>
> >> > <snip>
> >> >
> >> > There is a circular call - we are in vu_gpio_stop which triggers a write
> >> > to vhost-user which allows us to catch a disconnect event:
> >> >
> >> >   #0 vhost_dev_is_started (hdev=0x557adf80d878) at
> >> > /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
> >> >   #1  0x0000557adbe0518a in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:138
> >> >   #2 0x0000557adbe04d56 in vu_gpio_disconnect (dev=0x557adf80d640)
> >> > at ../../hw/virtio/vhost-user-gpio.c:255
> >> >   #3 0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640,
> >> > event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
> >>
> >> I suspect the best choice here is to schedule the cleanup for a later
> >> date. Should I use the aio_bh one shots for this or maybe an rcu cleanup
> >> event?
> >>
> >> Paolo, any suggestions?
> >>
> >> >   #4 0x0000557adc0539ef in chr_be_event (s=0x557adea51f10,
> >> > event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
> >> >   #5 0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10,
> >> > event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
> >> >   #6 0x0000557adc04f666 in tcp_chr_disconnect_locked
> >> > (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
> >> >   #7 0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10,
> >> > buf=0x7ffe8588cce0 "\v", len=20) at
> >> > ../../chardev/char-socket.c:129
> >
> > Does this mean the backend closed the connection before receiving all
> > the vhost-user protocol messages sent by QEMU?
> >
> > This looks like a backend bug. It prevents QEMU's vhost-user client
> > from cleanly stopping the virtqueue (vhost_virtqueue_stop()).
>
> Well, the backend in this case is the qtest framework, so not the
> world's most complete implementation.
>
> > QEMU is still broken if it cannot handle disconnect at any time. Maybe
> > a simple solution for that is to check for reentrancy (either by
> > checking an existing variable or adding a new one to prevent
> > vu_gpio_stop() from calling itself).
>
> vhost-user-blk introduced an additional flag:
>
>     /*
>      * There are at least two steps of initialization of the
>      * vhost-user device. The first is a "connect" step and
>      * second is a "start" step. Make a separation between
>      * those initialization phases by using two fields.
>      */
>     /* vhost_user_blk_connect/vhost_user_blk_disconnect */
>     bool connected;
>     /* vhost_user_blk_start/vhost_user_blk_stop */
>     bool started_vu;
>
> but that in itself is not enough. If you look at the various cases of
> handling CHR_EVENT_CLOSED you'll see some schedule the shutdown with aio
> and some don't even bother (so will probably break the same way).
>
> Rather than have a mish-mash of solutions maybe we should introduce a
> new vhost function - vhost_user_async_close() which can take care of the
> scheduling and wrap it with a check for a valid vhost structure in case
> it gets shut down in the meantime?

Handling this in core vhost code would be great.

I suggested checking a variable because it's not async. Async is more
complicated because it creates a new in-between state while waiting
for the operation to complete. Async approaches are more likely to
have bugs for this reason.

vhost-user-blk.c's async shutdown is a good example of that:
    case CHR_EVENT_CLOSED:
        if (!runstate_check(RUN_STATE_SHUTDOWN)) {
            /*
             * A close event may happen during a read/write, but vhost
             * code assumes the vhost_dev remains setup, so delay the
             * stop & clear.
             */
            AioContext *ctx = qemu_get_current_aio_context();

            qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL, NULL,
                    NULL, NULL, false);
            aio_bh_schedule_oneshot(ctx, vhost_user_blk_chr_closed_bh, opaque);

            /*
             * Move vhost device to the stopped state. The vhost-user device
             * will be clean up and disconnected in BH. This can be useful in
             * the vhost migration code. If disconnect was caught there is an
             * option for the general vhost code to get the dev state without
             * knowing its type (in this case vhost-user).
             *
             * FIXME: this is sketchy to be reaching into vhost_dev
             * now because we are forcing something that implies we
             * have executed vhost_dev_stop() but that won't happen
             * until vhost_user_blk_stop() gets called from the bh.
             * Really this state check should be tracked locally.
             */
            s->dev.started = false;

That said, maybe async is really needed here. I haven't looked at the code.

>
> >
> >> >   #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
> >> >   #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
> >> >   #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
> >> >   #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
> >> >   #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
> >> >   #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
> >> >   #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
> >> >   #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
> >> >   #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
> >> >   #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
> >> >   #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
> >> >   #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
> >> >   #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
> >> >   #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
> >> >   #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
> >> >   #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
> >> >   #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
> >> >   (rr) p hdev->started
> >> >   $9 = true
> >> >   (rr) info thread
> >> >     Id   Target Id                                Frame
> >> >   * 1    Thread 2140414.2140414 (qemu-system-aar) vhost_dev_is_started (hdev=0x557adf80d878) at /home/alex/lsrc/qemu.git/include/hw/virtio/vhost.h:199
> >> >     2    Thread 2140414.2140439 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
> >> >     3    Thread 2140414.2140442 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
> >> >     4    Thread 2140414.2140443 (qemu-system-aar) 0x0000000070000002 in syscall_traced ()
> >> >
> >> > During which we eliminate the vhost_dev with a memset:
> >> >
> >> >   Thread 1 hit Hardware watchpoint 2: *(unsigned int *) 0x557adf80da30
> >> >
> >> >   Old value = 2
> >> >   New value = 0
> >> >   __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
> >> >   Download failed: Invalid argument.  Continuing without source file ./string/../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S.
> >> >   220     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
> >> >   (rr) bt
> >> >   #0  __memset_avx2_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:220
> >> >   #1  0x0000557adbdd67f8 in vhost_dev_cleanup (hdev=0x557adf80d878) at ../../hw/virtio/vhost.c:1501
> >> >   #2  0x0000557adbe04d68 in vu_gpio_disconnect (dev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:256
> >> >   #3  0x0000557adbe049bb in vu_gpio_event (opaque=0x557adf80d640, event=CHR_EVENT_CLOSED) at ../../hw/virtio/vhost-user-gpio.c:274
> >> >   #4  0x0000557adc0539ef in chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:61
> >> >   #5  0x0000557adc0506aa in qemu_chr_be_event (s=0x557adea51f10, event=CHR_EVENT_CLOSED) at ../../chardev/char.c:81
> >> >   #6  0x0000557adc04f666 in tcp_chr_disconnect_locked (chr=0x557adea51f10) at ../../chardev/char-socket.c:470
> >> >   #7  0x0000557adc04c81a in tcp_chr_write (chr=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-socket.c:129
> >> >   #8  0x0000557adc050999 in qemu_chr_write_buffer (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, offset=0x7ffe8588cbe4, write_all=true) at ../../chardev/char.c:121
> >> >   #9  0x0000557adc0507c7 in qemu_chr_write (s=0x557adea51f10, buf=0x7ffe8588cce0 "\v", len=20, write_all=true) at ../../chardev/char.c:173
> >> >   #10 0x0000557adc046f3a in qemu_chr_fe_write_all (be=0x557adf80d830, buf=0x7ffe8588cce0 "\v", len=20) at ../../chardev/char-fe.c:53
> >> >   #11 0x0000557adbddc02f in vhost_user_write (dev=0x557adf80d878, msg=0x7ffe8588cce0, fds=0x0, fd_num=0) at ../../hw/virtio/vhost-user.c:490
> >> >   #12 0x0000557adbddd48f in vhost_user_get_vring_base (dev=0x557adf80d878, ring=0x7ffe8588d000) at ../../hw/virtio/vhost-user.c:1260
> >> >   #13 0x0000557adbdd4bd6 in vhost_virtqueue_stop (dev=0x557adf80d878, vdev=0x557adf80d640, vq=0x557adf843570, idx=0) at ../../hw/virtio/vhost.c:1220
> >> >   #14 0x0000557adbdd7eda in vhost_dev_stop (hdev=0x557adf80d878, vdev=0x557adf80d640, vrings=false) at ../../hw/virtio/vhost.c:1916
> >> >   #15 0x0000557adbe051a6 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:142
> >> >   #16 0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
> >> >   #17 0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
> >> >   #18 0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
> >> >   #19 0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
> >> >   #20 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
> >> >   #21 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
> >> >   #22 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
> >> >   #23 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
> >> >   #24 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
> >> >
> >> > Before finally:
> >> >
> >> >   #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> >> >   #1  0x00007f24dc269537 in __GI_abort () at abort.c:79
> >> >   #2  0x00007f24dc26940f in __assert_fail_base (fmt=0x7f24dc3e16a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=<optimized out>) at assert.c:92
> >> >   #3  0x00007f24dc278662 in __GI___assert_fail (assertion=0x557adc28d8f5 "assign || nvqs == proxy->nvqs_with_notifiers", file=0x557adc28d7ab "../../hw/virtio/virtio-pci.c", line=1029, function=0x557adc28d922 "int virtio_pci_set_guest_notifiers(DeviceState *, int, _Bool)") at assert.c:101
> >> >   #4  0x0000557adb8e97f1 in virtio_pci_set_guest_notifiers (d=0x557adf805280, nvqs=0, assign=false) at ../../hw/virtio/virtio-pci.c:1029
> >> >   #5  0x0000557adbe051c7 in vu_gpio_stop (vdev=0x557adf80d640) at ../../hw/virtio/vhost-user-gpio.c:144
> >> >   #6  0x0000557adbe04849 in vu_gpio_set_status (vdev=0x557adf80d640, status=15 '\017') at ../../hw/virtio/vhost-user-gpio.c:173
> >> >   #7  0x0000557adbdc87ff in virtio_set_status (vdev=0x557adf80d640, val=15 '\017') at ../../hw/virtio/virtio.c:2442
> >> >   #8  0x0000557adbdcbfa0 in virtio_vmstate_change (opaque=0x557adf80d640, running=false, state=RUN_STATE_SHUTDOWN) at ../../hw/virtio/virtio.c:3736
> >> >   #9  0x0000557adb91ad27 in vm_state_notify (running=false, state=RUN_STATE_SHUTDOWN) at ../../softmmu/runstate.c:334
> >> >   #10 0x0000557adb910e88 in do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=false) at ../../softmmu/cpus.c:262
> >> >   #11 0x0000557adb910e30 in vm_shutdown () at ../../softmmu/cpus.c:280
> >> >   #12 0x0000557adb91b9c3 in qemu_cleanup () at ../../softmmu/runstate.c:827
> >> >   #13 0x0000557adb522975 in qemu_default_main () at ../../softmmu/main.c:38
> >> >   #14 0x0000557adb5229a8 in main (argc=27, argv=0x7ffe8588d2f8) at ../../softmmu/main.c:48
> >> >
> >> > Because of course we've just done that on disconnect.
> >> >
> >> > Not sure what the cleanest way to avoid that is yet. Maybe it will be
> >> > clearer on Monday morning.
> >>
> >>
> >> --
> >> Alex Bennée
> >>
>
>
> --
> Alex Bennée