diff mbox series

[RFC] docs/interop: define STANDALONE protocol feature for vhost-user

Message ID 20230704123600.1808604-1-alex.bennee@linaro.org
State Superseded
Headers show
Series [RFC] docs/interop: define STANDALONE protocol feature for vhost-user | expand

Commit Message

Alex Bennée July 4, 2023, 12:36 p.m. UTC
Currently QEMU has to know some details about the back-end to be able
to setup the guest. While various parts of the setup can be delegated
to the backend (for example config handling) this is a very piecemeal
approach.

This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
which the back-end can advertise which allows a probe message to be
sent to get all the details QEMU needs to know in one message.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
Initial RFC for discussion. I intend to prototype this work with QEMU
and one of the rust-vmm vhost-user daemons.
---
 docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost-user.c      |  8 ++++++++
 2 files changed, 45 insertions(+)

Comments

Stefano Garzarella July 4, 2023, 2:54 p.m. UTC | #1
On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
>Currently QEMU has to know some details about the back-end to be able
>to setup the guest. While various parts of the setup can be delegated
>to the backend (for example config handling) this is a very piecemeal
>approach.
>
>This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
>which the back-end can advertise which allows a probe message to be
>sent to get all the details QEMU needs to know in one message.
>
>Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>
>---
>Initial RFC for discussion. I intend to prototype this work with QEMU
>and one of the rust-vmm vhost-user daemons.

Thanks for starting this discussion!

I'm comparing with vhost-vdpa IOCTLs, so my questions may be
superficial, but they help me understand the differences.

>---
> docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
> hw/virtio/vhost-user.c      |  8 ++++++++
> 2 files changed, 45 insertions(+)
>
>diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>index 5a070adbc1..85b1b1583a 100644
>--- a/docs/interop/vhost-user.rst
>+++ b/docs/interop/vhost-user.rst
>@@ -275,6 +275,21 @@ Inflight description
>
> :queue size: a 16-bit size of virtqueues
>
>+Backend specifications
>+^^^^^^^^^^^^^^^^^^^^^^
>+
>++-----------+-------------+------------+------------+
>+| device id | config size |   min_vqs  |   max_vqs  |
>++-----------+-------------+------------+------------+
>+
>+:device id: a 32-bit value holding the VirtIO device ID
>+
>+:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
>+
>+:min_vqs: a 32-bit value holding the minimum number of vqs supported

Why do we need the minimum?

>+
>+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs

Is this overlap with VHOST_USER_GET_QUEUE_NUM?

>+
> C structure
> -----------
>
>@@ -296,6 +311,7 @@ In QEMU the vhost-user message is implemented with the following struct:
>           VhostUserConfig config;
>           VhostUserVringArea area;
>           VhostUserInflight inflight;
>+          VhostUserBackendSpecs specs;
>       };
>   } QEMU_PACKED VhostUserMsg;
>
>@@ -316,6 +332,7 @@ replies. Here is a list of the ones that do:
> * ``VHOST_USER_GET_VRING_BASE``
> * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
> * ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
>+* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)
>
> .. seealso::
>
>@@ -885,6 +902,13 @@ Protocol features
>   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>   #define VHOST_USER_PROTOCOL_F_STATUS               16
>   #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
>+  #define VHOST_USER_PROTOCOL_F_STANDALONE           18
>+
>+Some features are only valid in the presence of other supporting
>+features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
>+backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
>+``VHOST_USER_PROTOCOL_F_STATUS``.
>+

What about adding a new section where we will describe what we mean
with "standalone" devices?

For example that the entire virtio device is emulated in the backend,
etc.

By the way, I was thinking more about F_FULL_DEVICE, but I'm not good
with names, so I'll just throw out an idea :-)

Thanks,
Stefano

>
> Front-end message types
> -----------------------
>@@ -1440,6 +1464,19 @@ Front-end message types
>   query the back-end for its device status as defined in the Virtio
>   specification.
>
>+``VHOST_USER_GET_BACKEND_SPECS``
>+  :id: 41
>+  :request payload: N/A
>+  :reply payload: ``Backend specifications``
>+
>+  When the ``VHOST_USER_PROTOCOL_F_STANDALONE`` protocol feature has been
>+  successfully negotiated, this message is submitted by the front-end to
>+  query the back-end for its capabilities. This is intended to remove
>+  the need for the front-end to know ahead of time what the VirtIO
>+  device the backend emulates is.
>+
>+  The reply contains the device id, size of the config space and the
>+  range of VirtQueues the backend supports.
>
> Back-end message types
> ----------------------
>diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>index c4e0cbd702..28b021d5d3 100644
>--- a/hw/virtio/vhost-user.c
>+++ b/hw/virtio/vhost-user.c
>@@ -202,6 +202,13 @@ typedef struct VhostUserInflight {
>     uint16_t queue_size;
> } VhostUserInflight;
>
>+typedef struct VhostUserBackendSpecs {
>+    uint32_t device_id;
>+    uint32_t config_size;
>+    uint32_t min_vqs;
>+    uint32_t max_vqs;
>+} VhostUserBackendSpecs;
>+
> typedef struct {
>     VhostUserRequest request;
>
>@@ -226,6 +233,7 @@ typedef union {
>         VhostUserCryptoSession session;
>         VhostUserVringArea area;
>         VhostUserInflight inflight;
>+        VhostUserBackendSpecs specs;
> } VhostUserPayload;
>
> typedef struct VhostUserMsg {
>-- 
>2.39.2
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>
Alex Bennée July 4, 2023, 3:02 p.m. UTC | #2
Stefano Garzarella <sgarzare@redhat.com> writes:

> On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
>>Currently QEMU has to know some details about the back-end to be able
>>to setup the guest. While various parts of the setup can be delegated
>>to the backend (for example config handling) this is a very piecemeal
>>approach.
>>
>>This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
>>which the back-end can advertise which allows a probe message to be
>>sent to get all the details QEMU needs to know in one message.
>>
>>Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>
>>---
>>Initial RFC for discussion. I intend to prototype this work with QEMU
>>and one of the rust-vmm vhost-user daemons.
>
> Thanks for starting this discussion!
>
> I'm comparing with vhost-vdpa IOCTLs, so my questions may be
> superficial, but they help me understand the differences.

I did have a quick read-through to get a handle on vhost-vdpa but the
docs are fairly empty. However I see there is more detail in the linux
commit so after looking at that I do wonder:

 * The kernel commit defines a subset of VIRTIO_F feature flags. Should
   we do the same for this interface?

 * The VDPA GET/SET STATUS and GET/SET CONFIG ioctls are already covered
   by the equivalent VHOST_USER messages?

>>---
>> docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
>> hw/virtio/vhost-user.c      |  8 ++++++++
>> 2 files changed, 45 insertions(+)
>>
>>diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>>index 5a070adbc1..85b1b1583a 100644
>>--- a/docs/interop/vhost-user.rst
>>+++ b/docs/interop/vhost-user.rst
>>@@ -275,6 +275,21 @@ Inflight description
>>
>> :queue size: a 16-bit size of virtqueues
>>
>>+Backend specifications
>>+^^^^^^^^^^^^^^^^^^^^^^
>>+
>>++-----------+-------------+------------+------------+
>>+| device id | config size |   min_vqs  |   max_vqs  |
>>++-----------+-------------+------------+------------+
>>+
>>+:device id: a 32-bit value holding the VirtIO device ID
>>+
>>+:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
>>+
>>+:min_vqs: a 32-bit value holding the minimum number of vqs supported
>
> Why do we need the minimum?

We need to know the minimum number because some devices have fixed VQs
that must be present.

>
>>+
>>+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
>
> Is this overlap with VHOST_USER_GET_QUEUE_NUM?

Yes it does and I considered implementing a bunch of messages to fill in
around what we already have. However that seemed like it would add a
bunch of complexity to the interface when we could get all the initial
data in one go.

>
>>+
>> C structure
>> -----------
>>
>>@@ -296,6 +311,7 @@ In QEMU the vhost-user message is implemented with the following struct:
>>           VhostUserConfig config;
>>           VhostUserVringArea area;
>>           VhostUserInflight inflight;
>>+          VhostUserBackendSpecs specs;
>>       };
>>   } QEMU_PACKED VhostUserMsg;
>>
>>@@ -316,6 +332,7 @@ replies. Here is a list of the ones that do:
>> * ``VHOST_USER_GET_VRING_BASE``
>> * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
>> * ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
>>+* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)
>>
>> .. seealso::
>>
>>@@ -885,6 +902,13 @@ Protocol features
>>   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>>   #define VHOST_USER_PROTOCOL_F_STATUS               16
>>   #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
>>+  #define VHOST_USER_PROTOCOL_F_STANDALONE           18
>>+
>>+Some features are only valid in the presence of other supporting
>>+features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
>>+backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
>>+``VHOST_USER_PROTOCOL_F_STATUS``.
>>+
>
> What about adding a new section where we will describe what we mean
> with "standalone" devices?
>
> For example that the entire virtio device is emulated in the backend,
> etc.
>
> By the way, I was thinking more about F_FULL_DEVICE, but I'm not good
> with names, so I'll just throw out an idea :-)

Naming things is hard ;-)

I could add a new section although AIUI there is nothing really in
existing daemons which is split between the front-end and back-end. The
stubs are basically boilerplate and ensure DT/PCIe hubs make the device
appear so once things are discovered QEMU never really gets involved
aside from being a dumb relay.

>
> Thanks,
> Stefano
>
>>
>> Front-end message types
>> -----------------------
>>@@ -1440,6 +1464,19 @@ Front-end message types
>>   query the back-end for its device status as defined in the Virtio
>>   specification.
>>
>>+``VHOST_USER_GET_BACKEND_SPECS``
>>+  :id: 41
>>+  :request payload: N/A
>>+  :reply payload: ``Backend specifications``
>>+
>>+  When the ``VHOST_USER_PROTOCOL_F_STANDALONE`` protocol feature has been
>>+  successfully negotiated, this message is submitted by the front-end to
>>+  query the back-end for its capabilities. This is intended to remove
>>+  the need for the front-end to know ahead of time what the VirtIO
>>+  device the backend emulates is.
>>+
>>+  The reply contains the device id, size of the config space and the
>>+  range of VirtQueues the backend supports.
>>
>> Back-end message types
>> ----------------------
>>diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>index c4e0cbd702..28b021d5d3 100644
>>--- a/hw/virtio/vhost-user.c
>>+++ b/hw/virtio/vhost-user.c
>>@@ -202,6 +202,13 @@ typedef struct VhostUserInflight {
>>     uint16_t queue_size;
>> } VhostUserInflight;
>>
>>+typedef struct VhostUserBackendSpecs {
>>+    uint32_t device_id;
>>+    uint32_t config_size;
>>+    uint32_t min_vqs;
>>+    uint32_t max_vqs;
>>+} VhostUserBackendSpecs;
>>+
>> typedef struct {
>>     VhostUserRequest request;
>>
>>@@ -226,6 +233,7 @@ typedef union {
>>         VhostUserCryptoSession session;
>>         VhostUserVringArea area;
>>         VhostUserInflight inflight;
>>+        VhostUserBackendSpecs specs;
>> } VhostUserPayload;
>>
>> typedef struct VhostUserMsg {
>> -- 2.39.2
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>>For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>
Alex Bennée July 6, 2023, 4:31 p.m. UTC | #3
Alex Bennée <alex.bennee@linaro.org> writes:

> Currently QEMU has to know some details about the back-end to be able
> to setup the guest. While various parts of the setup can be delegated
> to the backend (for example config handling) this is a very piecemeal
> approach.
>
> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> which the back-end can advertise which allows a probe message to be
> sent to get all the details QEMU needs to know in one message.
>
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>
> ---
> Initial RFC for discussion. I intend to prototype this work with QEMU
> and one of the rust-vmm vhost-user daemons.
> ---
>  docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
>  hw/virtio/vhost-user.c      |  8 ++++++++
>  2 files changed, 45 insertions(+)
>
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 5a070adbc1..85b1b1583a 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -275,6 +275,21 @@ Inflight description
>  
>  :queue size: a 16-bit size of virtqueues
>  
> +Backend specifications
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> ++-----------+-------------+------------+------------+
> +| device id | config size |   min_vqs  |   max_vqs  |
> ++-----------+-------------+------------+------------+
> +
> +:device id: a 32-bit value holding the VirtIO device ID
> +
> +:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
> +
> +:min_vqs: a 32-bit value holding the minimum number of vqs supported
> +
> +:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
> +
>  C structure
>  -----------
>  
> @@ -296,6 +311,7 @@ In QEMU the vhost-user message is implemented with the following struct:
>            VhostUserConfig config;
>            VhostUserVringArea area;
>            VhostUserInflight inflight;
> +          VhostUserBackendSpecs specs;
>        };
>    } QEMU_PACKED VhostUserMsg;
>  
> @@ -316,6 +332,7 @@ replies. Here is a list of the ones that do:
>  * ``VHOST_USER_GET_VRING_BASE``
>  * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
>  * ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
> +* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)
>  
>  .. seealso::
>  
> @@ -885,6 +902,13 @@ Protocol features
>    #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>    #define VHOST_USER_PROTOCOL_F_STATUS               16
>    #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
> +  #define VHOST_USER_PROTOCOL_F_STANDALONE           18
> +
> +Some features are only valid in the presence of other supporting
> +features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
> +backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
> +``VHOST_USER_PROTOCOL_F_STATUS``.
> +

This is too tight a restriction as not all VirtIO backends manage a
config space. So I suggest the following:

  Some features are only valid in the presence of other supporting
  features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
  backend must also support ``VHOST_USER_PROTOCOL_F_STATUS`` and
  optionally ``VHOST_USER_PROTOCOL_F_CONFIG`` (if there is a config space).
Michael S. Tsirkin July 6, 2023, 4:48 p.m. UTC | #4
On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> Currently QEMU has to know some details about the back-end to be able
> to setup the guest. While various parts of the setup can be delegated
> to the backend (for example config handling) this is a very piecemeal
> approach.

> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> which the back-end can advertise which allows a probe message to be
> sent to get all the details QEMU needs to know in one message.

The reason we do piecemeal is that these existing pieces can be reused
as others evolve or fall by wayside.

For example, I can think of instances where you want to connect
specifically to e.g. networking backend, and specify it
on command line. Reasons could be many, e.g. for debugging,
or to prevent connecting to wrong device on wrong channel
(kind of like type safety).

What is the reason to have 1 message? startup latency?
How about we allow pipelining several messages then?
Will be easier.


> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> ---
> Initial RFC for discussion. I intend to prototype this work with QEMU
> and one of the rust-vmm vhost-user daemons.
> ---
>  docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
>  hw/virtio/vhost-user.c      |  8 ++++++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 5a070adbc1..85b1b1583a 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -275,6 +275,21 @@ Inflight description
>  
>  :queue size: a 16-bit size of virtqueues
>  
> +Backend specifications
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> ++-----------+-------------+------------+------------+
> +| device id | config size |   min_vqs  |   max_vqs  |
> ++-----------+-------------+------------+------------+
> +
> +:device id: a 32-bit value holding the VirtIO device ID
> +
> +:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
> +
> +:min_vqs: a 32-bit value holding the minimum number of vqs supported
> +
> +:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
> +

looks like a weird set of info.
why would we want # of vqs and not their sizes?
why config size but not config itself?




>  C structure
>  -----------
>  
> @@ -296,6 +311,7 @@ In QEMU the vhost-user message is implemented with the following struct:
>            VhostUserConfig config;
>            VhostUserVringArea area;
>            VhostUserInflight inflight;
> +          VhostUserBackendSpecs specs;
>        };
>    } QEMU_PACKED VhostUserMsg;
>  
> @@ -316,6 +332,7 @@ replies. Here is a list of the ones that do:
>  * ``VHOST_USER_GET_VRING_BASE``
>  * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
>  * ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
> +* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)
>  
>  .. seealso::
>  
> @@ -885,6 +902,13 @@ Protocol features
>    #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>    #define VHOST_USER_PROTOCOL_F_STATUS               16
>    #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
> +  #define VHOST_USER_PROTOCOL_F_STANDALONE           18
> +
> +Some features are only valid in the presence of other supporting
> +features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
> +backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
> +``VHOST_USER_PROTOCOL_F_STATUS``.
> +
>  
>  Front-end message types
>  -----------------------
> @@ -1440,6 +1464,19 @@ Front-end message types
>    query the back-end for its device status as defined in the Virtio
>    specification.
>  
> +``VHOST_USER_GET_BACKEND_SPECS``
> +  :id: 41
> +  :request payload: N/A
> +  :reply payload: ``Backend specifications``
> +
> +  When the ``VHOST_USER_PROTOCOL_F_STANDALONE`` protocol feature has been
> +  successfully negotiated, this message is submitted by the front-end to
> +  query the back-end for its capabilities. This is intended to remove
> +  the need for the front-end to know ahead of time what the VirtIO
> +  device the backend emulates is.
> +
> +  The reply contains the device id, size of the config space and the
> +  range of VirtQueues the backend supports.
>  
>  Back-end message types
>  ----------------------
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index c4e0cbd702..28b021d5d3 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -202,6 +202,13 @@ typedef struct VhostUserInflight {
>      uint16_t queue_size;
>  } VhostUserInflight;
>  
> +typedef struct VhostUserBackendSpecs {
> +    uint32_t device_id;
> +    uint32_t config_size;
> +    uint32_t min_vqs;
> +    uint32_t max_vqs;
> +} VhostUserBackendSpecs;
> +
>  typedef struct {
>      VhostUserRequest request;
>  
> @@ -226,6 +233,7 @@ typedef union {
>          VhostUserCryptoSession session;
>          VhostUserVringArea area;
>          VhostUserInflight inflight;
> +        VhostUserBackendSpecs specs;
>  } VhostUserPayload;
>  
>  typedef struct VhostUserMsg {
> -- 
> 2.39.2
Alex Bennée July 7, 2023, 7:58 a.m. UTC | #5
"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
>> Currently QEMU has to know some details about the back-end to be able
>> to setup the guest. While various parts of the setup can be delegated
>> to the backend (for example config handling) this is a very piecemeal
>> approach.
>
>> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
>> which the back-end can advertise which allows a probe message to be
>> sent to get all the details QEMU needs to know in one message.
>
> The reason we do piecemeal is that these existing pieces can be reused
> as others evolve or fall by wayside.

Sure I have no objection in principle but we then turn code like:

        if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STANDALONE)) {
            err = vhost_user_get_backend_specs(dev, errp);
            if (err < 0) {
                error_setg_errno(errp, EPROTO, "vhost_get_backend_specs failed");
                return -EPROTO;
            }
        }

to

        if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_ID) &&
            dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_CFGSZ) &&
            dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MINVQ) &&
            dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MAXVQ)
        ) {
            err = vhost_user_get_virtio_id(dev, errp);
            if (err < 0) {
                error_setg_errno(errp, EPROTO, "vhost_get_backend_id failed");
                return -EPROTO;
            }
            err = vhost_user_get_virtio_cfgsz(dev, errp);
            if (err < 0) {
                error_setg_errno(errp, EPROTO, "vhost_get_backend_cfgsz failed");
                return -EPROTO;
            }
            err = vhost_user_get_virtio_minvq(dev, errp);
            if (err < 0) {
                error_setg_errno(errp, EPROTO, "vhost_get_backend_minvq failed");
                return -EPROTO;
            }
            err = vhost_user_get_virtio_maxvq(dev, errp);
            if (err < 0) {
                error_setg_errno(errp, EPROTO, "vhost_get_backend_maxvq failed");
                return -EPROTO;
            }
            dev->specs.valid = true;
        }

for little gain IMHO.

> For example, I can think of instances where you want to connect
> specifically to e.g. networking backend, and specify it
> on command line. Reasons could be many, e.g. for debugging,
> or to prevent connecting to wrong device on wrong channel
> (kind of like type safety).

I don't quite follow what you are trying to say here.

> What is the reason to have 1 message? startup latency?
> How about we allow pipelining several messages then?
> Will be easier.

I'm not overly worried about performance because this is all at
start-up. I am worried about excessive complexity though. We already
have quite a lot of interacting protocol messages.

>
>
>> 
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> 
>> ---
>> Initial RFC for discussion. I intend to prototype this work with QEMU
>> and one of the rust-vmm vhost-user daemons.
>> ---
>>  docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
>>  hw/virtio/vhost-user.c      |  8 ++++++++
>>  2 files changed, 45 insertions(+)
>> 
>> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>> index 5a070adbc1..85b1b1583a 100644
>> --- a/docs/interop/vhost-user.rst
>> +++ b/docs/interop/vhost-user.rst
>> @@ -275,6 +275,21 @@ Inflight description
>>  
>>  :queue size: a 16-bit size of virtqueues
>>  
>> +Backend specifications
>> +^^^^^^^^^^^^^^^^^^^^^^
>> +
>> ++-----------+-------------+------------+------------+
>> +| device id | config size |   min_vqs  |   max_vqs  |
>> ++-----------+-------------+------------+------------+
>> +
>> +:device id: a 32-bit value holding the VirtIO device ID
>> +
>> +:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
>> +
>> +:min_vqs: a 32-bit value holding the minimum number of vqs supported
>> +
>> +:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
>> +
>
> looks like a weird set of info.

It's basically the information you need for -device vhost-user-device to
start-up (and what is essentially the information set by the stubs as
they start-up).

> why would we want # of vqs and not their sizes?

I thought the vring's themselves where allocated by the driver. We only
need to the number of vqs so we can allocate the tracking structures.

> why config size but not config itself?

We already have GET_CONFIG and SET_CONFIG but without knowing the size
of the config space we can't properly set it up.

<snip>
Michael S. Tsirkin July 7, 2023, 9:57 a.m. UTC | #6
On Fri, Jul 07, 2023 at 08:58:00AM +0100, Alex Bennée wrote:
> 
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> >> Currently QEMU has to know some details about the back-end to be able
> >> to setup the guest. While various parts of the setup can be delegated
> >> to the backend (for example config handling) this is a very piecemeal
> >> approach.
> >
> >> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> >> which the back-end can advertise which allows a probe message to be
> >> sent to get all the details QEMU needs to know in one message.
> >
> > The reason we do piecemeal is that these existing pieces can be reused
> > as others evolve or fall by wayside.
> 
> Sure I have no objection in principle but we then turn code like:
> 
>         if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STANDALONE)) {
>             err = vhost_user_get_backend_specs(dev, errp);
>             if (err < 0) {
>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_specs failed");
>                 return -EPROTO;
>             }
>         }
> 
> to
> 
>         if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_ID) &&
>             dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_CFGSZ) &&
>             dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MINVQ) &&
>             dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MAXVQ)
>         ) {
>             err = vhost_user_get_virtio_id(dev, errp);
>             if (err < 0) {
>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_id failed");
>                 return -EPROTO;
>             }
>             err = vhost_user_get_virtio_cfgsz(dev, errp);
>             if (err < 0) {
>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_cfgsz failed");
>                 return -EPROTO;
>             }
>             err = vhost_user_get_virtio_minvq(dev, errp);
>             if (err < 0) {
>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_minvq failed");
>                 return -EPROTO;
>             }
>             err = vhost_user_get_virtio_maxvq(dev, errp);
>             if (err < 0) {
>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_maxvq failed");
>                 return -EPROTO;
>             }
>             dev->specs.valid = true;
>         }
> 
> for little gain IMHO.
>
> > For example, I can think of instances where you want to connect
> > specifically to e.g. networking backend, and specify it
> > on command line. Reasons could be many, e.g. for debugging,
> > or to prevent connecting to wrong device on wrong channel
> > (kind of like type safety).
> 
> I don't quite follow what you are trying to say here.

That some or all of these might be better on qemu command line
not come from backend. Then we'll want to *send* it to backend.
All this at our discretion without protocol changes.


> > What is the reason to have 1 message? startup latency?
> > How about we allow pipelining several messages then?
> > Will be easier.
> 
> I'm not overly worried about performance because this is all at
> start-up. I am worried about excessive complexity though. We already
> have quite a lot of interacting protocol messages.
> 
> >
> >
> >> 
> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >> 
> >> ---
> >> Initial RFC for discussion. I intend to prototype this work with QEMU
> >> and one of the rust-vmm vhost-user daemons.
> >> ---
> >>  docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
> >>  hw/virtio/vhost-user.c      |  8 ++++++++
> >>  2 files changed, 45 insertions(+)
> >> 
> >> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> >> index 5a070adbc1..85b1b1583a 100644
> >> --- a/docs/interop/vhost-user.rst
> >> +++ b/docs/interop/vhost-user.rst
> >> @@ -275,6 +275,21 @@ Inflight description
> >>  
> >>  :queue size: a 16-bit size of virtqueues
> >>  
> >> +Backend specifications
> >> +^^^^^^^^^^^^^^^^^^^^^^
> >> +
> >> ++-----------+-------------+------------+------------+
> >> +| device id | config size |   min_vqs  |   max_vqs  |
> >> ++-----------+-------------+------------+------------+
> >> +
> >> +:device id: a 32-bit value holding the VirtIO device ID
> >> +
> >> +:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
> >> +
> >> +:min_vqs: a 32-bit value holding the minimum number of vqs supported
> >> +
> >> +:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
> >> +
> >
> > looks like a weird set of info.
> 
> It's basically the information you need for -device vhost-user-device to
> start-up (and what is essentially the information set by the stubs as
> they start-up).
> 
> > why would we want # of vqs and not their sizes?
> 
> I thought the vring's themselves where allocated by the driver. We only
> need to the number of vqs so we can allocate the tracking structures.

size is specified by device though

> > why config size but not config itself?
> 
> We already have GET_CONFIG and SET_CONFIG but without knowing the size
> of the config space we can't properly set it up.

I don't get it. each message includes size already.

> <snip>
> 
> -- 
> Alex Bennée
> Virtualisation Tech Lead @ Linaro
Stefano Garzarella July 7, 2023, 10:27 a.m. UTC | #7
On Tue, Jul 04, 2023 at 04:02:42PM +0100, Alex Bennée wrote:
>
>Stefano Garzarella <sgarzare@redhat.com> writes:
>
>> On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
>>>Currently QEMU has to know some details about the back-end to be able
>>>to setup the guest. While various parts of the setup can be delegated
>>>to the backend (for example config handling) this is a very piecemeal
>>>approach.
>>>
>>>This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
>>>which the back-end can advertise which allows a probe message to be
>>>sent to get all the details QEMU needs to know in one message.
>>>
>>>Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>>
>>>---
>>>Initial RFC for discussion. I intend to prototype this work with QEMU
>>>and one of the rust-vmm vhost-user daemons.
>>
>> Thanks for starting this discussion!
>>
>> I'm comparing with vhost-vdpa IOCTLs, so my questions may be
>> superficial, but they help me understand the differences.
>
>I did have a quick read-through to get a handle on vhost-vdpa but the
>docs are fairly empty. However I see there is more detail in the linux
>commit so after looking at that I do wonder:
>
> * The kernel commit defines a subset of VIRTIO_F feature flags. Should
>   we do the same for this interface?

Sorry, I didn't get this.

Do you mean that the kernel is filtering some features?
Or are you talking about backend features?

>
> * The VDPA GET/SET STATUS and GET/SET CONFIG ioctls are already covered
>   by the equivalent VHOST_USER messages?

Yep, I think so.

>
>>>---
>>> docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
>>> hw/virtio/vhost-user.c      |  8 ++++++++
>>> 2 files changed, 45 insertions(+)
>>>
>>>diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>>>index 5a070adbc1..85b1b1583a 100644
>>>--- a/docs/interop/vhost-user.rst
>>>+++ b/docs/interop/vhost-user.rst
>>>@@ -275,6 +275,21 @@ Inflight description
>>>
>>> :queue size: a 16-bit size of virtqueues
>>>
>>>+Backend specifications
>>>+^^^^^^^^^^^^^^^^^^^^^^
>>>+
>>>++-----------+-------------+------------+------------+
>>>+| device id | config size |   min_vqs  |   max_vqs  |
>>>++-----------+-------------+------------+------------+
>>>+
>>>+:device id: a 32-bit value holding the VirtIO device ID
>>>+
>>>+:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
>>>+
>>>+:min_vqs: a 32-bit value holding the minimum number of vqs supported
>>
>> Why do we need the minimum?
>
>We need to know the minimum number because some devices have fixed VQs
>that must be present.

But does QEMU need to know this?

Or is it okay that the driver will then fail in the guest if there
are not the right number of queues?

>
>>
>>>+
>>>+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
>>
>> Is this overlap with VHOST_USER_GET_QUEUE_NUM?
>
>Yes it does and I considered implementing a bunch of messages to fill in
>around what we already have. However that seemed like it would add a
>bunch of complexity to the interface when we could get all the initial
>data in one go.

Yes I understand, though if we need to add new things to return in the
future how do we do it? Do we need to provide features for this
structure?

>
>>
>>>+
>>> C structure
>>> -----------
>>>
>>>@@ -296,6 +311,7 @@ In QEMU the vhost-user message is implemented with the following struct:
>>>           VhostUserConfig config;
>>>           VhostUserVringArea area;
>>>           VhostUserInflight inflight;
>>>+          VhostUserBackendSpecs specs;
>>>       };
>>>   } QEMU_PACKED VhostUserMsg;
>>>
>>>@@ -316,6 +332,7 @@ replies. Here is a list of the ones that do:
>>> * ``VHOST_USER_GET_VRING_BASE``
>>> * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
>>> * ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
>>>+* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)
>>>
>>> .. seealso::
>>>
>>>@@ -885,6 +902,13 @@ Protocol features
>>>   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>>>   #define VHOST_USER_PROTOCOL_F_STATUS               16
>>>   #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
>>>+  #define VHOST_USER_PROTOCOL_F_STANDALONE           18
>>>+
>>>+Some features are only valid in the presence of other supporting
>>>+features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
>>>+backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
>>>+``VHOST_USER_PROTOCOL_F_STATUS``.
>>>+
>>
>> What about adding a new section where we will describe what we mean
>> with "standalone" devices?
>>
>> For example that the entire virtio device is emulated in the backend,
>> etc.
>>
>> By the way, I was thinking more about F_FULL_DEVICE, but I'm not good
>> with names, so I'll just throw out an idea :-)
>
>Naming things is hard ;-)

I know :-)

>
>I could add a new section although AIUI there is nothing really in
>existing daemons which is split between the front-end and back-end. The
>stubs are basically boilerplate and ensure DT/PCIe hubs make the device
>appear so once things are discovered QEMU never really gets involved
>aside from being a dumb relay.

For the backend I don't think there is much to say, but for the frontend
it changes a lot in my opinion if this new feature is supported, since
we basically offlaod the whole virtio device to the external process.

Thanks,
Stefano
Stefano Garzarella July 7, 2023, 10:35 a.m. UTC | #8
On Thu, Jul 06, 2023 at 05:31:15PM +0100, Alex Bennée wrote:
>
>Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Currently QEMU has to know some details about the back-end to be able
>> to setup the guest. While various parts of the setup can be delegated
>> to the backend (for example config handling) this is a very piecemeal
>> approach.
>>
>> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
>> which the back-end can advertise which allows a probe message to be
>> sent to get all the details QEMU needs to know in one message.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>>
>> ---
>> Initial RFC for discussion. I intend to prototype this work with QEMU
>> and one of the rust-vmm vhost-user daemons.
>> ---
>>  docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
>>  hw/virtio/vhost-user.c      |  8 ++++++++
>>  2 files changed, 45 insertions(+)
>>
>> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
>> index 5a070adbc1..85b1b1583a 100644
>> --- a/docs/interop/vhost-user.rst
>> +++ b/docs/interop/vhost-user.rst
>> @@ -275,6 +275,21 @@ Inflight description
>>
>>  :queue size: a 16-bit size of virtqueues
>>
>> +Backend specifications
>> +^^^^^^^^^^^^^^^^^^^^^^
>> +
>> ++-----------+-------------+------------+------------+
>> +| device id | config size |   min_vqs  |   max_vqs  |
>> ++-----------+-------------+------------+------------+
>> +
>> +:device id: a 32-bit value holding the VirtIO device ID
>> +
>> +:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
>> +
>> +:min_vqs: a 32-bit value holding the minimum number of vqs supported
>> +
>> +:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
>> +
>>  C structure
>>  -----------
>>
>> @@ -296,6 +311,7 @@ In QEMU the vhost-user message is implemented with the following struct:
>>            VhostUserConfig config;
>>            VhostUserVringArea area;
>>            VhostUserInflight inflight;
>> +          VhostUserBackendSpecs specs;
>>        };
>>    } QEMU_PACKED VhostUserMsg;
>>
>> @@ -316,6 +332,7 @@ replies. Here is a list of the ones that do:
>>  * ``VHOST_USER_GET_VRING_BASE``
>>  * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
>>  * ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
>> +* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)
>>
>>  .. seealso::
>>
>> @@ -885,6 +902,13 @@ Protocol features
>>    #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
>>    #define VHOST_USER_PROTOCOL_F_STATUS               16
>>    #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
>> +  #define VHOST_USER_PROTOCOL_F_STANDALONE           18
>> +
>> +Some features are only valid in the presence of other supporting
>> +features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
>> +backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
>> +``VHOST_USER_PROTOCOL_F_STATUS``.
>> +
>
>This is too tight a restriction as not all VirtIO backends manage a
>config space. So I suggest the following:
>
>  Some features are only valid in the presence of other supporting
>  features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
>  backend must also support ``VHOST_USER_PROTOCOL_F_STATUS`` and
>  optionally ``VHOST_USER_PROTOCOL_F_CONFIG`` (if there is a config space).

Right, but could we describe it more as a kind of dependence between
features?

Something like this:

Some features depend on others to be supported:

* ``VHOST_USER_PROTOCOL_F_STANDALONE`` depends on:

   * ``VHOST_USER_PROTOCOL_F_STATUS``
   * ``VHOST_USER_PROTOCOL_F_CONFIG`` (if there is a config space)

Thanks,
Stefano
Alex Bennée July 7, 2023, 1:12 p.m. UTC | #9
"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Fri, Jul 07, 2023 at 08:58:00AM +0100, Alex Bennée wrote:
>> 
>> "Michael S. Tsirkin" <mst@redhat.com> writes:
>> 
>> > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
>> >> Currently QEMU has to know some details about the back-end to be able
>> >> to setup the guest. While various parts of the setup can be delegated
>> >> to the backend (for example config handling) this is a very piecemeal
>> >> approach.
>> >
>> >> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
>> >> which the back-end can advertise which allows a probe message to be
>> >> sent to get all the details QEMU needs to know in one message.
>> >
>> > The reason we do piecemeal is that these existing pieces can be reused
>> > as others evolve or fall by wayside.
>> 
>> Sure I have no objection in principle but we then turn code like:
>> 
>>         if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_STANDALONE)) {
>>             err = vhost_user_get_backend_specs(dev, errp);
>>             if (err < 0) {
>>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_specs failed");
>>                 return -EPROTO;
>>             }
>>         }
>> 
>> to
>> 
>>         if (dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_ID) &&
>>             dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_CFGSZ) &&
>>             dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MINVQ) &&
>>             dev->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MAXVQ)
>>         ) {
>>             err = vhost_user_get_virtio_id(dev, errp);
>>             if (err < 0) {
>>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_id failed");
>>                 return -EPROTO;
>>             }
>>             err = vhost_user_get_virtio_cfgsz(dev, errp);
>>             if (err < 0) {
>>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_cfgsz failed");
>>                 return -EPROTO;
>>             }
>>             err = vhost_user_get_virtio_minvq(dev, errp);
>>             if (err < 0) {
>>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_minvq failed");
>>                 return -EPROTO;
>>             }
>>             err = vhost_user_get_virtio_maxvq(dev, errp);
>>             if (err < 0) {
>>                 error_setg_errno(errp, EPROTO, "vhost_get_backend_maxvq failed");
>>                 return -EPROTO;
>>             }
>>             dev->specs.valid = true;
>>         }
>> 
>> for little gain IMHO.
>>
>> > For example, I can think of instances where you want to connect
>> > specifically to e.g. networking backend, and specify it
>> > on command line. Reasons could be many, e.g. for debugging,
>> > or to prevent connecting to wrong device on wrong channel
>> > (kind of like type safety).
>> 
>> I don't quite follow what you are trying to say here.
>
> That some or all of these might be better on qemu command line
> not come from backend. Then we'll want to *send* it to backend.
> All this at our discretion without protocol changes.

That doesn't solve the standalone problem though (not all VMM's are QEMU
after all). I'm currently putting together a PoC with the
vhost-user-device and I was intending:

 - no CLI args, probe and if nothing fail
 - CLI args, probe with no response, continue with CLI args
 - CLI args, probe with response, check args match (or in bounds for
   vqs) and fail if not

Stefan wasn't super keen on the vhost-user-device in v2 being user
creatable because things could go weird quite quickly in hard to debug
ways:

  Message-Id: <20230418162140.373219-1-alex.bennee@linaro.org>
  Date: Tue, 18 Apr 2023 17:21:27 +0100
  Subject: [PATCH v2 00/13] virtio: add vhost-user-generic and reduce copy and paste
  From: =?UTF-8?q?Alex=20Benn=C3=A9e?= <alex.bennee@linaro.org>

However it certainly is useful from a development point of view being
able to plug in new VirtIO backends without having to copy and paste
another slightly different stub into QEMU. I was pondering a middle
ground of maybe making the CLI options all x- variants to emphasise the
"here be dragons please know what you are doing" aspect of them.
Stefan Hajnoczi July 20, 2023, 7:32 p.m. UTC | #10
On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> Currently QEMU has to know some details about the back-end to be able
> to setup the guest. While various parts of the setup can be delegated
> to the backend (for example config handling) this is a very piecemeal
> approach.
> 
> This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> which the back-end can advertise which allows a probe message to be
> sent to get all the details QEMU needs to know in one message.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> ---
> Initial RFC for discussion. I intend to prototype this work with QEMU
> and one of the rust-vmm vhost-user daemons.
> ---
>  docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
>  hw/virtio/vhost-user.c      |  8 ++++++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 5a070adbc1..85b1b1583a 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -275,6 +275,21 @@ Inflight description
>  
>  :queue size: a 16-bit size of virtqueues
>  
> +Backend specifications
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> ++-----------+-------------+------------+------------+
> +| device id | config size |   min_vqs  |   max_vqs  |
> ++-----------+-------------+------------+------------+
> +
> +:device id: a 32-bit value holding the VirtIO device ID
> +
> +:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
> +
> +:min_vqs: a 32-bit value holding the minimum number of vqs supported

What is the purpose of min_vqs? I'm not sure why the front-end needs to
know this.
Stefan Hajnoczi July 20, 2023, 7:34 p.m. UTC | #11
On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index c4e0cbd702..28b021d5d3 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -202,6 +202,13 @@ typedef struct VhostUserInflight {
>      uint16_t queue_size;
>  } VhostUserInflight;
>  
> +typedef struct VhostUserBackendSpecs {
> +    uint32_t device_id;
> +    uint32_t config_size;
> +    uint32_t min_vqs;

You already answered my question about min_vqs in another sub-thread.
I'll continue there. Please ignore my question.

Stefan
Stefan Hajnoczi July 20, 2023, 7:36 p.m. UTC | #12
On Fri, Jul 07, 2023 at 12:27:39PM +0200, Stefano Garzarella wrote:
> On Tue, Jul 04, 2023 at 04:02:42PM +0100, Alex Bennée wrote:
> > 
> > Stefano Garzarella <sgarzare@redhat.com> writes:
> > 
> > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > > > index 5a070adbc1..85b1b1583a 100644
> > > > --- a/docs/interop/vhost-user.rst
> > > > +++ b/docs/interop/vhost-user.rst
> > > > @@ -275,6 +275,21 @@ Inflight description
> > > > 
> > > > :queue size: a 16-bit size of virtqueues
> > > > 
> > > > +Backend specifications
> > > > +^^^^^^^^^^^^^^^^^^^^^^
> > > > +
> > > > ++-----------+-------------+------------+------------+
> > > > +| device id | config size |   min_vqs  |   max_vqs  |
> > > > ++-----------+-------------+------------+------------+
> > > > +
> > > > +:device id: a 32-bit value holding the VirtIO device ID
> > > > +
> > > > +:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
> > > > +
> > > > +:min_vqs: a 32-bit value holding the minimum number of vqs supported
> > > 
> > > Why do we need the minimum?
> > 
> > We need to know the minimum number because some devices have fixed VQs
> > that must be present.
> 
> But does QEMU need to know this?
> 
> Or is it okay that the driver will then fail in the guest if there
> are not the right number of queues?

I don't understand why min_vqs is needed either. It's not the
front-end's job to ensure that the device will be used properly. A
spec-compliant driver will work with a spec-compliant device, so it's
not clear why the front-end needs this information.

Stefan
Stefan Hajnoczi July 20, 2023, 7:58 p.m. UTC | #13
On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > Currently QEMU has to know some details about the back-end to be able
> > to setup the guest. While various parts of the setup can be delegated
> > to the backend (for example config handling) this is a very piecemeal
> > approach.
> 
> > This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > which the back-end can advertise which allows a probe message to be
> > sent to get all the details QEMU needs to know in one message.
> 
> The reason we do piecemeal is that these existing pieces can be reused
> as others evolve or fall by wayside.
> 
> For example, I can think of instances where you want to connect
> specifically to e.g. networking backend, and specify it
> on command line. Reasons could be many, e.g. for debugging,
> or to prevent connecting to wrong device on wrong channel
> (kind of like type safety).
> 
> What is the reason to have 1 message? startup latency?
> How about we allow pipelining several messages then?
> Will be easier.

This flag effectively says that the back-end is a full VIRTIO device
with a Device Status Register, Configuration Space, Virtqueues, the
device type, etc. This is different from previous vhost-user devices
which sometimes just offloaded certain virtqueues without providing the
full VIRTIO device (parts were emulated in the VMM).

So for example, a vhost-user-net device does not support the controlq.
Alex's "standalone" device is a mode where the vhost-user protocol is
used but the back-end must implement a full virtio-net device.
Standalone devices are like vDPA device in this respect.

I think it is important to have a protocol feature bit that advertises
that this is a standalone device, since the semantics are different for
traditional vhost-user-net devices.

However, I think having a single message is inflexible and duplicates
existing vhost-user protocol messages like VHOST_USER_GET_QUEUE_NUM. I
would prefer VHOST_USER_GET_DEVICE_ID and other messages.

Stefan
Michael S. Tsirkin July 20, 2023, 9:14 p.m. UTC | #14
On Thu, Jul 20, 2023 at 03:58:37PM -0400, Stefan Hajnoczi wrote:
> On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > Currently QEMU has to know some details about the back-end to be able
> > > to setup the guest. While various parts of the setup can be delegated
> > > to the backend (for example config handling) this is a very piecemeal
> > > approach.
> > 
> > > This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > > which the back-end can advertise which allows a probe message to be
> > > sent to get all the details QEMU needs to know in one message.
> > 
> > The reason we do piecemeal is that these existing pieces can be reused
> > as others evolve or fall by wayside.
> > 
> > For example, I can think of instances where you want to connect
> > specifically to e.g. networking backend, and specify it
> > on command line. Reasons could be many, e.g. for debugging,
> > or to prevent connecting to wrong device on wrong channel
> > (kind of like type safety).
> > 
> > What is the reason to have 1 message? startup latency?
> > How about we allow pipelining several messages then?
> > Will be easier.
> 
> This flag effectively says that the back-end is a full VIRTIO device
> with a Device Status Register, Configuration Space, Virtqueues, the
> device type, etc. This is different from previous vhost-user devices
> which sometimes just offloaded certain virtqueues without providing the
> full VIRTIO device (parts were emulated in the VMM).
> 
> So for example, a vhost-user-net device does not support the controlq.
> Alex's "standalone" device is a mode where the vhost-user protocol is
> used but the back-end must implement a full virtio-net device.
> Standalone devices are like vDPA device in this respect.
> 
> I think it is important to have a protocol feature bit that advertises
> that this is a standalone device, since the semantics are different for
> traditional vhost-user-net devices.

Not sure what that would gain as compared to a feature bit per
message as we did previously.

> However, I think having a single message is inflexible and duplicates
> existing vhost-user protocol messages like VHOST_USER_GET_QUEUE_NUM. I
> would prefer VHOST_USER_GET_DEVICE_ID and other messages.
> 
> Stefan

Exactly.
Stefan Hajnoczi July 20, 2023, 9:31 p.m. UTC | #15
On Thu, 20 Jul 2023 at 17:15, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jul 20, 2023 at 03:58:37PM -0400, Stefan Hajnoczi wrote:
> > On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > Currently QEMU has to know some details about the back-end to be able
> > > > to setup the guest. While various parts of the setup can be delegated
> > > > to the backend (for example config handling) this is a very piecemeal
> > > > approach.
> > >
> > > > This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > > > which the back-end can advertise which allows a probe message to be
> > > > sent to get all the details QEMU needs to know in one message.
> > >
> > > The reason we do piecemeal is that these existing pieces can be reused
> > > as others evolve or fall by wayside.
> > >
> > > For example, I can think of instances where you want to connect
> > > specifically to e.g. networking backend, and specify it
> > > on command line. Reasons could be many, e.g. for debugging,
> > > or to prevent connecting to wrong device on wrong channel
> > > (kind of like type safety).
> > >
> > > What is the reason to have 1 message? startup latency?
> > > How about we allow pipelining several messages then?
> > > Will be easier.
> >
> > This flag effectively says that the back-end is a full VIRTIO device
> > with a Device Status Register, Configuration Space, Virtqueues, the
> > device type, etc. This is different from previous vhost-user devices
> > which sometimes just offloaded certain virtqueues without providing the
> > full VIRTIO device (parts were emulated in the VMM).
> >
> > So for example, a vhost-user-net device does not support the controlq.
> > Alex's "standalone" device is a mode where the vhost-user protocol is
> > used but the back-end must implement a full virtio-net device.
> > Standalone devices are like vDPA device in this respect.
> >
> > I think it is important to have a protocol feature bit that advertises
> > that this is a standalone device, since the semantics are different for
> > traditional vhost-user-net devices.
>
> Not sure what that would gain as compared to a feature bit per
> message as we did previously.

Having a single feature bit makes it easier to distinguish between a
traditional vhost-user device and a standalone device.

For example, the presence of VHOST_USER_F_GET_DEVICE_ID doesn't tell
you whether this device is a standalone device that is appropriate for
a new generic QEMU --device vhost-user-device feature that Alex is
working on. It could be a traditional vhost-user device that is not
standalone but implements the VHOST_USER_GET_DEVICE_ID message.

How will we detect standalone devices? It will be messy if there is no
single feature bit that advertises that this back-end is a standalone
device.

Stefan
Michael S. Tsirkin July 20, 2023, 10:22 p.m. UTC | #16
On Thu, Jul 20, 2023 at 05:31:03PM -0400, Stefan Hajnoczi wrote:
> On Thu, 20 Jul 2023 at 17:15, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Jul 20, 2023 at 03:58:37PM -0400, Stefan Hajnoczi wrote:
> > > On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > > Currently QEMU has to know some details about the back-end to be able
> > > > > to setup the guest. While various parts of the setup can be delegated
> > > > > to the backend (for example config handling) this is a very piecemeal
> > > > > approach.
> > > >
> > > > > This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > > > > which the back-end can advertise which allows a probe message to be
> > > > > sent to get all the details QEMU needs to know in one message.
> > > >
> > > > The reason we do piecemeal is that these existing pieces can be reused
> > > > as others evolve or fall by wayside.
> > > >
> > > > For example, I can think of instances where you want to connect
> > > > specifically to e.g. networking backend, and specify it
> > > > on command line. Reasons could be many, e.g. for debugging,
> > > > or to prevent connecting to wrong device on wrong channel
> > > > (kind of like type safety).
> > > >
> > > > What is the reason to have 1 message? startup latency?
> > > > How about we allow pipelining several messages then?
> > > > Will be easier.
> > >
> > > This flag effectively says that the back-end is a full VIRTIO device
> > > with a Device Status Register, Configuration Space, Virtqueues, the
> > > device type, etc. This is different from previous vhost-user devices
> > > which sometimes just offloaded certain virtqueues without providing the
> > > full VIRTIO device (parts were emulated in the VMM).
> > >
> > > So for example, a vhost-user-net device does not support the controlq.
> > > Alex's "standalone" device is a mode where the vhost-user protocol is
> > > used but the back-end must implement a full virtio-net device.
> > > Standalone devices are like vDPA device in this respect.
> > >
> > > I think it is important to have a protocol feature bit that advertises
> > > that this is a standalone device, since the semantics are different for
> > > traditional vhost-user-net devices.
> >
> > Not sure what that would gain as compared to a feature bit per
> > message as we did previously.
> 
> Having a single feature bit makes it easier to distinguish between a
> traditional vhost-user device and a standalone device.
> 
> For example, the presence of VHOST_USER_F_GET_DEVICE_ID doesn't tell
> you whether this device is a standalone device that is appropriate for
> a new generic QEMU --device vhost-user-device feature that Alex is
> working on. It could be a traditional vhost-user device that is not
> standalone but implements the VHOST_USER_GET_DEVICE_ID message.
> 
> How will we detect standalone devices? It will be messy if there is no
> single feature bit that advertises that this back-end is a standalone
> device.
> 
> Stefan

Looks like standalone implies some 5-6 messages to be supported.
So just test the 6 bits are all ones.
Stefan Hajnoczi July 24, 2023, 6:08 p.m. UTC | #17
On Thu, Jul 20, 2023 at 06:22:08PM -0400, Michael S. Tsirkin wrote:
> On Thu, Jul 20, 2023 at 05:31:03PM -0400, Stefan Hajnoczi wrote:
> > On Thu, 20 Jul 2023 at 17:15, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Jul 20, 2023 at 03:58:37PM -0400, Stefan Hajnoczi wrote:
> > > > On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> > > > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > > > Currently QEMU has to know some details about the back-end to be able
> > > > > > to setup the guest. While various parts of the setup can be delegated
> > > > > > to the backend (for example config handling) this is a very piecemeal
> > > > > > approach.
> > > > >
> > > > > > This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > > > > > which the back-end can advertise which allows a probe message to be
> > > > > > sent to get all the details QEMU needs to know in one message.
> > > > >
> > > > > The reason we do piecemeal is that these existing pieces can be reused
> > > > > as others evolve or fall by wayside.
> > > > >
> > > > > For example, I can think of instances where you want to connect
> > > > > specifically to e.g. networking backend, and specify it
> > > > > on command line. Reasons could be many, e.g. for debugging,
> > > > > or to prevent connecting to wrong device on wrong channel
> > > > > (kind of like type safety).
> > > > >
> > > > > What is the reason to have 1 message? startup latency?
> > > > > How about we allow pipelining several messages then?
> > > > > Will be easier.
> > > >
> > > > This flag effectively says that the back-end is a full VIRTIO device
> > > > with a Device Status Register, Configuration Space, Virtqueues, the
> > > > device type, etc. This is different from previous vhost-user devices
> > > > which sometimes just offloaded certain virtqueues without providing the
> > > > full VIRTIO device (parts were emulated in the VMM).
> > > >
> > > > So for example, a vhost-user-net device does not support the controlq.
> > > > Alex's "standalone" device is a mode where the vhost-user protocol is
> > > > used but the back-end must implement a full virtio-net device.
> > > > Standalone devices are like vDPA device in this respect.
> > > >
> > > > I think it is important to have a protocol feature bit that advertises
> > > > that this is a standalone device, since the semantics are different for
> > > > traditional vhost-user-net devices.
> > >
> > > Not sure what that would gain as compared to a feature bit per
> > > message as we did previously.
> > 
> > Having a single feature bit makes it easier to distinguish between a
> > traditional vhost-user device and a standalone device.
> > 
> > For example, the presence of VHOST_USER_F_GET_DEVICE_ID doesn't tell
> > you whether this device is a standalone device that is appropriate for
> > a new generic QEMU --device vhost-user-device feature that Alex is
> > working on. It could be a traditional vhost-user device that is not
> > standalone but implements the VHOST_USER_GET_DEVICE_ID message.
> > 
> > How will we detect standalone devices? It will be messy if there is no
> > single feature bit that advertises that this back-end is a standalone
> > device.
> > 
> > Stefan
> 
> Looks like standalone implies some 5-6 messages to be supported.
> So just test the 6 bits are all ones.

It's not clear to me that the individual bits together mean this is
really a standalone device, but let's go with individual commands and
see if a front-end can distinguish standalone devices or not. If not,
then we can still add "standalone" feature bit before merging the code.

Stefan
Erik Schilling July 26, 2023, 2:33 p.m. UTC | #18
On Tue Jul 4, 2023 at 4:54 PM CEST, Stefano Garzarella wrote:
> On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> >Currently QEMU has to know some details about the back-end to be able
> >to setup the guest. While various parts of the setup can be delegated
> >to the backend (for example config handling) this is a very piecemeal
> >approach.
> >
> >This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> >which the back-end can advertise which allows a probe message to be
> >sent to get all the details QEMU needs to know in one message.
> >
> >Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> >
> >---
> >Initial RFC for discussion. I intend to prototype this work with QEMU
> >and one of the rust-vmm vhost-user daemons.
>
> Thanks for starting this discussion!
>
> I'm comparing with vhost-vdpa IOCTLs, so my questions may be
> superficial, but they help me understand the differences.
>
> >---
> > docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
> > hw/virtio/vhost-user.c      |  8 ++++++++
> > 2 files changed, 45 insertions(+)
> >
> >diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> >index 5a070adbc1..85b1b1583a 100644
> >--- a/docs/interop/vhost-user.rst
> >+++ b/docs/interop/vhost-user.rst
> >@@ -275,6 +275,21 @@ Inflight description
> >
> > :queue size: a 16-bit size of virtqueues
> >
> >+Backend specifications
> >+^^^^^^^^^^^^^^^^^^^^^^
> >+
> >++-----------+-------------+------------+------------+
> >+| device id | config size |   min_vqs  |   max_vqs  |
> >++-----------+-------------+------------+------------+
> >+
> >+:device id: a 32-bit value holding the VirtIO device ID
> >+
> >+:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
> >+
> >+:min_vqs: a 32-bit value holding the minimum number of vqs supported
>
> Why do we need the minimum?
>
> >+
> >+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
>
> Is this overlap with VHOST_USER_GET_QUEUE_NUM?

While fiddeling with a rust-vmm implementation of this I wondered:

Would a standalone daemon even need VHOST_USER_PROTOCOL_F_MQ if
VHOST_USER_PROTOCOL_F_CONFIG is required anyway? It looks like
all full virtio devices provide config information that allows to
derive the exact or maximum number of queues already. So wouldn't
VHOST_USER_PROTOCOL_F_MQ just provide a second, redundant way to get
the information? (And this would be a third?)

Am I missing something here? Otherwise, I think we could drop dependency
on VHOST_USER_PROTOCOL_F_MQ and get the info from the device config for
standalone daemons?

- Erik
Stefan Hajnoczi July 26, 2023, 3:51 p.m. UTC | #19
On Wed, 26 Jul 2023 at 11:42, Erik Schilling <erik.schilling@linaro.org> wrote:
>
> On Tue Jul 4, 2023 at 4:54 PM CEST, Stefano Garzarella wrote:
> > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > >Currently QEMU has to know some details about the back-end to be able
> > >to setup the guest. While various parts of the setup can be delegated
> > >to the backend (for example config handling) this is a very piecemeal
> > >approach.
> > >
> > >This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > >which the back-end can advertise which allows a probe message to be
> > >sent to get all the details QEMU needs to know in one message.
> > >
> > >Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> > >
> > >---
> > >Initial RFC for discussion. I intend to prototype this work with QEMU
> > >and one of the rust-vmm vhost-user daemons.
> >
> > Thanks for starting this discussion!
> >
> > I'm comparing with vhost-vdpa IOCTLs, so my questions may be
> > superficial, but they help me understand the differences.
> >
> > >---
> > > docs/interop/vhost-user.rst | 37 +++++++++++++++++++++++++++++++++++++
> > > hw/virtio/vhost-user.c      |  8 ++++++++
> > > 2 files changed, 45 insertions(+)
> > >
> > >diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > >index 5a070adbc1..85b1b1583a 100644
> > >--- a/docs/interop/vhost-user.rst
> > >+++ b/docs/interop/vhost-user.rst
> > >@@ -275,6 +275,21 @@ Inflight description
> > >
> > > :queue size: a 16-bit size of virtqueues
> > >
> > >+Backend specifications
> > >+^^^^^^^^^^^^^^^^^^^^^^
> > >+
> > >++-----------+-------------+------------+------------+
> > >+| device id | config size |   min_vqs  |   max_vqs  |
> > >++-----------+-------------+------------+------------+
> > >+
> > >+:device id: a 32-bit value holding the VirtIO device ID
> > >+
> > >+:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
> > >+
> > >+:min_vqs: a 32-bit value holding the minimum number of vqs supported
> >
> > Why do we need the minimum?
> >
> > >+
> > >+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
> >
> > Is this overlap with VHOST_USER_GET_QUEUE_NUM?
>
> While fiddeling with a rust-vmm implementation of this I wondered:
>
> Would a standalone daemon even need VHOST_USER_PROTOCOL_F_MQ if
> VHOST_USER_PROTOCOL_F_CONFIG is required anyway? It looks like
> all full virtio devices provide config information that allows to
> derive the exact or maximum number of queues already. So wouldn't
> VHOST_USER_PROTOCOL_F_MQ just provide a second, redundant way to get
> the information? (And this would be a third?)
>
> Am I missing something here? Otherwise, I think we could drop dependency
> on VHOST_USER_PROTOCOL_F_MQ and get the info from the device config for
> standalone daemons?

vhost (in general and that includes vhost-user) was not designed to be
a full VIRTIO device, just an interface for offloading specific
virtqueues. Now that vhost-user is being extended to support full
VIRTIO devices ("standalone"), some of the vhost-user protocol
features are redundant.

However, actually dropping the redundant features must be done
carefully because they might be used in ways that you don't expect by
existing vhost-user implementations. For example, maybe existing
vhost-user-net code doesn't look at the VIRTIO Configuration Space
field because it queries the number of virtqueues via
VHOST_USER_PROTOCOL_F_MQ + VHOST_USER_GET_QUEUE_NUM. So you might be
forced to keep them for compatibility.

I think the answer to your questions is: yes, but carefully because
you might break existing implementations or make it hard for them to
remain compatible.

Stefan
Michael S. Tsirkin July 26, 2023, 4:01 p.m. UTC | #20
On Thu, Jul 20, 2023 at 03:36:01PM -0400, Stefan Hajnoczi wrote:
> On Fri, Jul 07, 2023 at 12:27:39PM +0200, Stefano Garzarella wrote:
> > On Tue, Jul 04, 2023 at 04:02:42PM +0100, Alex Bennée wrote:
> > > 
> > > Stefano Garzarella <sgarzare@redhat.com> writes:
> > > 
> > > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > > > > index 5a070adbc1..85b1b1583a 100644
> > > > > --- a/docs/interop/vhost-user.rst
> > > > > +++ b/docs/interop/vhost-user.rst
> > > > > @@ -275,6 +275,21 @@ Inflight description
> > > > > 
> > > > > :queue size: a 16-bit size of virtqueues
> > > > > 
> > > > > +Backend specifications
> > > > > +^^^^^^^^^^^^^^^^^^^^^^
> > > > > +
> > > > > ++-----------+-------------+------------+------------+
> > > > > +| device id | config size |   min_vqs  |   max_vqs  |
> > > > > ++-----------+-------------+------------+------------+
> > > > > +
> > > > > +:device id: a 32-bit value holding the VirtIO device ID
> > > > > +
> > > > > +:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
> > > > > +
> > > > > +:min_vqs: a 32-bit value holding the minimum number of vqs supported
> > > > 
> > > > Why do we need the minimum?
> > > 
> > > We need to know the minimum number because some devices have fixed VQs
> > > that must be present.
> > 
> > But does QEMU need to know this?
> > 
> > Or is it okay that the driver will then fail in the guest if there
> > are not the right number of queues?
> 
> I don't understand why min_vqs is needed either. It's not the
> front-end's job to ensure that the device will be used properly. A
> spec-compliant driver will work with a spec-compliant device, so it's
> not clear why the front-end needs this information.
> 
> Stefan



I think this really demonstrates why we should keep separate
messages and not the "standalone" thing which seems to
mean "bundle a bunch of mostly unrelated stuff in one message":
this way each field is carefully documented.
Michael S. Tsirkin July 26, 2023, 4:02 p.m. UTC | #21
On Mon, Jul 24, 2023 at 02:08:39PM -0400, Stefan Hajnoczi wrote:
> On Thu, Jul 20, 2023 at 06:22:08PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Jul 20, 2023 at 05:31:03PM -0400, Stefan Hajnoczi wrote:
> > > On Thu, 20 Jul 2023 at 17:15, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Thu, Jul 20, 2023 at 03:58:37PM -0400, Stefan Hajnoczi wrote:
> > > > > On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > > > > Currently QEMU has to know some details about the back-end to be able
> > > > > > > to setup the guest. While various parts of the setup can be delegated
> > > > > > > to the backend (for example config handling) this is a very piecemeal
> > > > > > > approach.
> > > > > >
> > > > > > > This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > > > > > > which the back-end can advertise which allows a probe message to be
> > > > > > > sent to get all the details QEMU needs to know in one message.
> > > > > >
> > > > > > The reason we do piecemeal is that these existing pieces can be reused
> > > > > > as others evolve or fall by wayside.
> > > > > >
> > > > > > For example, I can think of instances where you want to connect
> > > > > > specifically to e.g. networking backend, and specify it
> > > > > > on command line. Reasons could be many, e.g. for debugging,
> > > > > > or to prevent connecting to wrong device on wrong channel
> > > > > > (kind of like type safety).
> > > > > >
> > > > > > What is the reason to have 1 message? startup latency?
> > > > > > How about we allow pipelining several messages then?
> > > > > > Will be easier.
> > > > >
> > > > > This flag effectively says that the back-end is a full VIRTIO device
> > > > > with a Device Status Register, Configuration Space, Virtqueues, the
> > > > > device type, etc. This is different from previous vhost-user devices
> > > > > which sometimes just offloaded certain virtqueues without providing the
> > > > > full VIRTIO device (parts were emulated in the VMM).
> > > > >
> > > > > So for example, a vhost-user-net device does not support the controlq.
> > > > > Alex's "standalone" device is a mode where the vhost-user protocol is
> > > > > used but the back-end must implement a full virtio-net device.
> > > > > Standalone devices are like vDPA device in this respect.
> > > > >
> > > > > I think it is important to have a protocol feature bit that advertises
> > > > > that this is a standalone device, since the semantics are different for
> > > > > traditional vhost-user-net devices.
> > > >
> > > > Not sure what that would gain as compared to a feature bit per
> > > > message as we did previously.
> > > 
> > > Having a single feature bit makes it easier to distinguish between a
> > > traditional vhost-user device and a standalone device.
> > > 
> > > For example, the presence of VHOST_USER_F_GET_DEVICE_ID doesn't tell
> > > you whether this device is a standalone device that is appropriate for
> > > a new generic QEMU --device vhost-user-device feature that Alex is
> > > working on. It could be a traditional vhost-user device that is not
> > > standalone but implements the VHOST_USER_GET_DEVICE_ID message.
> > > 
> > > How will we detect standalone devices? It will be messy if there is no
> > > single feature bit that advertises that this back-end is a standalone
> > > device.
> > > 
> > > Stefan
> > 
> > Looks like standalone implies some 5-6 messages to be supported.
> > So just test the 6 bits are all ones.
> 
> It's not clear to me that the individual bits together mean this is
> really a standalone device, but let's go with individual commands and
> see if a front-end can distinguish standalone devices or not. If not,
> then we can still add "standalone" feature bit before merging the code.
> 
> Stefan


I think it just shows that what a "standalone" device is just isn't
that well defined ;).
Stefan Hajnoczi July 26, 2023, 5:37 p.m. UTC | #22
On Wed, Jul 26, 2023 at 12:02:33PM -0400, Michael S. Tsirkin wrote:
> On Mon, Jul 24, 2023 at 02:08:39PM -0400, Stefan Hajnoczi wrote:
> > On Thu, Jul 20, 2023 at 06:22:08PM -0400, Michael S. Tsirkin wrote:
> > > On Thu, Jul 20, 2023 at 05:31:03PM -0400, Stefan Hajnoczi wrote:
> > > > On Thu, 20 Jul 2023 at 17:15, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jul 20, 2023 at 03:58:37PM -0400, Stefan Hajnoczi wrote:
> > > > > > On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > > > > > Currently QEMU has to know some details about the back-end to be able
> > > > > > > > to setup the guest. While various parts of the setup can be delegated
> > > > > > > > to the backend (for example config handling) this is a very piecemeal
> > > > > > > > approach.
> > > > > > >
> > > > > > > > This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > > > > > > > which the back-end can advertise which allows a probe message to be
> > > > > > > > sent to get all the details QEMU needs to know in one message.
> > > > > > >
> > > > > > > The reason we do piecemeal is that these existing pieces can be reused
> > > > > > > as others evolve or fall by wayside.
> > > > > > >
> > > > > > > For example, I can think of instances where you want to connect
> > > > > > > specifically to e.g. networking backend, and specify it
> > > > > > > on command line. Reasons could be many, e.g. for debugging,
> > > > > > > or to prevent connecting to wrong device on wrong channel
> > > > > > > (kind of like type safety).
> > > > > > >
> > > > > > > What is the reason to have 1 message? startup latency?
> > > > > > > How about we allow pipelining several messages then?
> > > > > > > Will be easier.
> > > > > >
> > > > > > This flag effectively says that the back-end is a full VIRTIO device
> > > > > > with a Device Status Register, Configuration Space, Virtqueues, the
> > > > > > device type, etc. This is different from previous vhost-user devices
> > > > > > which sometimes just offloaded certain virtqueues without providing the
> > > > > > full VIRTIO device (parts were emulated in the VMM).
> > > > > >
> > > > > > So for example, a vhost-user-net device does not support the controlq.
> > > > > > Alex's "standalone" device is a mode where the vhost-user protocol is
> > > > > > used but the back-end must implement a full virtio-net device.
> > > > > > Standalone devices are like vDPA device in this respect.
> > > > > >
> > > > > > I think it is important to have a protocol feature bit that advertises
> > > > > > that this is a standalone device, since the semantics are different for
> > > > > > traditional vhost-user-net devices.
> > > > >
> > > > > Not sure what that would gain as compared to a feature bit per
> > > > > message as we did previously.
> > > > 
> > > > Having a single feature bit makes it easier to distinguish between a
> > > > traditional vhost-user device and a standalone device.
> > > > 
> > > > For example, the presence of VHOST_USER_F_GET_DEVICE_ID doesn't tell
> > > > you whether this device is a standalone device that is appropriate for
> > > > a new generic QEMU --device vhost-user-device feature that Alex is
> > > > working on. It could be a traditional vhost-user device that is not
> > > > standalone but implements the VHOST_USER_GET_DEVICE_ID message.
> > > > 
> > > > How will we detect standalone devices? It will be messy if there is no
> > > > single feature bit that advertises that this back-end is a standalone
> > > > device.
> > > > 
> > > > Stefan
> > > 
> > > Looks like standalone implies some 5-6 messages to be supported.
> > > So just test the 6 bits are all ones.
> > 
> > It's not clear to me that the individual bits together mean this is
> > really a standalone device, but let's go with individual commands and
> > see if a front-end can distinguish standalone devices or not. If not,
> > then we can still add "standalone" feature bit before merging the code.
> > 
> > Stefan
> 
> 
> I think it just shows that what a "standalone" device is just isn't
> that well defined ;).

No, I think it's the opposite way around. The existing vhost model is
not well defined. From time to time someone has to extend it to add
things like Configuration Space support or VHOST_USER_GET_DEVICE_ID.

Standalone devices are well defined: they are what the VIRTIO
specification describes.

My concern is that new messages added for standalone devices might also
be useful for existing non-standalone devices. Or do you want to declare
these new messages off-limits to non-standalone devices?

Stefan
diff mbox series

Patch

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 5a070adbc1..85b1b1583a 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -275,6 +275,21 @@  Inflight description
 
 :queue size: a 16-bit size of virtqueues
 
+Backend specifications
+^^^^^^^^^^^^^^^^^^^^^^
+
++-----------+-------------+------------+------------+
+| device id | config size |   min_vqs  |   max_vqs  |
++-----------+-------------+------------+------------+
+
+:device id: a 32-bit value holding the VirtIO device ID
+
+:config size: a 32-bit value holding the config size (see ``VHOST_USER_GET_CONFIG``)
+
+:min_vqs: a 32-bit value holding the minimum number of vqs supported
+
+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must be >= min_vqs
+
 C structure
 -----------
 
@@ -296,6 +311,7 @@  In QEMU the vhost-user message is implemented with the following struct:
           VhostUserConfig config;
           VhostUserVringArea area;
           VhostUserInflight inflight;
+          VhostUserBackendSpecs specs;
       };
   } QEMU_PACKED VhostUserMsg;
 
@@ -316,6 +332,7 @@  replies. Here is a list of the ones that do:
 * ``VHOST_USER_GET_VRING_BASE``
 * ``VHOST_USER_SET_LOG_BASE`` (if ``VHOST_USER_PROTOCOL_F_LOG_SHMFD``)
 * ``VHOST_USER_GET_INFLIGHT_FD`` (if ``VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD``)
+* ``VHOST_USER_GET_BACKEND_SPECS`` (if ``VHOST_USER_PROTOCOL_F_STANDALONE``)
 
 .. seealso::
 
@@ -885,6 +902,13 @@  Protocol features
   #define VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS  15
   #define VHOST_USER_PROTOCOL_F_STATUS               16
   #define VHOST_USER_PROTOCOL_F_XEN_MMAP             17
+  #define VHOST_USER_PROTOCOL_F_STANDALONE           18
+
+Some features are only valid in the presence of other supporting
+features. In the case of ``VHOST_USER_PROTOCOL_F_STANDALONE`` the
+backend must also support ``VHOST_USER_PROTOCOL_F_CONFIG`` and
+``VHOST_USER_PROTOCOL_F_STATUS``.
+
 
 Front-end message types
 -----------------------
@@ -1440,6 +1464,19 @@  Front-end message types
   query the back-end for its device status as defined in the Virtio
   specification.
 
+``VHOST_USER_GET_BACKEND_SPECS``
+  :id: 41
+  :request payload: N/A
+  :reply payload: ``Backend specifications``
+
+  When the ``VHOST_USER_PROTOCOL_F_STANDALONE`` protocol feature has been
+  successfully negotiated, this message is submitted by the front-end to
+  query the back-end for its capabilities. This is intended to remove
+  the need for the front-end to know ahead of time what the VirtIO
+  device the backend emulates is.
+
+  The reply contains the device id, size of the config space and the
+  range of VirtQueues the backend supports.
 
 Back-end message types
 ----------------------
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index c4e0cbd702..28b021d5d3 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -202,6 +202,13 @@  typedef struct VhostUserInflight {
     uint16_t queue_size;
 } VhostUserInflight;
 
+typedef struct VhostUserBackendSpecs {
+    uint32_t device_id;
+    uint32_t config_size;
+    uint32_t min_vqs;
+    uint32_t max_vqs;
+} VhostUserBackendSpecs;
+
 typedef struct {
     VhostUserRequest request;
 
@@ -226,6 +233,7 @@  typedef union {
         VhostUserCryptoSession session;
         VhostUserVringArea area;
         VhostUserInflight inflight;
+        VhostUserBackendSpecs specs;
 } VhostUserPayload;
 
 typedef struct VhostUserMsg {