Message ID | 20220801211240.597859-1-quic_eberman@quicinc.com |
---|---|
Headers | show |
Series | Drivers for gunyah hypervisor | expand |
On Mon, Aug 1, 2022 at 3:16 PM Elliot Berman <quic_eberman@quicinc.com> wrote: > > Gunyah is a Type-1 hypervisor independent of any > high-level OS kernel, and runs in a higher CPU privilege level. It does > not depend on any lower-privileged OS kernel/code for its core > functionality. This increases its security and can support a much smaller > trusted computing base than a Type-2 hypervisor. > > Gunyah is an open source hypervisor. The source repo is available at > https://github.com/quic/gunyah-hypervisor. Nowhere in this series do I see a change log, yet this is marked as v2. How is anyone supposed to identify what is the difference between v1 and v2?
On Mon, Aug 1, 2022 at 3:16 PM Elliot Berman <quic_eberman@quicinc.com> wrote: > diff --git a/MAINTAINERS b/MAINTAINERS > index 04ec80ee7352..18fb034526e1 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -8736,6 +8736,13 @@ L: linux-efi@vger.kernel.org > S: Maintained > F: block/partitions/efi.* > > +GUNYAH HYPERVISOR DRIVER > +M: Elliot Berman <quic_eberman@quicinc.com> > +M: Murali Nalajala <quic_mnalajal@quicinc.com> > +L: linux-arm-msm@vger.kernel.org > +S: Maintained Supported? Sure seems like something you'd be paid to maintain.... > +F: Documentation/virt/gunyah/ > + > HABANALABS PCI DRIVER > M: Oded Gabbay <ogabbay@kernel.org> > S: Supported > -- > 2.25.1 >
Hi Jeffrey, On 8/1/2022 2:27 PM, Jeffrey Hugo wrote: > On Mon, Aug 1, 2022 at 3:16 PM Elliot Berman <quic_eberman@quicinc.com> wrote: >> >> Gunyah is a Type-1 hypervisor independent of any >> high-level OS kernel, and runs in a higher CPU privilege level. It does >> not depend on any lower-privileged OS kernel/code for its core >> functionality. This increases its security and can support a much smaller >> trusted computing base than a Type-2 hypervisor. >> >> Gunyah is an open source hypervisor. The source repo is available at >> https://github.com/quic/gunyah-hypervisor. > > Nowhere in this series do I see a change log, yet this is marked as > v2. How is anyone supposed to identify what is the difference between > v1 and v2? I dropped the message when copying cover letter: Changes in v2: - DT bindings clean up - Switch hypercalls to follow SMCCC
On 02/08/2022 00:12, Elliot Berman wrote: > Add architecture-independent standard error codes, types, and macros for > Gunyah hypercalls. > > Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> > --- > MAINTAINERS | 1 + > include/linux/gunyah.h | 75 ++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 76 insertions(+) > create mode 100644 include/linux/gunyah.h > > diff --git a/MAINTAINERS b/MAINTAINERS > index 02f97ac90cdf..2e4f1d9ed47b 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -8744,6 +8744,7 @@ S: Maintained > F: Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml > F: Documentation/virt/gunyah/ > F: arch/arm64/include/asm/gunyah.h > +F: include/linux/gunyah.h > > HABANALABS PCI DRIVER > M: Oded Gabbay <ogabbay@kernel.org> > diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h > new file mode 100644 > index 000000000000..69931a0f5736 > --- /dev/null > +++ b/include/linux/gunyah.h > @@ -0,0 +1,75 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > +/* > + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. > + */ > + > +#ifndef _GUNYAH_H > +#define _GUNYAH_H > + > +#include <linux/types.h> > +#include <linux/errno.h> > +#include <asm/gunyah.h> > + > +typedef u64 gh_capid_t; > + > +/* Common Gunyah macros */ > +#define GH_CAPID_INVAL U64_MAX > + > +#define GH_ERROR_OK 0 Is there any semantic difference between GH_ERROR_foo < 0 and GH_ERROR_bar > 0 ? > +#define GH_ERROR_UNIMPLEMENTED -1 > +#define GH_ERROR_RETRY -2 > + > +#define GH_ERROR_ARG_INVAL 1 > +#define GH_ERROR_ARG_SIZE 2 > +#define GH_ERROR_ARG_ALIGN 3 > + > +#define GH_ERROR_NOMEM 10 > + > +#define GH_ERROR_ADDR_OVFL 20 > +#define GH_ERROR_ADDR_UNFL 21 > +#define GH_ERROR_ADDR_INVAL 22 > + > +#define GH_ERROR_DENIED 30 > +#define GH_ERROR_BUSY 31 > +#define GH_ERROR_IDLE 32 > + > +#define GH_ERROR_IRQ_BOUND 40 > +#define GH_ERROR_IRQ_UNBOUND 41 > + > +#define GH_ERROR_CSPACE_CAP_NULL 50 > +#define GH_ERROR_CSPACE_CAP_REVOKED 51 > +#define GH_ERROR_CSPACE_WRONG_OBJ_TYPE 52 > +#define GH_ERROR_CSPACE_INSUF_RIGHTS 53 > +#define GH_ERROR_CSPACE_FULL 54 > + > +#define GH_ERROR_MSGQUEUE_EMPTY 60 > +#define GH_ERROR_MSGQUEUE_FULL 61 > + > +static inline int gh_remap_error(int gh_error) > +{ > + switch (gh_error) { > + case GH_ERROR_OK: > + return 0; > + case GH_ERROR_NOMEM: > + return -ENOMEM; > + case GH_ERROR_DENIED: > + case GH_ERROR_CSPACE_CAP_NULL: > + case GH_ERROR_CSPACE_CAP_REVOKED: > + case GH_ERROR_CSPACE_WRONG_OBJ_TYPE: > + case GH_ERROR_CSPACE_INSUF_RIGHTS: > + case GH_ERROR_CSPACE_FULL: > + return -EACCES; > + case GH_ERROR_BUSY: > + case GH_ERROR_IDLE: > + return -EBUSY; > + case GH_ERROR_IRQ_BOUND: > + case GH_ERROR_IRQ_UNBOUND: > + case GH_ERROR_MSGQUEUE_FULL: > + case GH_ERROR_MSGQUEUE_EMPTY: > + return -EPERM; > + default: > + return -EINVAL; > + } > +} > + > +#endif
On 02/08/2022 00:12, Elliot Berman wrote: > Gunyah message queues are unidirectional pipelines to communicate > between 2 virtual machines, but are typically paired to allow > bidirectional communication. The intended use case is for small control > messages between 2 VMs, as they support a maximum of 240 bytes. > > Message queues can be discovered either by resource manager or on the > devicetree. To support discovery on the devicetree, client drivers can devicetree and discovery do not quite match to me. The device is delared in the DT, not discovered. > use gh_msgq_platform_host_attach to allocate the tx and rx message > queues according to > Documentation/devicetree/bindings/gunyah/qcom,hypervisor.yml. -ENOSUCHFILE > > Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> > --- > arch/arm64/include/asm/gunyah.h | 4 + > drivers/virt/gunyah/Makefile | 2 +- > drivers/virt/gunyah/gunyah_private.h | 3 + > drivers/virt/gunyah/msgq.c | 223 +++++++++++++++++++++++++++ > drivers/virt/gunyah/sysfs.c | 9 ++ > include/linux/gunyah.h | 13 ++ > 6 files changed, 253 insertions(+), 1 deletion(-) > create mode 100644 drivers/virt/gunyah/msgq.c > > diff --git a/arch/arm64/include/asm/gunyah.h b/arch/arm64/include/asm/gunyah.h > index 3aee35009910..ba7398bd851b 100644 > --- a/arch/arm64/include/asm/gunyah.h > +++ b/arch/arm64/include/asm/gunyah.h > @@ -27,6 +27,10 @@ > | ((fn) & GH_CALL_FUNCTION_NUM_MASK)) > > #define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x0000) > +#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x001B) > +#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x001C) > + > +#define GH_HYPERCALL_MSGQ_SEND_FLAGS_PUSH BIT(0) > > #define ___gh_count_args(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x > > diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile > index 3869fb7371df..94dc8e738911 100644 > --- a/drivers/virt/gunyah/Makefile > +++ b/drivers/virt/gunyah/Makefile > @@ -1,4 +1,4 @@ > # SPDX-License-Identifier: GPL-2.0-only > > -gunyah-y += sysfs.o device.o > +gunyah-y += sysfs.o device.o msgq.o > obj-$(CONFIG_GUNYAH) += gunyah.o > \ No newline at end of file Newline > diff --git a/drivers/virt/gunyah/gunyah_private.h b/drivers/virt/gunyah/gunyah_private.h > index 5f3832608020..2ade32bd9bdf 100644 > --- a/drivers/virt/gunyah/gunyah_private.h > +++ b/drivers/virt/gunyah/gunyah_private.h > @@ -9,4 +9,7 @@ > int __init gunyah_bus_init(void); > void gunyah_bus_exit(void); > > +int __init gh_msgq_init(void); > +void gh_msgq_exit(void); > + > #endif > diff --git a/drivers/virt/gunyah/msgq.c b/drivers/virt/gunyah/msgq.c > new file mode 100644 > index 000000000000..afc2572d3e7d > --- /dev/null > +++ b/drivers/virt/gunyah/msgq.c > @@ -0,0 +1,223 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved. > + */ > + > +#include <linux/interrupt.h> > +#include <linux/gunyah.h> > +#include <linux/module.h> > +#include <linux/printk.h> > +#include <linux/init.h> > +#include <linux/slab.h> > +#include <linux/wait.h> > + > +#include "gunyah_private.h" > + > +struct gh_msgq { > + bool ready; > + wait_queue_head_t wq; > + spinlock_t lock; > +}; > + > +static irqreturn_t gh_msgq_irq_handler(int irq, void *dev) > +{ > + struct gh_msgq *msgq = dev; > + > + spin_lock(&msgq->lock); > + msgq->ready = true; > + spin_unlock(&msgq->lock); > + wake_up_interruptible_all(&msgq->wq); > + > + return IRQ_HANDLED; > +} > + > +static int __gh_msgq_send(struct gunyah_device *ghdev, void *buff, size_t size, u64 tx_flags) > +{ > + unsigned long flags, gh_error; > + struct gh_msgq *msgq = ghdev_get_drvdata(ghdev); > + ssize_t ret; > + bool ready; > + > + spin_lock_irqsave(&msgq->lock, flags); > + arch_gh_hypercall(GH_HYPERCALL_MSGQ_SEND, 5, > + ghdev->capid, size, (uintptr_t)buff, tx_flags, 0, > + gh_error, ready); > + switch (gh_error) { > + case GH_ERROR_OK: > + ret = 0; > + msgq->ready = ready; > + break; > + case GH_ERROR_MSGQUEUE_FULL: > + ret = -EAGAIN; > + msgq->ready = false; > + break; > + default: > + ret = gh_remap_error(gh_error); > + break; > + } > + spin_unlock_irqrestore(&msgq->lock, flags); > + > + return ret; > +} > + > +/** > + * gh_msgq_send() - Send a message to the client running on a different VM > + * @client: The client descriptor that was obtained via gh_msgq_register() > + * @buff: Pointer to the buffer where the received data must be placed > + * @buff_size: The size of the buffer space available > + * @flags: Optional flags to pass to receive the data. For the list of flags, > + * see linux/gunyah/gh_msgq.h > + * > + * Returns: The number of bytes copied to buff. <0 if there was an error. > + * > + * Note: this function may sleep and should not be called from interrupt context > + */ > +ssize_t gh_msgq_send(struct gunyah_device *ghdev, void *buff, size_t size, > + const unsigned long flags) > +{ > + struct gh_msgq *msgq = ghdev_get_drvdata(ghdev); > + ssize_t ret; > + u64 tx_flags = 0; > + > + if (flags & GH_MSGQ_TX_PUSH) > + tx_flags |= GH_HYPERCALL_MSGQ_SEND_FLAGS_PUSH; > + > + do { > + ret = __gh_msgq_send(ghdev, buff, size, tx_flags); > + > + if (ret == -EAGAIN) { > + if (flags & GH_MSGQ_NONBLOCK) > + goto out; > + if (wait_event_interruptible(msgq->wq, msgq->ready)) > + ret = -ERESTARTSYS; > + } > + } while (ret == -EAGAIN); Any limit on the amount of retries? Can the driver wait forever here? > + > +out: > + return ret; > +} > +EXPORT_SYMBOL_GPL(gh_msgq_send); Both _send and _recv functions are not well designed. Can you call gh_msgq_send() on any gunyah_device? Yes. Will it work? No. Could you please check if mailbox API work for you? It seems that it is what you are trying to implement on your own. > + > +static ssize_t __gh_msgq_recv(struct gunyah_device *ghdev, void *buff, size_t size) > +{ > + unsigned long flags, gh_error; > + size_t recv_size; > + struct gh_msgq *msgq = ghdev_get_drvdata(ghdev); > + ssize_t ret; > + bool ready; > + > + spin_lock_irqsave(&msgq->lock, flags); > + > + arch_gh_hypercall(GH_HYPERCALL_MSGQ_RECV, 4, > + ghdev->capid, (uintptr_t)buff, size, 0, > + gh_error, recv_size, ready); > + switch (gh_error) { > + case GH_ERROR_OK: > + ret = recv_size; > + msgq->ready = ready; > + break; > + case GH_ERROR_MSGQUEUE_EMPTY: > + ret = -EAGAIN; > + msgq->ready = false; > + break; > + default: > + ret = gh_remap_error(gh_error); > + break; > + } > + spin_unlock_irqrestore(&msgq->lock, flags); > + > + return ret; > +} > + > +/** > + * gh_msgq_recv() - Receive a message from the client running on a different VM > + * @client: The client descriptor that was obtained via gh_msgq_register() > + * @buff: Pointer to the buffer where the received data must be placed > + * @buff_size: The size of the buffer space available > + * @flags: Optional flags to pass to receive the data. For the list of flags, > + * see linux/gunyah/gh_msgq.h > + * > + * Returns: The number of bytes copied to buff. <0 if there was an error. > + * > + * Note: this function may sleep and should not be called from interrupt context > + */ > +ssize_t gh_msgq_recv(struct gunyah_device *ghdev, void *buff, size_t size, > + const unsigned long flags) > +{ > + struct gh_msgq *msgq = ghdev_get_drvdata(ghdev); > + ssize_t ret; > + > + do { > + ret = __gh_msgq_recv(ghdev, buff, size); > + > + if (ret == -EAGAIN) { > + if (flags & GH_MSGQ_NONBLOCK) > + goto out; > + if (wait_event_interruptible(msgq->wq, msgq->ready)) > + ret = -ERESTARTSYS; > + } > + } while (ret == -EAGAIN); > + > +out: > + return ret; > +} > +EXPORT_SYMBOL_GPL(gh_msgq_recv); > + > +static int gh_msgq_probe(struct gunyah_device *ghdev) > +{ > + struct gh_msgq *msgq; > + > + msgq = devm_kzalloc(&ghdev->dev, sizeof(*msgq), GFP_KERNEL); > + if (!msgq) > + return -ENOMEM; > + ghdev_set_drvdata(ghdev, msgq); > + > + msgq->ready = true; /* Assume we can use the message queue right away */ > + init_waitqueue_head(&msgq->wq); > + spin_lock_init(&msgq->lock); > + > + return devm_request_irq(&ghdev->dev, ghdev->irq, gh_msgq_irq_handler, 0, > + dev_name(&ghdev->dev), msgq); > +} > + > +static struct gunyah_driver gh_msgq_tx_driver = { > + .driver = { > + .name = "gh_msgq_tx", > + .owner = THIS_MODULE, > + }, > + .type = GUNYAH_DEVICE_TYPE_MSGQ_TX, > + .probe = gh_msgq_probe, > +}; > + > +static struct gunyah_driver gh_msgq_rx_driver = { > + .driver = { > + .name = "gh_msgq_rx", > + .owner = THIS_MODULE, > + }, > + .type = GUNYAH_DEVICE_TYPE_MSGQ_RX, > + .probe = gh_msgq_probe, If you have to duplicate the whole device structure just to bind to two difference devices, it looks like a bad abstraction. Please check how other busses have solved this issue. They did, believe me. > +}; MODULE_DEVICE_TABLE() ? > + > +int __init gh_msgq_init(void) > +{ > + int ret; > + > + ret = gunyah_register_driver(&gh_msgq_tx_driver); > + if (ret) > + return ret; > + > + ret = gunyah_register_driver(&gh_msgq_rx_driver); > + if (ret) > + goto err_rx; > + > + return ret; > +err_rx: > + gunyah_unregister_driver(&gh_msgq_tx_driver); > + return ret; > +} > + > +void gh_msgq_exit(void) > +{ > + gunyah_unregister_driver(&gh_msgq_rx_driver); > + gunyah_unregister_driver(&gh_msgq_tx_driver); > +} > diff --git a/drivers/virt/gunyah/sysfs.c b/drivers/virt/gunyah/sysfs.c > index 220560cb3b1c..7589689e5e92 100644 > --- a/drivers/virt/gunyah/sysfs.c > +++ b/drivers/virt/gunyah/sysfs.c > @@ -73,6 +73,8 @@ static ssize_t features_show(struct kobject *kobj, struct kobj_attribute *attr, > > if (GH_IDENTIFY_PARTITION_CSPACE(gunyah_api.flags)) > len += sysfs_emit_at(buffer, len, "cspace "); > + if (GH_IDENTIFY_MSGQUEUE(gunyah_api.flags)) > + len += sysfs_emit_at(buffer, len, "message-queue "); Again, this should go to the sysfs patch. > > len += sysfs_emit_at(buffer, len, "\n"); > return len; > @@ -142,7 +144,13 @@ static int __init gunyah_init(void) > if (ret) > goto err_sysfs; > > + ret = gh_msgq_init(); > + if (ret) > + goto err_bus; > + Please stop beating everything in a single module. Having a provider (bus) and a consumer (drivers for this bus) in a single module sounds like an overkill. Or, a wrong abstraction. Please remind me, why do you need gunyah bus in the first place? I could not find any other calls to gunyah_device_add in this series. Which devices do you expect to be added in future? Would they require separate drivers? > return ret; > +err_bus: > + gunyah_bus_exit(); > err_sysfs: > gh_sysfs_unregister(); > return ret; > @@ -151,6 +159,7 @@ module_init(gunyah_init); > > static void __exit gunyah_exit(void) > { > + gh_msgq_exit(); > gunyah_bus_exit(); > gh_sysfs_unregister(); > } > diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h > index ce35f4491773..099224f9d6d1 100644 > --- a/include/linux/gunyah.h > +++ b/include/linux/gunyah.h > @@ -6,6 +6,7 @@ > #ifndef _GUNYAH_H > #define _GUNYAH_H > > +#include <linux/platform_device.h> > #include <linux/device.h> > #include <linux/types.h> > #include <linux/errno.h> > @@ -117,4 +118,16 @@ struct gunyah_driver { > int gunyah_register_driver(struct gunyah_driver *ghdrv); > void gunyah_unregister_driver(struct gunyah_driver *ghdrv); > > +#define GH_MSGQ_MAX_MSG_SIZE 1024 > + > +/* Possible flags to pass for Tx or Rx */ > +#define GH_MSGQ_TX_PUSH BIT(0) > +#define GH_MSGQ_NONBLOCK BIT(32) > + > +ssize_t gh_msgq_send(struct gunyah_device *ghdev, void *buff, size_t size, > + const unsigned long flags); > +ssize_t gh_msgq_recv(struct gunyah_device *ghdev, void *buff, size_t size, > + const unsigned long flags); > + > + > #endif
On 02/08/2022 00:12, Elliot Berman wrote: > Gunyah is a Type-1 hypervisor independent of any > high-level OS kernel, and runs in a higher CPU privilege level. It does > not depend on any lower-privileged OS kernel/code for its core > functionality. This increases its security and can support a much smaller > trusted computing base than a Type-2 hypervisor. > > Gunyah is an open source hypervisor. The source repo is available at > https://github.com/quic/gunyah-hypervisor. > > The diagram below shows the architecture. > > :: > > Primary VM Secondary VMs Is there any significant difference between Primary VM and other VMs? > +-----+ +-----+ | +-----+ +-----+ +-----+ > | | | | | | | | | | | > EL0 | APP | | APP | | | APP | | APP | | APP | > | | | | | | | | | | | > +-----+ +-----+ | +-----+ +-----+ +-----+ > ---------------------|------------------------- > +--------------+ | +----------------------+ > | | | | | > EL1 | Linux Kernel | | |Linux kernel/Other OS | ... > | | | | | > +--------------+ | +----------------------+ > --------hvc/smc------|------hvc/smc------------ > +----------------------------------------+ > | | > EL2 | Gunyah Hypervisor | > | | > +----------------------------------------+ > > Gunyah provides these following features. > > - Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on > physical CPUs and enables time-sharing of the CPUs. Is the scheduling provided behind the back of the OS or does it require cooperation? > - Memory Management: Gunyah tracks memory ownership and use of all memory > under its control. Memory partitioning between VMs is a fundamental > security feature. > - Interrupt Virtualization: All interrupts are handled in the hypervisor > and routed to the assigned VM. > - Inter-VM Communication: There are several different mechanisms provided > for communicating between VMs. > - Device Virtualization: Para-virtualization of devices is supported using > inter-VM communication. Low level system features and devices such as > interrupt controllers are supported with emulation where required. After reviewing some of the patches from the series, I'd like to understand, what does it provide (and can be provided) to the VMs. I'd like to understand it first, before going deep into the API issues. 1) The hypervisor provides message queues, doorbells and vCPUs Each of resources has it's own capability ID. Why is it called capability? Is it just a misname for the resource ID, or has it any other meaning behind? If it is a capability, who is capable of what? At this moment you create allocate two message queues with fixed IDs for communication with resource manager. Then you use these message queues to organize a console and a pack of tty devices. What other kinds of services does RM provide to the guest OS? Do you expect any other drivers to be calling into the RM? What is the usecase for the doorbells? Who provides doorbells? You mentioned that the RM generates DT overlays. What kind of information goes to the overlay? My current impression of this series is that you have misused the concept of devices. Rather than exporting MSGQs and BELLs as gunyah_devices and then using them from other drivers, I'd suggest turning them into resources provided by the gunyah driver core. I mentioned using the mailbox API for this. Another subsystem that might ring the bell for you is the remoteproc, especially the rproc_subdev. I might be completely wrong about this, but if my in-mind picture of Gunyah is correct, I'd have implemented the gunyah core subsytem as mailbox provider, RM as a separate platform driver consuming these mailboxes and in turn being a remoteproc driver, and consoles as remoteproc subdevices. I can assume that at some point you would like to use Gunyah to boot secondary VMs from the primary VM by calling into RM, etc. Most probably at this moment a VM would be allocated other bells, message queues, etc. If this assumption is correct, them the VM can become a separate device (remoteproc?) in the Linux device tree. I might be wrong in any of the assumptions above. Please feel free to correct me. We can then think about a better API for your usecase.
On 8/2/2022 12:33 AM, Dmitry Baryshkov wrote: > On 02/08/2022 00:12, Elliot Berman wrote: >> Add architecture-independent standard error codes, types, and macros for >> Gunyah hypercalls. >> >> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> >> --- >> MAINTAINERS | 1 + >> include/linux/gunyah.h | 75 ++++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 76 insertions(+) >> create mode 100644 include/linux/gunyah.h >> >> diff --git a/MAINTAINERS b/MAINTAINERS >> index 02f97ac90cdf..2e4f1d9ed47b 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -8744,6 +8744,7 @@ S: Maintained >> F: Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml >> F: Documentation/virt/gunyah/ >> F: arch/arm64/include/asm/gunyah.h >> +F: include/linux/gunyah.h >> HABANALABS PCI DRIVER >> M: Oded Gabbay <ogabbay@kernel.org> >> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h >> new file mode 100644 >> index 000000000000..69931a0f5736 >> --- /dev/null >> +++ b/include/linux/gunyah.h >> @@ -0,0 +1,75 @@ >> +/* SPDX-License-Identifier: GPL-2.0-only */ >> +/* >> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights >> reserved. >> + */ >> + >> +#ifndef _GUNYAH_H >> +#define _GUNYAH_H >> + >> +#include <linux/types.h> >> +#include <linux/errno.h> >> +#include <asm/gunyah.h> >> + >> +typedef u64 gh_capid_t; >> + >> +/* Common Gunyah macros */ >> +#define GH_CAPID_INVAL U64_MAX >> + >> +#define GH_ERROR_OK 0 > > Is there any semantic difference between GH_ERROR_foo < 0 and > GH_ERROR_bar > 0 ? > GH_ERROR_foo < 0 comes from Gunyah's plumbing for handling hypercalls. GH_ERROR_bar > 0 comes from the hypercall itself. >> +#define GH_ERROR_UNIMPLEMENTED -1 >> +#define GH_ERROR_RETRY -2 >> + >> +#define GH_ERROR_ARG_INVAL 1 >> +#define GH_ERROR_ARG_SIZE 2 >> +#define GH_ERROR_ARG_ALIGN 3 >> + >> +#define GH_ERROR_NOMEM 10 >> + >> +#define GH_ERROR_ADDR_OVFL 20 >> +#define GH_ERROR_ADDR_UNFL 21 >> +#define GH_ERROR_ADDR_INVAL 22 >> + >> +#define GH_ERROR_DENIED 30 >> +#define GH_ERROR_BUSY 31 >> +#define GH_ERROR_IDLE 32 >> + >> +#define GH_ERROR_IRQ_BOUND 40 >> +#define GH_ERROR_IRQ_UNBOUND 41 >> + >> +#define GH_ERROR_CSPACE_CAP_NULL 50 >> +#define GH_ERROR_CSPACE_CAP_REVOKED 51 >> +#define GH_ERROR_CSPACE_WRONG_OBJ_TYPE 52 >> +#define GH_ERROR_CSPACE_INSUF_RIGHTS 53 >> +#define GH_ERROR_CSPACE_FULL 54 >> + >> +#define GH_ERROR_MSGQUEUE_EMPTY 60 >> +#define GH_ERROR_MSGQUEUE_FULL 61 >> + >> +static inline int gh_remap_error(int gh_error) >> +{ >> + switch (gh_error) { >> + case GH_ERROR_OK: >> + return 0; >> + case GH_ERROR_NOMEM: >> + return -ENOMEM; >> + case GH_ERROR_DENIED: >> + case GH_ERROR_CSPACE_CAP_NULL: >> + case GH_ERROR_CSPACE_CAP_REVOKED: >> + case GH_ERROR_CSPACE_WRONG_OBJ_TYPE: >> + case GH_ERROR_CSPACE_INSUF_RIGHTS: >> + case GH_ERROR_CSPACE_FULL: >> + return -EACCES; >> + case GH_ERROR_BUSY: >> + case GH_ERROR_IDLE: >> + return -EBUSY; >> + case GH_ERROR_IRQ_BOUND: >> + case GH_ERROR_IRQ_UNBOUND: >> + case GH_ERROR_MSGQUEUE_FULL: >> + case GH_ERROR_MSGQUEUE_EMPTY: >> + return -EPERM; >> + default: >> + return -EINVAL; >> + } >> +} >> + >> +#endif > >
On Mon, Aug 01, 2022 at 02:12:29PM -0700, Elliot Berman wrote: > Gunyah is a Type-1 hypervisor independent of any > high-level OS kernel, and runs in a higher CPU privilege level. It does > not depend on any lower-privileged OS kernel/code for its core > functionality. This increases its security and can support a much smaller > trusted computing base than a Type-2 hypervisor. > > Gunyah is an open source hypervisor. The source repo is available at > https://github.com/quic/gunyah-hypervisor. > > The diagram below shows the architecture. > > :: > > Primary VM Secondary VMs > +-----+ +-----+ | +-----+ +-----+ +-----+ > | | | | | | | | | | | > EL0 | APP | | APP | | | APP | | APP | | APP | > | | | | | | | | | | | > +-----+ +-----+ | +-----+ +-----+ +-----+ > ---------------------|------------------------- > +--------------+ | +----------------------+ > | | | | | > EL1 | Linux Kernel | | |Linux kernel/Other OS | ... > | | | | | > +--------------+ | +----------------------+ > --------hvc/smc------|------hvc/smc------------ > +----------------------------------------+ > | | > EL2 | Gunyah Hypervisor | > | | > +----------------------------------------+ > > Gunyah provides these following features. > > - Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on > physical CPUs and enables time-sharing of the CPUs. > - Memory Management: Gunyah tracks memory ownership and use of all memory > under its control. Memory partitioning between VMs is a fundamental > security feature. > - Interrupt Virtualization: All interrupts are handled in the hypervisor > and routed to the assigned VM. > - Inter-VM Communication: There are several different mechanisms provided > for communicating between VMs. > - Device Virtualization: Para-virtualization of devices is supported using > inter-VM communication. Low level system features and devices such as > interrupt controllers are supported with emulation where required. > Hi, I can't apply this series on top of mainline or linux-next. On what tree (and what commit) this series is based on? I'd like to do htmldocs test. Thanks.
On 8/4/2022 1:26 AM, Bagas Sanjaya wrote: > On Mon, Aug 01, 2022 at 02:12:29PM -0700, Elliot Berman wrote: >> Gunyah is a Type-1 hypervisor independent of any >> high-level OS kernel, and runs in a higher CPU privilege level. It does >> not depend on any lower-privileged OS kernel/code for its core >> functionality. This increases its security and can support a much smaller >> trusted computing base than a Type-2 hypervisor. >> >> Gunyah is an open source hypervisor. The source repo is available at >> https://github.com/quic/gunyah-hypervisor. >> >> The diagram below shows the architecture. >> >> :: >> >> Primary VM Secondary VMs >> +-----+ +-----+ | +-----+ +-----+ +-----+ >> | | | | | | | | | | | >> EL0 | APP | | APP | | | APP | | APP | | APP | >> | | | | | | | | | | | >> +-----+ +-----+ | +-----+ +-----+ +-----+ >> ---------------------|------------------------- >> +--------------+ | +----------------------+ >> | | | | | >> EL1 | Linux Kernel | | |Linux kernel/Other OS | ... >> | | | | | >> +--------------+ | +----------------------+ >> --------hvc/smc------|------hvc/smc------------ >> +----------------------------------------+ >> | | >> EL2 | Gunyah Hypervisor | >> | | >> +----------------------------------------+ >> >> Gunyah provides these following features. >> >> - Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on >> physical CPUs and enables time-sharing of the CPUs. >> - Memory Management: Gunyah tracks memory ownership and use of all memory >> under its control. Memory partitioning between VMs is a fundamental >> security feature. >> - Interrupt Virtualization: All interrupts are handled in the hypervisor >> and routed to the assigned VM. >> - Inter-VM Communication: There are several different mechanisms provided >> for communicating between VMs. >> - Device Virtualization: Para-virtualization of devices is supported using >> inter-VM communication. Low level system features and devices such as >> interrupt controllers are supported with emulation where required. >> > > Hi, > > I can't apply this series on top of mainline or linux-next. On what tree > (and what commit) this series is based on? I'd like to do htmldocs test. > The series should apply cleanly on commit 4a57a8400075 ("vf/remap: return the amount of bytes actually deduplicated") from Linus's tree. > Thanks. >
On Thu, Aug 04, 2022 at 02:48:58PM -0700, Elliot Berman wrote: > > > > Hi, > > > > I can't apply this series on top of mainline or linux-next. On what tree > > (and what commit) this series is based on? I'd like to do htmldocs test. > > > > The series should apply cleanly on commit 4a57a8400075 ("vf/remap: return > the amount of bytes actually deduplicated") from Linus's tree. > Applied, thanks. Next time, don't forget to specify --base when using git-format-patch.
On Fri, 05 Aug 2022 03:15:24 +0100, Bagas Sanjaya <bagasdotme@gmail.com> wrote: > > On Thu, Aug 04, 2022 at 02:48:58PM -0700, Elliot Berman wrote: > > > > > > Hi, > > > > > > I can't apply this series on top of mainline or linux-next. On what tree > > > (and what commit) this series is based on? I'd like to do htmldocs test. > > > > > > > The series should apply cleanly on commit 4a57a8400075 ("vf/remap: return > > the amount of bytes actually deduplicated") from Linus's tree. > > > > Applied, thanks. > > Next time, don't forget to specify --base when using git-format-patch. Or even better, use a tagged release as the base (an early -rc would do), and not some random commit. Thanks, M.
On 8/2/2022 1:14 AM, Dmitry Baryshkov wrote: > On 02/08/2022 00:12, Elliot Berman wrote: >> Gunyah message queues are unidirectional pipelines to communicate >> between 2 virtual machines, but are typically paired to allow >> bidirectional communication. The intended use case is for small control >> messages between 2 VMs, as they support a maximum of 240 bytes. >> >> Message queues can be discovered either by resource manager or on the >> devicetree. To support discovery on the devicetree, client drivers can > > devicetree and discovery do not quite match to me. The device is delared > in the DT, not discovered. > >> use gh_msgq_platform_host_attach to allocate the tx and rx message >> queues according to >> Documentation/devicetree/bindings/gunyah/qcom,hypervisor.yml. > > -ENOSUCHFILE > Should be Documentaton/devicetree/bindings/firmware/gunyah-hypervisor.yaml >> >> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> >> --- >> arch/arm64/include/asm/gunyah.h | 4 + >> drivers/virt/gunyah/Makefile | 2 +- >> drivers/virt/gunyah/gunyah_private.h | 3 + >> drivers/virt/gunyah/msgq.c | 223 +++++++++++++++++++++++++++ >> drivers/virt/gunyah/sysfs.c | 9 ++ >> include/linux/gunyah.h | 13 ++ >> 6 files changed, 253 insertions(+), 1 deletion(-) >> create mode 100644 drivers/virt/gunyah/msgq.c >> >> diff --git a/arch/arm64/include/asm/gunyah.h >> b/arch/arm64/include/asm/gunyah.h >> index 3aee35009910..ba7398bd851b 100644 >> --- a/arch/arm64/include/asm/gunyah.h >> +++ b/arch/arm64/include/asm/gunyah.h >> @@ -27,6 +27,10 @@ >> | ((fn) & GH_CALL_FUNCTION_NUM_MASK)) >> #define GH_HYPERCALL_HYP_IDENTIFY GH_HYPERCALL(0x0000) >> +#define GH_HYPERCALL_MSGQ_SEND GH_HYPERCALL(0x001B) >> +#define GH_HYPERCALL_MSGQ_RECV GH_HYPERCALL(0x001C) >> + >> +#define GH_HYPERCALL_MSGQ_SEND_FLAGS_PUSH BIT(0) >> #define ___gh_count_args(_0, _1, _2, _3, _4, _5, _6, _7, _8, x, ...) x >> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile >> index 3869fb7371df..94dc8e738911 100644 >> --- a/drivers/virt/gunyah/Makefile >> +++ b/drivers/virt/gunyah/Makefile >> @@ -1,4 +1,4 @@ >> # SPDX-License-Identifier: GPL-2.0-only >> -gunyah-y += sysfs.o device.o >> +gunyah-y += sysfs.o device.o msgq.o >> obj-$(CONFIG_GUNYAH) += gunyah.o >> \ No newline at end of file > > Newline > >> diff --git a/drivers/virt/gunyah/gunyah_private.h >> b/drivers/virt/gunyah/gunyah_private.h >> index 5f3832608020..2ade32bd9bdf 100644 >> --- a/drivers/virt/gunyah/gunyah_private.h >> +++ b/drivers/virt/gunyah/gunyah_private.h >> @@ -9,4 +9,7 @@ >> int __init gunyah_bus_init(void); >> void gunyah_bus_exit(void); >> +int __init gh_msgq_init(void); >> +void gh_msgq_exit(void); >> + >> #endif >> diff --git a/drivers/virt/gunyah/msgq.c b/drivers/virt/gunyah/msgq.c >> new file mode 100644 >> index 000000000000..afc2572d3e7d >> --- /dev/null >> +++ b/drivers/virt/gunyah/msgq.c >> @@ -0,0 +1,223 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +/* >> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights >> reserved. >> + */ >> + >> +#include <linux/interrupt.h> >> +#include <linux/gunyah.h> >> +#include <linux/module.h> >> +#include <linux/printk.h> >> +#include <linux/init.h> >> +#include <linux/slab.h> >> +#include <linux/wait.h> >> + >> +#include "gunyah_private.h" >> + >> +struct gh_msgq { >> + bool ready; >> + wait_queue_head_t wq; >> + spinlock_t lock; >> +}; >> + >> +static irqreturn_t gh_msgq_irq_handler(int irq, void *dev) >> +{ >> + struct gh_msgq *msgq = dev; >> + >> + spin_lock(&msgq->lock); >> + msgq->ready = true; >> + spin_unlock(&msgq->lock); >> + wake_up_interruptible_all(&msgq->wq); >> + >> + return IRQ_HANDLED; >> +} >> + >> +static int __gh_msgq_send(struct gunyah_device *ghdev, void *buff, >> size_t size, u64 tx_flags) >> +{ >> + unsigned long flags, gh_error; >> + struct gh_msgq *msgq = ghdev_get_drvdata(ghdev); >> + ssize_t ret; >> + bool ready; >> + >> + spin_lock_irqsave(&msgq->lock, flags); >> + arch_gh_hypercall(GH_HYPERCALL_MSGQ_SEND, 5, >> + ghdev->capid, size, (uintptr_t)buff, tx_flags, 0, >> + gh_error, ready); >> + switch (gh_error) { >> + case GH_ERROR_OK: >> + ret = 0; >> + msgq->ready = ready; >> + break; >> + case GH_ERROR_MSGQUEUE_FULL: >> + ret = -EAGAIN; >> + msgq->ready = false; >> + break; >> + default: >> + ret = gh_remap_error(gh_error); >> + break; >> + } >> + spin_unlock_irqrestore(&msgq->lock, flags); >> + >> + return ret; >> +} >> + >> +/** >> + * gh_msgq_send() - Send a message to the client running on a >> different VM >> + * @client: The client descriptor that was obtained via >> gh_msgq_register() >> + * @buff: Pointer to the buffer where the received data must be placed >> + * @buff_size: The size of the buffer space available >> + * @flags: Optional flags to pass to receive the data. For the list >> of flags, >> + * see linux/gunyah/gh_msgq.h >> + * >> + * Returns: The number of bytes copied to buff. <0 if there was an >> error. >> + * >> + * Note: this function may sleep and should not be called from >> interrupt context >> + */ >> +ssize_t gh_msgq_send(struct gunyah_device *ghdev, void *buff, size_t >> size, >> + const unsigned long flags) >> +{ >> + struct gh_msgq *msgq = ghdev_get_drvdata(ghdev); >> + ssize_t ret; >> + u64 tx_flags = 0; >> + >> + if (flags & GH_MSGQ_TX_PUSH) >> + tx_flags |= GH_HYPERCALL_MSGQ_SEND_FLAGS_PUSH; >> + >> + do { >> + ret = __gh_msgq_send(ghdev, buff, size, tx_flags); >> + >> + if (ret == -EAGAIN) { >> + if (flags & GH_MSGQ_NONBLOCK) >> + goto out; >> + if (wait_event_interruptible(msgq->wq, msgq->ready)) >> + ret = -ERESTARTSYS; >> + } >> + } while (ret == -EAGAIN); > > Any limit on the amount of retries? Can the driver wait forever here? > >> + >> +out: >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(gh_msgq_send); > > Both _send and _recv functions are not well designed. Can you call > gh_msgq_send() on any gunyah_device? Yes. Will it work? No. > My intention here is to rely on hypervisor to properly complain about it. I thought it better to not have redundant checks, but I can add it in the Linux layer as well. > Could you please check if mailbox API work for you? It seems that it is > what you are trying to implement on your own. > My understanding is that mailbox API was designed for queuing single register accesses. The mailbox APIs aren't intended to queue up arbitrary length messages like needed here. rpmsg is similar in the sense that it had variable length messages and doesn't use the mailbox framework for this reason. >> + >> +static ssize_t __gh_msgq_recv(struct gunyah_device *ghdev, void >> *buff, size_t size) >> +{ >> + unsigned long flags, gh_error; >> + size_t recv_size; >> + struct gh_msgq *msgq = ghdev_get_drvdata(ghdev); >> + ssize_t ret; >> + bool ready; >> + >> + spin_lock_irqsave(&msgq->lock, flags); >> + >> + arch_gh_hypercall(GH_HYPERCALL_MSGQ_RECV, 4, >> + ghdev->capid, (uintptr_t)buff, size, 0, >> + gh_error, recv_size, ready); >> + switch (gh_error) { >> + case GH_ERROR_OK: >> + ret = recv_size; >> + msgq->ready = ready; >> + break; >> + case GH_ERROR_MSGQUEUE_EMPTY: >> + ret = -EAGAIN; >> + msgq->ready = false; >> + break; >> + default: >> + ret = gh_remap_error(gh_error); >> + break; >> + } >> + spin_unlock_irqrestore(&msgq->lock, flags); >> + >> + return ret; >> +} >> + >> +/** >> + * gh_msgq_recv() - Receive a message from the client running on a >> different VM >> + * @client: The client descriptor that was obtained via >> gh_msgq_register() >> + * @buff: Pointer to the buffer where the received data must be placed >> + * @buff_size: The size of the buffer space available >> + * @flags: Optional flags to pass to receive the data. For the list >> of flags, >> + * see linux/gunyah/gh_msgq.h >> + * >> + * Returns: The number of bytes copied to buff. <0 if there was an >> error. >> + * >> + * Note: this function may sleep and should not be called from >> interrupt context >> + */ >> +ssize_t gh_msgq_recv(struct gunyah_device *ghdev, void *buff, size_t >> size, >> + const unsigned long flags) >> +{ >> + struct gh_msgq *msgq = ghdev_get_drvdata(ghdev); >> + ssize_t ret; >> + >> + do { >> + ret = __gh_msgq_recv(ghdev, buff, size); >> + >> + if (ret == -EAGAIN) { >> + if (flags & GH_MSGQ_NONBLOCK) >> + goto out; >> + if (wait_event_interruptible(msgq->wq, msgq->ready)) >> + ret = -ERESTARTSYS; >> + } >> + } while (ret == -EAGAIN); >> + >> +out: >> + return ret; >> +} >> +EXPORT_SYMBOL_GPL(gh_msgq_recv); >> + >> +static int gh_msgq_probe(struct gunyah_device *ghdev) >> +{ >> + struct gh_msgq *msgq; >> + >> + msgq = devm_kzalloc(&ghdev->dev, sizeof(*msgq), GFP_KERNEL); >> + if (!msgq) >> + return -ENOMEM; >> + ghdev_set_drvdata(ghdev, msgq); >> + >> + msgq->ready = true; /* Assume we can use the message queue right >> away */ >> + init_waitqueue_head(&msgq->wq); >> + spin_lock_init(&msgq->lock); >> + >> + return devm_request_irq(&ghdev->dev, ghdev->irq, >> gh_msgq_irq_handler, 0, >> + dev_name(&ghdev->dev), msgq); >> +} >> + >> +static struct gunyah_driver gh_msgq_tx_driver = { >> + .driver = { >> + .name = "gh_msgq_tx", >> + .owner = THIS_MODULE, >> + }, >> + .type = GUNYAH_DEVICE_TYPE_MSGQ_TX, >> + .probe = gh_msgq_probe, >> +}; >> + >> +static struct gunyah_driver gh_msgq_rx_driver = { >> + .driver = { >> + .name = "gh_msgq_rx", >> + .owner = THIS_MODULE, >> + }, >> + .type = GUNYAH_DEVICE_TYPE_MSGQ_RX, >> + .probe = gh_msgq_probe, > > If you have to duplicate the whole device structure just to bind to two > difference devices, it looks like a bad abstraction. Please check how > other busses have solved this issue. They did, believe me. > >> +}; > > MODULE_DEVICE_TABLE() ? > >> + >> +int __init gh_msgq_init(void) >> +{ >> + int ret; >> + >> + ret = gunyah_register_driver(&gh_msgq_tx_driver); >> + if (ret) >> + return ret; >> + >> + ret = gunyah_register_driver(&gh_msgq_rx_driver); >> + if (ret) >> + goto err_rx; >> + >> + return ret; >> +err_rx: >> + gunyah_unregister_driver(&gh_msgq_tx_driver); >> + return ret; >> +} >> + >> +void gh_msgq_exit(void) >> +{ >> + gunyah_unregister_driver(&gh_msgq_rx_driver); >> + gunyah_unregister_driver(&gh_msgq_tx_driver); >> +} >> diff --git a/drivers/virt/gunyah/sysfs.c b/drivers/virt/gunyah/sysfs.c >> index 220560cb3b1c..7589689e5e92 100644 >> --- a/drivers/virt/gunyah/sysfs.c >> +++ b/drivers/virt/gunyah/sysfs.c >> @@ -73,6 +73,8 @@ static ssize_t features_show(struct kobject *kobj, >> struct kobj_attribute *attr, >> if (GH_IDENTIFY_PARTITION_CSPACE(gunyah_api.flags)) >> len += sysfs_emit_at(buffer, len, "cspace "); >> + if (GH_IDENTIFY_MSGQUEUE(gunyah_api.flags)) >> + len += sysfs_emit_at(buffer, len, "message-queue "); > > Again, this should go to the sysfs patch. > >> len += sysfs_emit_at(buffer, len, "\n"); >> return len; >> @@ -142,7 +144,13 @@ static int __init gunyah_init(void) >> if (ret) >> goto err_sysfs; >> + ret = gh_msgq_init(); >> + if (ret) >> + goto err_bus; >> + > > Please stop beating everything in a single module. Having a provider > (bus) and a consumer (drivers for this bus) in a single module sounds > like an overkill. Or, a wrong abstraction. > > Please remind me, why do you need gunyah bus in the first place? I could > not find any other calls to gunyah_device_add in this series. Which > devices do you expect to be added in future? Would they require separate > drivers? > In a future series, I'll add the support to load other virtual machines. When running other virtual machines, additional gunyah devices are needed for doorbells (e.g. to emulate interrupts for paravirtualized devices) and to represent the vCPUs of that other VM. Other gunyah devices are also possible, but those are the immediate devices coming over the horizon. I had an offline discussion with Bjorn and he was recommending dropping the bus/device/driver design entirely. >> return ret; >> +err_bus: >> + gunyah_bus_exit(); >> err_sysfs: >> gh_sysfs_unregister(); >> return ret; >> @@ -151,6 +159,7 @@ module_init(gunyah_init); >> static void __exit gunyah_exit(void) >> { >> + gh_msgq_exit(); >> gunyah_bus_exit(); >> gh_sysfs_unregister(); >> } >> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h >> index ce35f4491773..099224f9d6d1 100644 >> --- a/include/linux/gunyah.h >> +++ b/include/linux/gunyah.h >> @@ -6,6 +6,7 @@ >> #ifndef _GUNYAH_H >> #define _GUNYAH_H >> +#include <linux/platform_device.h> >> #include <linux/device.h> >> #include <linux/types.h> >> #include <linux/errno.h> >> @@ -117,4 +118,16 @@ struct gunyah_driver { >> int gunyah_register_driver(struct gunyah_driver *ghdrv); >> void gunyah_unregister_driver(struct gunyah_driver *ghdrv); >> +#define GH_MSGQ_MAX_MSG_SIZE 1024 >> + >> +/* Possible flags to pass for Tx or Rx */ >> +#define GH_MSGQ_TX_PUSH BIT(0) >> +#define GH_MSGQ_NONBLOCK BIT(32) >> + >> +ssize_t gh_msgq_send(struct gunyah_device *ghdev, void *buff, size_t >> size, >> + const unsigned long flags); >> +ssize_t gh_msgq_recv(struct gunyah_device *ghdev, void *buff, size_t >> size, >> + const unsigned long flags); >> + >> + >> #endif > >
On 8/2/2022 2:24 AM, Dmitry Baryshkov wrote: > On 02/08/2022 00:12, Elliot Berman wrote: >> Gunyah is a Type-1 hypervisor independent of any >> high-level OS kernel, and runs in a higher CPU privilege level. It does >> not depend on any lower-privileged OS kernel/code for its core >> functionality. This increases its security and can support a much smaller >> trusted computing base than a Type-2 hypervisor. >> >> Gunyah is an open source hypervisor. The source repo is available at >> https://github.com/quic/gunyah-hypervisor. >> >> The diagram below shows the architecture. >> >> :: >> >> Primary VM Secondary VMs > > Is there any significant difference between Primary VM and other VMs? > The primary VM is started by RM. Secondary VMs are not otherwise special except that they are (usually) launched by the primary VM. >> +-----+ +-----+ | +-----+ +-----+ +-----+ >> | | | | | | | | | | | >> EL0 | APP | | APP | | | APP | | APP | | APP | >> | | | | | | | | | | | >> +-----+ +-----+ | +-----+ +-----+ +-----+ >> ---------------------|------------------------- >> +--------------+ | +----------------------+ >> | | | | | >> EL1 | Linux Kernel | | |Linux kernel/Other OS | ... >> | | | | | >> +--------------+ | +----------------------+ >> --------hvc/smc------|------hvc/smc------------ >> +----------------------------------------+ >> | | >> EL2 | Gunyah Hypervisor | >> | | >> +----------------------------------------+ >> >> Gunyah provides these following features. >> >> - Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on >> physical CPUs and enables time-sharing of the CPUs. > > Is the scheduling provided behind the back of the OS or does it require > cooperation? > Gunyah supports both of these scheduling models. For instance, scheduling of resource manager and the primary VM are done by Gunyah itself. A VM that the primary VM launches could be scheduled by the primary VM itself (by making a hypercall requesting a vCPU be switched in), or by Gunyah itself. We've been calling the former "proxy scheduling" and this would be the default behavior of VMs. >> - Memory Management: Gunyah tracks memory ownership and use of all memory >> under its control. Memory partitioning between VMs is a fundamental >> security feature. >> - Interrupt Virtualization: All interrupts are handled in the hypervisor >> and routed to the assigned VM. >> - Inter-VM Communication: There are several different mechanisms provided >> for communicating between VMs. >> - Device Virtualization: Para-virtualization of devices is supported >> using >> inter-VM communication. Low level system features and devices such as >> interrupt controllers are supported with emulation where required. > > After reviewing some of the patches from the series, I'd like to > understand, what does it provide (and can be provided) to the VMs. > > I'd like to understand it first, before going deep into the API issues. > > 1) The hypervisor provides message queues, doorbells and vCPUs > > Each of resources has it's own capability ID. > Why is it called capability? Is it just a misname for the resource ID, > or has it any other meaning behind? If it is a capability, who is > capable of what? > We are following Gunyah's naming convention here. For each virtual machine, Gunyah maintains a table of resources which can be accessed by that VM. An entry in this table is called a "capability" and VMs can only access resources via this capability table. Hence, they get called "capability IDs" and not "resource IDs". A VM can have multiple capability IDs mapping to the same resource. If 2 VMs have access to the same resource, they may not be using the same capability ID to access that resource since the tables are independent per VM. > At this moment you create allocate two message queues with fixed IDs for > communication with resource manager. Then you use these message queues > to organize a console and a pack of tty devices. > > What other kinds of services does RM provide to the guest OS? > Do you expect any other drivers to be calling into the RM? > I want to establish the framework to build a VM loader for Gunyah. Internally, we are working with a prototype of a "generic VM loader" which works with crosvm [1]. In this generic VM loader, memory sharing, memory lending, cooperative scheduling, and raising virtual interrupts are all supported. Emulating virtio devices in userspace is supported in a way which feels very similar to KVM. Our internal VM loader uses an IOCTL interface which is similar to KVM's. > What is the usecase for the doorbells? Who provides doorbells? > The basic use case I'll start with is for userspace to create an IRQFD. Userspace can use the IRQFD to raise a doorbell (interrupt) on the other VM. > You mentioned that the RM generates DT overlays. What kind of > information goes to the overlay? > The info is described in Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml. > My current impression of this series is that you have misused the > concept of devices. Rather than exporting MSGQs and BELLs as > gunyah_devices and then using them from other drivers, I'd suggest > turning them into resources provided by the gunyah driver core. I > mentioned using the mailbox API for this. Another subsystem that might > ring the bell for you is the remoteproc, especially the rproc_subdev. > I had an offline discussion with Bjorn and he agreed with this approach here. He suggested avoiding using the device bus model and will go with smaller approach in v3. > I might be completely wrong about this, but if my in-mind picture of > Gunyah is correct, I'd have implemented the gunyah core subsytem as > mailbox provider, RM as a separate platform driver consuming these > mailboxes and in turn being a remoteproc driver, and consoles as > remoteproc subdevices. > The mailbox framework can only fit with message queues and not doorbells or vCPUs. The mailbox framework also relies on the mailbox being defined in the devicetree. RM is an exceptional case in that it is described in the devicetree. Message queues for other VMs would be dynamically created at runtime as/when that VM is created. Thus, the client of the message queue would need to "own" both the controller and client ends of the mailbox. RM is not loaded or managed by Linux, so I don't think remoteproc framework provides us any code re-use except for the subdevices code. Remoteproc is much larger framework than just the subdevices code, so I don't think it fits well overall. > I can assume that at some point you would like to use Gunyah to boot > secondary VMs from the primary VM by calling into RM, etc. > Most probably at this moment a VM would be allocated other bells, > message queues, etc. If this assumption is correct, them the VM can > become a separate device (remoteproc?) in the Linux device tree. > > I might be wrong in any of the assumptions above. Please feel free to > correct me. We can then think about a better API for your usecase. > We don't want to limit VM configuration to the devicetree as this limits the number and kinds of VMs that can be launched to build time. I'm not sure if you might have seen an early presentation of Gunyah at Linaro? In the early days of Gunyah, we had static configuration of VMs and many properties of the VMs were described in the devicetree. We are moving away from static configuration of VMs as much as possible. [1]: https://chromium.googlesource.com/chromiumos/platform/crosvm
[drive-by observation since one thing caught my interest...] On 2022-08-09 00:38, Elliot Berman wrote: >> I might be completely wrong about this, but if my in-mind picture of >> Gunyah is correct, I'd have implemented the gunyah core subsytem as >> mailbox provider, RM as a separate platform driver consuming these >> mailboxes and in turn being a remoteproc driver, and consoles as >> remoteproc subdevices. > > > The mailbox framework can only fit with message queues and not doorbells > or vCPUs. Is that so? There was a whole long drawn-out saga around the SCMI protocol using the Arm MHU mailbox as a set of doorbells for shared-memory payloads, but it did eventually get merged as the separate arm_mhu_db.c driver, so unless we're talking about some completely different notion of "doorbell"... :/ > The mailbox framework also relies on the mailbox being defined > in the devicetree. RM is an exceptional case in that it is described in > the devicetree. Message queues for other VMs would be dynamically > created at runtime as/when that VM is created. Thus, the client of the > message queue would need to "own" both the controller and client ends of > the mailbox. FWIW, if the mailbox API does fit conceptually then it looks like it shouldn't be *too* hard to better abstract the DT details in the framework itself and allow providers to offer additional means to validate channel requests, which might be more productive than inventing a whole new thing. Thanks, Robin.
On 8/9/2022 4:29 AM, Marc Zyngier wrote: > On Mon, 08 Aug 2022 23:22:48 +0100, > Elliot Berman <quic_eberman@quicinc.com> wrote: >> >> In a future series, I'll add the support to load other virtual >> machines. When running other virtual machines, additional gunyah >> devices are needed for doorbells (e.g. to emulate interrupts for >> paravirtualized devices) and to represent the vCPUs of that other >> VM. Other gunyah devices are also possible, but those are the >> immediate devices coming over the horizon. > > Can you elaborate on this "doorbell" aspect? If you signal interrupts > to guests, they should be signalled as actual interrupts, not as some > hypervisor-specific events, as we rely on the interrupt semantics for > most things. > > Or are you talking about injecting an interrupt from a guest into > another, the doorbell representing an interrupt source? > Doorbells can operate either of these modes: 1. As simple interrupt sources. The doorbell sender makes a hypercall and an interrupt is raised on the receiver. The hypervisor can be configured to raise a specific SPI on the receiver VM and simply acknowledging the SPI is enough to clear the interrupt assert. No hypervisor-specific code is needed on the receiver to handle these interrupts. This is the mode one would expect to use for paravirtualized devices. 2. As hypervisor-specific events which must be acknowledged using hypercalls. We aren't currently using this advanced use-case and no plans currently to post these. However, I can try to briefly explain: These doorbells can operate on a bitfield and the sender can assert flags on the bitmask; the receiver can decide which bits should trigger the interrupt and which SPI the doorbell "runs" on. The "user story" for this doorbell is to support multiple sender using the same doorbell object. Each sender has a few designated bits they should set. The receiver can choose which events it wants an interrupt to be raised for and then can process all the pending events. To re-iterate, we don't have an interesting use-case for this yet, so don't plan on post patches for this second mode of doorbell. > Thanks, > > M. >
On 8/9/2022 6:13 AM, Robin Murphy wrote: > [drive-by observation since one thing caught my interest...] > Appreciate all the comments. Jassi, I understood you have talked with some of our folks (Trilok and Carl) a few years ago about using the mailbox APIs. We were steered away from using mailboxes then. Is that still the recommendation today? > On 2022-08-09 00:38, Elliot Berman wrote: >>> I might be completely wrong about this, but if my in-mind picture of >>> Gunyah is correct, I'd have implemented the gunyah core subsytem as >>> mailbox provider, RM as a separate platform driver consuming these >>> mailboxes and in turn being a remoteproc driver, and consoles as >>> remoteproc subdevices. > >> >> The mailbox framework can only fit with message queues and not >> doorbells or vCPUs. > > Is that so? There was a whole long drawn-out saga around the SCMI > protocol using the Arm MHU mailbox as a set of doorbells for > shared-memory payloads, but it did eventually get merged as the separate > arm_mhu_db.c driver, so unless we're talking about some completely > different notion of "doorbell"... :/ > Doorbells will be harder to fit into mailbox API framework. - Simple doorbells don't have any TX done acknowledgement model at the doorbell layer (see bullet 1 from https://lore.kernel.org/all/68e241fd-16f0-96b4-eab8-369628292e03@quicinc.com/). Doorbell clients might have a doorbell acknowledgement flow, but the only client I have for doorbells doesn't. IRQFDs would send an empty message to the mailbox and immediately do a client-triggered TX_DONE. - Using mailboxes for the more advanced use-case doorbell forces client to use doorbells a certain way because each channel could be a bit on the bitmask, or the client could have complete control of the entire bitmask. I think implementing the mailbox API would force the otherwise-generic doorbell code to make that decision for clients. Further, I wanted to highlight one other challenge with fitting Gunyah message queues into mailbox API: - Message queues track a flag which indicates whether there is space available in the queue. The flag is returned on msgq_send. When the message queue is full, an interrupt is raised when there is more space available. This could be used as a TX_DONE indicator, but mailbox framework's API prevents us from doing mbox_chan_txdone inside the send_data channel op. I think this might be solvable by adding a new txdone mechanism. >> The mailbox framework also relies on the mailbox being defined in the >> devicetree. RM is an exceptional case in that it is described in the >> devicetree. Message queues for other VMs would be dynamically created >> at runtime as/when that VM is created. Thus, the client of the message >> queue would need to "own" both the controller and client ends of the >> mailbox. > > FWIW, if the mailbox API does fit conceptually then it looks like it > shouldn't be *too* hard to better abstract the DT details in the > framework itself and allow providers to offer additional means to > validate channel requests, which might be more productive than inventing > a whole new thing. > Some notes about fitting mailboxes into Gunyah IPC: - A single mailbox controller can't cover all the gunyah devices. The number of gunyah devices is not fixed and varies per VM launched. Mailbox controller would need to be per-VM or per-device, where each channel represents a capability. - The other device types (like vCPU) don't fit into message-based style framework. I'd like to have a consistent way of binding a device's function with the device. If we use mailbox API, some devices will use mailbox and others will use some other mechanism. I'd prefer to consistently use "some other mechanism" throughout. - TX and RX message queues are independent and "combining" a TX and RX message queue happens at client layer by the client requesting access to two otherwise unassociated message queues. A mailbox channel would either be associated with a TX message queue capability or an RX message queue capability. This isn't a major hurdle per se, but it decreases how cleanly we can use the mailbox APIs IMO. - A VM might only have a TX message queue and no RX message queue, or vice versa. We won't be able to require coupling a TX and RX message queue for the mailbox. - TX done acknowledgement doesn't fit Gunyah IPC (see above) and a new TX_DONE mode would need to be implemented. - Need to make it possible for a client to binding a mailbox channel without DT. I'm getting a bit apprehensive about the tweaks needed to make mailbox framework usable for Gunyah. Will there be enough code re-use and help with abstracting the direct-to-Gunyah APIs? IMO, there isn't, but opinions are welcome :) Thanks, Elliot
On Tue, Aug 9, 2022 at 7:07 PM Elliot Berman <quic_eberman@quicinc.com> wrote: > > On 8/9/2022 6:13 AM, Robin Murphy wrote: > > [drive-by observation since one thing caught my interest...] > > > Appreciate all the comments. > > Jassi, > > I understood you have talked with some of our folks (Trilok and Carl) a > few years ago about using the mailbox APIs. We were steered away from > using mailboxes then. Is that still the recommendation today? > Neither I nor Google remember any such conversation. Doorbell had always been supported by the api. It was the doorbell-mode of _mhu_ controller that had some contention. I haven't read the complete history of Gunyah yet, but from a quick look it uses the hvc/smc instruction as the "physical link" between entities (?). zynqmp-ipi-mailbox.c is one driver that uses smc in such a manner. And I know there are some platforms that don't call hvc/smc under mailbox api and I don't blame them. Let me educate myself with the background and get back.... unless you want to summarize a usecase that you doubt is supported. Thanks.
On 8/9/2022 9:12 PM, Jassi Brar wrote: > On Tue, Aug 9, 2022 at 7:07 PM Elliot Berman <quic_eberman@quicinc.com> wrote: > > I haven't read the complete history of Gunyah yet, but from a quick > look it uses the hvc/smc instruction as the "physical link" between > entities (?). zynqmp-ipi-mailbox.c is one driver that uses smc in > such a manner. And I know there are some platforms that don't call > hvc/smc under mailbox api and I don't blame them. > > Let me educate myself with the background and get back.... unless you > want to summarize a usecase that you doubt is supported. > Hi Jassi, Did you have chance to evaluate? I have given a summary in this mail, especially the last paragraph: https://lore.kernel.org/all/36303c20-5d30-2edd-0863-0cad804e3f8f@quicinc.com/ Thanks, Elliot
On 09/08/2022 19:50, Elliot Berman wrote: > > > On 8/9/2022 4:29 AM, Marc Zyngier wrote: >> On Mon, 08 Aug 2022 23:22:48 +0100, >> Elliot Berman <quic_eberman@quicinc.com> wrote: >>> >>> In a future series, I'll add the support to load other virtual >>> machines. When running other virtual machines, additional gunyah >>> devices are needed for doorbells (e.g. to emulate interrupts for >>> paravirtualized devices) and to represent the vCPUs of that other >>> VM. Other gunyah devices are also possible, but those are the >>> immediate devices coming over the horizon. >> >> Can you elaborate on this "doorbell" aspect? If you signal interrupts >> to guests, they should be signalled as actual interrupts, not as some >> hypervisor-specific events, as we rely on the interrupt semantics for >> most things. >> >> Or are you talking about injecting an interrupt from a guest into >> another, the doorbell representing an interrupt source? >> > > Doorbells can operate either of these modes: > 1. As simple interrupt sources. The doorbell sender makes a hypercall > and an interrupt is raised on the receiver. The hypervisor can be > configured to raise a specific SPI on the receiver VM and simply > acknowledging the SPI is enough to clear the interrupt assert. No > hypervisor-specific code is needed on the receiver to handle these > interrupts. This is the mode one would expect to use for > paravirtualized devices. This sounds good. > 2. As hypervisor-specific events which must be acknowledged using > hypercalls. We aren't currently using this advanced use-case and no > plans currently to post these. However, I can try to briefly > explain: These doorbells can operate on a bitfield and the sender > can assert flags on the bitmask; the receiver can decide which bits > should trigger the interrupt and which SPI the doorbell "runs" on. > The "user story" for this doorbell is to support multiple sender > using the same doorbell object. Each sender has a few designated > bits they should set. The receiver can choose which events it wants > an interrupt to be raised for and then can process all the pending > events. To re-iterate, we don't have an interesting use-case for > this yet, so don't plan on post patches for this second mode of > doorbell. Well. For me this sounds like 'we have such capability, no real usecase, but we want to support it anyway' kind of story. As history has shown multiple times, the order should be the opposite one. First you have the use case, then you create the API for it. Otherwise it is very easy to end up with the abstraction that looks good on the API side, but is very hard to fit into the actual user code. I would suggest to drop the second bullet for now and focus on getting the simple doorbells done and accepted into mainline.
On 09/08/2022 02:38, Elliot Berman wrote: > > > On 8/2/2022 2:24 AM, Dmitry Baryshkov wrote: >> I might be completely wrong about this, but if my in-mind picture of >> Gunyah is correct, I'd have implemented the gunyah core subsytem as >> mailbox provider, RM as a separate platform driver consuming these >> mailboxes and in turn being a remoteproc driver, and consoles as >> remoteproc subdevices. > > > The mailbox framework can only fit with message queues and not doorbells > or vCPUs. The mailbox framework also relies on the mailbox being defined > in the devicetree. RM is an exceptional case in that it is described in > the devicetree. Message queues for other VMs would be dynamically > created at runtime as/when that VM is created. Thus, the client of the > message queue would need to "own" both the controller and client ends of > the mailbox. I'd still suggest using the mailbox API for the doorbells. You do not have to implement the txdone, if I'm not mistaken. > > RM is not loaded or managed by Linux, so I don't think remoteproc > framework provides us any code re-use except for the subdevices code. > Remoteproc is much larger framework than just the subdevices code, so I > don't think it fits well overall. > >> I can assume that at some point you would like to use Gunyah to boot >> secondary VMs from the primary VM by calling into RM, etc. >> Most probably at this moment a VM would be allocated other bells, >> message queues, etc. If this assumption is correct, them the VM can >> become a separate device (remoteproc?) in the Linux device tree. >> >> I might be wrong in any of the assumptions above. Please feel free to >> correct me. We can then think about a better API for your usecase. >> > > We don't want to limit VM configuration to the devicetree as this limits > the number and kinds of VMs that can be launched to build time. I'm not > sure if you might have seen an early presentation of Gunyah at Linaro? > In the early days of Gunyah, we had static configuration of VMs and many > properties of the VMs were described in the devicetree. We are moving > away from static configuration of VMs as much as possible. ack, this is correct. > > [1]: https://chromium.googlesource.com/chromiumos/platform/crosvm >