[v10,00/26] Drivers for Gunyah hypervisor

Message ID	20230214211229.3239350-1-quic_eberman@quicinc.com
Headers	show Return-Path: <linux-arm-msm-owner@vger.kernel.org> From: Elliot Berman <quic_eberman@quicinc.com> To: Alex Elder <elder@linaro.org>, Srinivas Kandagatla <srinivas.kandagatla@linaro.org> CC: Elliot Berman <quic_eberman@quicinc.com>, Murali Nalajala <quic_mnalajal@quicinc.com>, Trilok Soni <quic_tsoni@quicinc.com>, "Srivatsa Vaddagiri" <quic_svaddagi@quicinc.com>, Carl van Schaik <quic_cvanscha@quicinc.com>, Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>, Dmitry Baryshkov <dmitry.baryshkov@linaro.org>, Bjorn Andersson <andersson@kernel.org>, Konrad Dybcio <konrad.dybcio@linaro.org>, Arnd Bergmann <arnd@arndb.de>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Rob Herring <robh+dt@kernel.org>, "Krzysztof Kozlowski" <krzysztof.kozlowski+dt@linaro.org>, Jonathan Corbet <corbet@lwn.net>, Bagas Sanjaya <bagasdotme@gmail.com>, Catalin Marinas <catalin.marinas@arm.com>, Jassi Brar <jassisinghbrar@gmail.com>, <linux-arm-msm@vger.kernel.org>, <devicetree@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <linux-doc@vger.kernel.org>, <linux-arm-kernel@lists.infradead.org> Subject: [PATCH v10 00/26] Drivers for Gunyah hypervisor Date: Tue, 14 Feb 2023 13:12:03 -0800 Message-ID: <20230214211229.3239350-1-quic_eberman@quicinc.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain Precedence: bulk
Series	Drivers for Gunyah hypervisor \| expand [v10,00/26] Drivers for Gunyah hypervisor [v10,02/26] dt-bindings: Add binding for gunyah hypervisor [v10,04/26] virt: gunyah: Add hypercalls to identify Gunyah [v10,05/26] virt: gunyah: Identify hypervisor version [v10,08/26] gunyah: rsc_mgr: Add resource manager RPC core [v10,11/26] gunyah: rsc_mgr: Add RPC for sharing memory [v10,12/26] gunyah: vm_mgr: Add/remove user memory regions [v10,14/26] samples: Add sample userspace Gunyah VM Manager [v10,16/26] firmware: qcom_scm: Register Gunyah platform ops [v10,18/26] virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource [v10,19/26] gunyah: vm_mgr: Add framework to add VM Functions [v10,20/26] virt: gunyah: Add resource tickets [v10,23/26] virt: gunyah: Add hypercalls for sending doorbell [v10,25/26] virt: gunyah: Add ioeventfd

Elliot Berman Feb. 14, 2023, 9:12 p.m. UTC

Gunyah is a Type-1 hypervisor independent of any
high-level OS kernel, and runs in a higher CPU privilege level. It does
not depend on any lower-privileged OS kernel/code for its core
functionality. This increases its security and can support a much smaller
trusted computing base than a Type-2 hypervisor.

Gunyah is an open source hypervisor. The source repo is available at
https://github.com/quic/gunyah-hypervisor.

The diagram below shows the architecture.

::

         VM A                    VM B
     +-----+ +-----+  | +-----+ +-----+ +-----+
     |     | |     |  | |     | |     | |     |
 EL0 | APP | | APP |  | | APP | | APP | | APP |
     |     | |     |  | |     | |     | |     |
     +-----+ +-----+  | +-----+ +-----+ +-----+
 ---------------------|-------------------------
     +--------------+ | +----------------------+
     |              | | |                      |
 EL1 | Linux Kernel | | |Linux kernel/Other OS |   ...
     |              | | |                      |
     +--------------+ | +----------------------+
 --------hvc/smc------|------hvc/smc------------
     +----------------------------------------+
     |                                        |
 EL2 |            Gunyah Hypervisor           |
     |                                        |
     +----------------------------------------+

Gunyah provides these following features.

- Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on
physical CPUs and enables time-sharing of the CPUs.
- Memory Management: Gunyah tracks memory ownership and use of all memory
under its control. Memory partitioning between VMs is a fundamental
security feature.
- Interrupt Virtualization: All interrupts are handled in the hypervisor
and routed to the assigned VM.
- Inter-VM Communication: There are several different mechanisms provided
for communicating between VMs.
- Device Virtualization: Para-virtualization of devices is supported using
inter-VM communication. Low level system features and devices such as
interrupt controllers are supported with emulation where required.

This series adds the basic framework for detecting that Linux is running
under Gunyah as a virtual machine, communication with the Gunyah Resource
Manager, and a virtual machine manager capable of launching virtual machines.

The series relies on two other patches posted separately:
 - https://lore.kernel.org/all/20230213181832.3489174-1-quic_eberman@quicinc.com/
 - https://lore.kernel.org/all/20230213232537.2040976-2-quic_eberman@quicinc.com/

Changes in v10:
 - Fix bisectability (end result of series is same, --fixups applied to wrong commits)
 - Convert GH_ERROR_* and GH_RM_ERROR_* to enums
 - Correct race condition between allocating/freeing user memory
 - Replace offsetof with struct_size
 - Series-wide renaming of functions to be more consistent
 - VM shutdown & restart support added in vCPU and VM Manager patches
 - Convert VM function name (string) to type (number)
 - Convert VM function argument to value (which could be a pointer) to remove memory wastage for arguments
 - Remove defensive checks of hypervisor correctness
 - Clean ups to ioeventfd as suggested by Srivatsa

Changes in v9: https://lore.kernel.org/all/20230120224627.4053418-1-quic_eberman@quicinc.com/
 - Refactor Gunyah API flags to be exposed as feature flags at kernel level
 - Move mbox client cleanup into gunyah_msgq_remove()
 - Simplify gh_rm_call return value and response payload
 - Missing clean-up/error handling/little endian fixes as suggested by Srivatsa and Alex in v8 series

Changes in v8: https://lore.kernel.org/all/20221219225850.2397345-1-quic_eberman@quicinc.com/
 - Treat VM manager as a library of RM
 - Add patches 21-28 as RFC to support proxy-scheduled vCPUs and necessary bits to support virtio
   from Gunyah userspace

Changes in v7: https://lore.kernel.org/all/20221121140009.2353512-1-quic_eberman@quicinc.com/
 - Refactor to remove gunyah RM bus
 - Refactor allow multiple RM device instances
 - Bump UAPI to start at 0x0
 - Refactor QCOM SCM's platform hooks to allow CONFIG_QCOM_SCM=Y/CONFIG_GUNYAH=M combinations

Changes in v6: https://lore.kernel.org/all/20221026185846.3983888-1-quic_eberman@quicinc.com/
 - *Replace gunyah-console with gunyah VM Manager*
 - Move include/asm-generic/gunyah.h into include/linux/gunyah.h
 - s/gunyah_msgq/gh_msgq/
 - Minor tweaks and documentation tidying based on comments from Jiri, Greg, Arnd, Dmitry, and Bagas.

Changes in v5: https://lore.kernel.org/all/20221011000840.289033-1-quic_eberman@quicinc.com/
 - Dropped sysfs nodes
 - Switch from aux bus to Gunyah RM bus for the subdevices
 - Cleaning up RM console

Changes in v4: https://lore.kernel.org/all/20220928195633.2348848-1-quic_eberman@quicinc.com/
 - Tidied up documentation throughout based on questions/feedback received
 - Switched message queue implementation to use mailboxes
 - Renamed "gunyah_device" as "gunyah_resource"

Changes in v3: https://lore.kernel.org/all/20220811214107.1074343-1-quic_eberman@quicinc.com/
 - /Maintained/Supported/ in MAINTAINERS
 - Tidied up documentation throughout based on questions/feedback received
 - Moved hypercalls into arch/arm64/gunyah/; following hyper-v's implementation
 - Drop opaque typedefs
 - Move sysfs nodes under /sys/hypervisor/gunyah/
 - Moved Gunyah console driver to drivers/tty/
 - Reworked gunyah_device design to drop the Gunyah bus.

Changes in v2: https://lore.kernel.org/all/20220801211240.597859-1-quic_eberman@quicinc.com/
 - DT bindings clean up
 - Switch hypercalls to follow SMCCC 

v1: https://lore.kernel.org/all/20220223233729.1571114-1-quic_eberman@quicinc.com/

Elliot Berman (26):
  docs: gunyah: Introduce Gunyah Hypervisor
  dt-bindings: Add binding for gunyah hypervisor
  gunyah: Common types and error codes for Gunyah hypercalls
  virt: gunyah: Add hypercalls to identify Gunyah
  virt: gunyah: Identify hypervisor version
  virt: gunyah: msgq: Add hypercalls to send and receive messages
  mailbox: Add Gunyah message queue mailbox
  gunyah: rsc_mgr: Add resource manager RPC core
  gunyah: rsc_mgr: Add VM lifecycle RPC
  gunyah: vm_mgr: Introduce basic VM Manager
  gunyah: rsc_mgr: Add RPC for sharing memory
  unyah: vm_mgr: Add/remove user memory regions
  gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot
  samples: Add sample userspace Gunyah VM Manager
  gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim
  firmware: qcom_scm: Register Gunyah platform ops
  docs: gunyah: Document Gunyah VM Manager
  virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource
  gunyah: vm_mgr: Add framework to add VM Functions
  virt: gunyah: Add resource tickets
  virt: gunyah: Add IO handlers
  virt: gunyah: Add proxy-scheduled vCPUs
  virt: gunyah: Add hypercalls for sending doorbell
  virt: gunyah: Add irqfd interface
  virt: gunyah: Add ioeventfd
  MAINTAINERS: Add Gunyah hypervisor drivers section

 .../bindings/firmware/gunyah-hypervisor.yaml  |  82 ++
 .../userspace-api/ioctl/ioctl-number.rst      |   1 +
 Documentation/virt/gunyah/index.rst           | 114 +++
 Documentation/virt/gunyah/message-queue.rst   |  69 ++
 Documentation/virt/gunyah/vm-manager.rst      | 193 +++++
 Documentation/virt/index.rst                  |   1 +
 MAINTAINERS                                   |  13 +
 arch/arm64/Kbuild                             |   1 +
 arch/arm64/gunyah/Makefile                    |   3 +
 arch/arm64/gunyah/gunyah_hypercall.c          | 146 ++++
 arch/arm64/include/asm/gunyah.h               |  23 +
 drivers/firmware/Kconfig                      |   2 +
 drivers/firmware/qcom_scm.c                   | 100 +++
 drivers/mailbox/Makefile                      |   2 +
 drivers/mailbox/gunyah-msgq.c                 | 214 +++++
 drivers/virt/Kconfig                          |   2 +
 drivers/virt/Makefile                         |   1 +
 drivers/virt/gunyah/Kconfig                   |  46 +
 drivers/virt/gunyah/Makefile                  |  11 +
 drivers/virt/gunyah/gunyah.c                  |  54 ++
 drivers/virt/gunyah/gunyah_ioeventfd.c        | 113 +++
 drivers/virt/gunyah/gunyah_irqfd.c            | 160 ++++
 drivers/virt/gunyah/gunyah_platform_hooks.c   |  80 ++
 drivers/virt/gunyah/gunyah_vcpu.c             | 463 ++++++++++
 drivers/virt/gunyah/rsc_mgr.c                 | 798 +++++++++++++++++
 drivers/virt/gunyah/rsc_mgr.h                 | 169 ++++
 drivers/virt/gunyah/rsc_mgr_rpc.c             | 419 +++++++++
 drivers/virt/gunyah/vm_mgr.c                  | 801 ++++++++++++++++++
 drivers/virt/gunyah/vm_mgr.h                  |  70 ++
 drivers/virt/gunyah/vm_mgr_mm.c               | 258 ++++++
 include/linux/gunyah.h                        | 198 +++++
 include/linux/gunyah_rsc_mgr.h                | 171 ++++
 include/linux/gunyah_vm_mgr.h                 | 119 +++
 include/uapi/linux/gunyah.h                   | 191 +++++
 samples/Kconfig                               |  10 +
 samples/Makefile                              |   1 +
 samples/gunyah/.gitignore                     |   2 +
 samples/gunyah/Makefile                       |   6 +
 samples/gunyah/gunyah_vmm.c                   | 270 ++++++
 samples/gunyah/sample_vm.dts                  |  68 ++
 40 files changed, 5445 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
 create mode 100644 Documentation/virt/gunyah/index.rst
 create mode 100644 Documentation/virt/gunyah/message-queue.rst
 create mode 100644 Documentation/virt/gunyah/vm-manager.rst
 create mode 100644 arch/arm64/gunyah/Makefile
 create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
 create mode 100644 arch/arm64/include/asm/gunyah.h
 create mode 100644 drivers/mailbox/gunyah-msgq.c
 create mode 100644 drivers/virt/gunyah/Kconfig
 create mode 100644 drivers/virt/gunyah/Makefile
 create mode 100644 drivers/virt/gunyah/gunyah.c
 create mode 100644 drivers/virt/gunyah/gunyah_ioeventfd.c
 create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c
 create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
 create mode 100644 drivers/virt/gunyah/gunyah_vcpu.c
 create mode 100644 drivers/virt/gunyah/rsc_mgr.c
 create mode 100644 drivers/virt/gunyah/rsc_mgr.h
 create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
 create mode 100644 drivers/virt/gunyah/vm_mgr.c
 create mode 100644 drivers/virt/gunyah/vm_mgr.h
 create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
 create mode 100644 include/linux/gunyah.h
 create mode 100644 include/linux/gunyah_rsc_mgr.h
 create mode 100644 include/linux/gunyah_vm_mgr.h
 create mode 100644 include/uapi/linux/gunyah.h
 create mode 100644 samples/gunyah/.gitignore
 create mode 100644 samples/gunyah/Makefile
 create mode 100644 samples/gunyah/gunyah_vmm.c
 create mode 100644 samples/gunyah/sample_vm.dts


base-commit: 3ebb0ac55efaf1d0fb1b106f852c114e5021f7eb

Elliot Berman Feb. 16, 2023, 5:20 p.m. UTC | #1

On 2/15/2023 10:35 PM, Greg Kroah-Hartman wrote:
> On Tue, Feb 14, 2023 at 01:24:26PM -0800, Elliot Berman wrote:
>> +	case GH_VM_SET_DTB_CONFIG: {
>> +		struct gh_vm_dtb_config dtb_config;
>> +
>> +		if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
>> +			return -EFAULT;
>> +
>> +		dtb_config.size = PAGE_ALIGN(dtb_config.size);
>> +		ghvm->dtb_config = dtb_config;
> 
> Do you really mean to copy this tiny structure twice (once from
> userspace and the second time off of the stack)?  If so, why?

Ah, yes this can be optimized to copy directly.
> 
> And where are the values of the structure checked for validity?  Can any
> 64bit value work for size and "gpa"?
> 

The values get checked when starting the VM

static int gh_vm_start(struct gh_vm *ghvm)
	...
	mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa, 
ghvm->dtb_config.size);
	if (!mapping) {
		pr_warn("Failed to find the memory_handle for DTB\n");
		ret = -EINVAL;
		goto err;
	}

If user passes an address that they've not set up, then 
gh_vm_mem_find_mapping returns NULL and GH_VM_START ioctl fails.

I've not done the check from the GH_VM_SET_DTB_CONFIG ioctl itself 
because I didn't want to require userspace to share the memory first. 
We'd need to check again anyway since user could SET_USER_MEMORY, 
SET_DTB_CONFIG, SET_USER_MEMORY (remove), VM_START.

Thanks,
Elliot

Srivatsa Vaddagiri Feb. 20, 2023, 9:15 a.m. UTC | #2

* Elliot Berman <quic_eberman@quicinc.com> [2023-02-14 13:24:26]:

>  static void gh_vm_free(struct work_struct *work)
>  {
>  	struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
>  	struct gh_vm_mem *mapping, *tmp;
>  	int ret;
>  
> -	mutex_lock(&ghvm->mm_lock);
> -	list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> -		gh_vm_mem_reclaim(ghvm, mapping);
> -		kfree(mapping);
> +	switch (ghvm->vm_status) {
> +unknown_state:
> +	case GH_RM_VM_STATUS_RUNNING:
> +		gh_vm_stop(ghvm);
> +		fallthrough;
> +	case GH_RM_VM_STATUS_INIT_FAILED:
> +	case GH_RM_VM_STATUS_LOAD:
> +	case GH_RM_VM_STATUS_LOAD_FAILED:
> +		mutex_lock(&ghvm->mm_lock);
> +		list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> +			gh_vm_mem_reclaim(ghvm, mapping);
> +			kfree(mapping);
> +		}
> +		mutex_unlock(&ghvm->mm_lock);
> +		fallthrough;
> +	case GH_RM_VM_STATUS_NO_STATE:
> +		ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> +		if (ret)
> +			pr_warn("Failed to deallocate vmid: %d\n", ret);
> +
> +		gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
> +		put_gh_rm(ghvm->rm);
> +		kfree(ghvm);
> +		break;
> +	default:
> +		pr_err("VM is unknown state: %d, assuming it's running.\n", ghvm->vm_status);
> +		goto unknown_state;

'goto unknown_state' here leads to a infinite loop AFAICS. For example consider
the case  where VM_START failed (due to mem_lend operation) causing VM state to
be GH_RM_VM_STATUS_RESET. A subsequent close(vmfd) can leads to that forever
loop.

//snip


> +static int gh_vm_start(struct gh_vm *ghvm)
> +{
> +	struct gh_vm_mem *mapping;
> +	u64 dtb_offset;
> +	u32 mem_handle;
> +	int ret;
> +
> +	down_write(&ghvm->status_lock);
> +	if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
> +		up_write(&ghvm->status_lock);
> +		return 0;
> +	}
> +
> +	ghvm->vm_status = GH_RM_VM_STATUS_RESET;
> +
> +	list_for_each_entry(mapping, &ghvm->memory_mappings, list) {

We don't seem to have the right lock here while walking the list.

Srivatsa Vaddagiri Feb. 20, 2023, 9:54 a.m. UTC | #3

* Srivatsa Vaddagiri <quic_svaddagi@quicinc.com> [2023-02-20 14:45:55]:

> * Elliot Berman <quic_eberman@quicinc.com> [2023-02-14 13:24:26]:
> 
> >  static void gh_vm_free(struct work_struct *work)
> >  {
> >  	struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> >  	struct gh_vm_mem *mapping, *tmp;
> >  	int ret;
> >  
> > -	mutex_lock(&ghvm->mm_lock);
> > -	list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> > -		gh_vm_mem_reclaim(ghvm, mapping);
> > -		kfree(mapping);
> > +	switch (ghvm->vm_status) {
> > +unknown_state:
> > +	case GH_RM_VM_STATUS_RUNNING:
> > +		gh_vm_stop(ghvm);
> > +		fallthrough;
> > +	case GH_RM_VM_STATUS_INIT_FAILED:
> > +	case GH_RM_VM_STATUS_LOAD:
> > +	case GH_RM_VM_STATUS_LOAD_FAILED:
> > +		mutex_lock(&ghvm->mm_lock);
> > +		list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> > +			gh_vm_mem_reclaim(ghvm, mapping);
> > +			kfree(mapping);
> > +		}
> > +		mutex_unlock(&ghvm->mm_lock);
> > +		fallthrough;
> > +	case GH_RM_VM_STATUS_NO_STATE:
> > +		ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> > +		if (ret)
> > +			pr_warn("Failed to deallocate vmid: %d\n", ret);
> > +
> > +		gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
> > +		put_gh_rm(ghvm->rm);
> > +		kfree(ghvm);
> > +		break;
> > +	default:
> > +		pr_err("VM is unknown state: %d, assuming it's running.\n", ghvm->vm_status);
> > +		goto unknown_state;
> 
> 'goto unknown_state' here leads to a infinite loop AFAICS. For example consider
> the case  where VM_START failed (due to mem_lend operation) causing VM state to
> be GH_RM_VM_STATUS_RESET. A subsequent close(vmfd) can leads to that forever
> loop.

Hmm ..that's not a good example perhaps (VM state is set to
GH_RM_VM_STATUS_INIT_FAILED in failed case). Nevertheless I think we should
avoid the goto in case of unknown state.


- vatsa

Srinivas Kandagatla Feb. 20, 2023, 1:59 p.m. UTC | #4

On 14/02/2023 21:23, Elliot Berman wrote:
> Gunyah message queues are a unidirectional inter-VM pipe for messages up
> to 1024 bytes. This driver supports pairing a receiver message queue and
> a transmitter message queue to expose a single mailbox channel.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   Documentation/virt/gunyah/message-queue.rst |   8 +
>   drivers/mailbox/Makefile                    |   2 +
>   drivers/mailbox/gunyah-msgq.c               | 214 ++++++++++++++++++++
>   include/linux/gunyah.h                      |  56 +++++
>   4 files changed, 280 insertions(+)
>   create mode 100644 drivers/mailbox/gunyah-msgq.c
> 
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> index 0667b3eb1ff9..082085e981e0 100644
> --- a/Documentation/virt/gunyah/message-queue.rst
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
>         |               |         |                 |         |               |
>         |               |         |                 |         |               |
>         +---------------+         +-----------------+         +---------------+
> +
> +Gunyah message queues are exposed as mailboxes. To create the mailbox, create
> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY interrupt,
> +all messages in the RX message queue are read and pushed via the `rx_callback`
> +of the registered mbox_client.
> +
> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
> +   :identifiers: gh_msgq_init
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index fc9376117111..5f929bb55e9a 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)	+= mtk-cmdq-mailbox.o
>   
>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)	+= zynqmp-ipi-mailbox.o
>   
> +obj-$(CONFIG_GUNYAH)		+= gunyah-msgq.o

Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not 
CONFIG_GUNYAH_MBOX?

> +
>   obj-$(CONFIG_SUN6I_MSGBOX)	+= sun6i-msgbox.o
>   
>   obj-$(CONFIG_SPRD_MBOX)		+= sprd-mailbox.o
> diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
> new file mode 100644
> index 000000000000..03ffaa30ce9b
> --- /dev/null
> +++ b/drivers/mailbox/gunyah-msgq.c
> @@ -0,0 +1,214 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/gunyah.h>
> +#include <linux/printk.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/wait.h>

...

> +/* Fired when message queue transitions from "full" to "space available" to send messages */
> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
> +{
> +	struct gh_msgq *msgq = data;
> +
> +	mbox_chan_txdone(gh_msgq_chan(msgq), 0);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +/* Fired after sending message and hypercall told us there was more space available. */
> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)

Tasklets have been long deprecated, consider using workqueues in this 
particular case.


> +{
> +	struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
> +
> +	mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
> +}
> +
> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
> +{
..

> +	tasklet_schedule(&msgq->txdone_tasklet);
> +
> +	return 0;
> +}
> +
> +static struct mbox_chan_ops gh_msgq_ops = {
> +	.send_data = gh_msgq_send_data,
> +};
> +
> +/**
> + * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
> + * @parent: optional, device parent used for the mailbox controller
> + * @msgq: Pointer to the gh_msgq to initialize
> + * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
> + * @tx_ghrsc: optional, the transmission side of the message queue
> + * @rx_ghrsc: optional, the receiving side of the message queue
> + *
> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most message queue use cases come with
> + * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
> + * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
> + * is set, the mbox_client should register an .rx_callback() and the message queue driver will
> + * push all available messages upon receiving the RX ready interrupt. The messages should be
> + * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
> + * after the callback.
> + *
> + * Returns - 0 on success, negative otherwise
> + */
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> +		     struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc)
> +{
> +	int ret;
> +
> +	/* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
> +	if ((!tx_ghrsc && !rx_ghrsc) ||
> +	    (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
> +	    (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
> +		return -EINVAL;
> +
> +	if (gh_api_version() != GUNYAH_API_V1) {
> +		pr_err("Unrecognized gunyah version: %u. Currently supported: %d\n",
dev_err(parent

would make this more useful

> +			gh_api_version(), GUNYAH_API_V1);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
> +		return -EOPNOTSUPP;
> +
> +	msgq->tx_ghrsc = tx_ghrsc;
> +	msgq->rx_ghrsc = rx_ghrsc;
> +
> +	msgq->mbox.dev = parent;
> +	msgq->mbox.ops = &gh_msgq_ops;
> +	msgq->mbox.num_chans = 1;
> +	msgq->mbox.txdone_irq = true;
> +	msgq->mbox.chans = kcalloc(msgq->mbox.num_chans, sizeof(*msgq->mbox.chans), GFP_KERNEL);
> +	if (!msgq->mbox.chans)
> +		return -ENOMEM;
> +
> +	if (msgq->tx_ghrsc) {
> +		ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
> +				msgq);
> +		if (ret)
> +			goto err_chans;
> +	}
> +
> +	if (msgq->rx_ghrsc) {
> +		ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
> +						IRQF_ONESHOT, "gh_msgq_rx", msgq);
> +		if (ret)
> +			goto err_tx_irq;
> +	}
> +
> +	tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
> +
> +	ret = mbox_controller_register(&msgq->mbox);
> +	if (ret)
> +		goto err_rx_irq;
> +
> +	ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
> +	if (ret)
> +		goto err_mbox;
> +
> +	return 0;
> +err_mbox:
> +	mbox_controller_unregister(&msgq->mbox);
> +err_rx_irq:
> +	if (msgq->rx_ghrsc)
> +		free_irq(msgq->rx_ghrsc->irq, msgq);
> +err_tx_irq:
> +	if (msgq->tx_ghrsc)
> +		free_irq(msgq->tx_ghrsc->irq, msgq);
> +err_chans:
> +	kfree(msgq->mbox.chans);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_init);
> +
> +void gh_msgq_remove(struct gh_msgq *msgq)
> +{
> +	mbox_controller_unregister(&msgq->mbox);
> +
> +	if (msgq->rx_ghrsc)
> +		free_irq(msgq->rx_ghrsc->irq, msgq);
> +
> +	if (msgq->tx_ghrsc)
> +		free_irq(msgq->tx_ghrsc->irq, msgq);
> +
> +	kfree(msgq->mbox.chans);
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");

Srinivas Kandagatla Feb. 20, 2023, 1:59 p.m. UTC | #5

minor nits below,

On 14/02/2023 21:12, Elliot Berman wrote:
> Add hypercalls to identify when Linux is running a virtual machine under
> Gunyah.
> 
> There are two calls to help identify Gunyah:
> 
> 1. gh_hypercall_get_uid() returns a UID when running under a Gunyah
>     hypervisor.
> 2. gh_hypercall_hyp_identify() returns build information and a set of
>     feature flags that are supported by Gunyah.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   arch/arm64/Kbuild                    |  1 +
>   arch/arm64/gunyah/Makefile           |  3 ++
>   arch/arm64/gunyah/gunyah_hypercall.c | 61 ++++++++++++++++++++++++++++
>   drivers/virt/Kconfig                 |  2 +
>   drivers/virt/gunyah/Kconfig          | 13 ++++++
>   include/linux/gunyah.h               | 33 +++++++++++++++
>   6 files changed, 113 insertions(+)
>   create mode 100644 arch/arm64/gunyah/Makefile
>   create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
>   create mode 100644 drivers/virt/gunyah/Kconfig
> 
> diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild
> index 5bfbf7d79c99..e4847ba0e3c9 100644
> --- a/arch/arm64/Kbuild
> +++ b/arch/arm64/Kbuild
> @@ -3,6 +3,7 @@ obj-y			+= kernel/ mm/ net/
>   obj-$(CONFIG_KVM)	+= kvm/
>   obj-$(CONFIG_XEN)	+= xen/
>   obj-$(subst m,y,$(CONFIG_HYPERV))	+= hyperv/
> +obj-$(CONFIG_GUNYAH)	+= gunyah/
>   obj-$(CONFIG_CRYPTO)	+= crypto/
>   
>   # for cleaning
> diff --git a/arch/arm64/gunyah/Makefile b/arch/arm64/gunyah/Makefile
> new file mode 100644
> index 000000000000..84f1e38cafb1
> --- /dev/null
> +++ b/arch/arm64/gunyah/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +obj-$(CONFIG_GUNYAH) += gunyah_hypercall.o
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> new file mode 100644
> index 000000000000..f30d06ee80cf
> --- /dev/null
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -0,0 +1,61 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/arm-smccc.h>
> +#include <linux/module.h>
> +#include <linux/gunyah.h>
> +
> +static const uint32_t gunyah_known_uuids[][4] = {
> +	{0x19bd54bd, 0x0b37571b, 0x946f609b, 0x54539de6}, /* QC_HYP (Qualcomm's build) */
> +	{0x673d5f14, 0x9265ce36, 0xa4535fdb, 0xc1d58fcd}, /* GUNYAH (open source build) */
> +};
> +
> +bool arch_is_gunyah_guest(void)
> +{
> +	struct arm_smccc_res res;
> +	u32 uid[4];
> +	int i;
> +
> +	arm_smccc_1_1_hvc(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
> +
> +	uid[0] = lower_32_bits(res.a0);
> +	uid[1] = lower_32_bits(res.a1);
> +	uid[2] = lower_32_bits(res.a2);
> +	uid[3] = lower_32_bits(res.a3);
> +
> +	for (i = 0; i < ARRAY_SIZE(gunyah_known_uuids); i++)
> +		if (!memcmp(uid, gunyah_known_uuids[i], sizeof(uid)))
> +			break;
> +
> +	return i != ARRAY_SIZE(gunyah_known_uuids);

you could probably make this more readable by:


for (i = 0; i < ARRAY_SIZE(gunyah_known_uuids); i++)
	if (!memcmp(uid, gunyah_known_uuids[i], sizeof(uid)))
		return true;

return false;

> +}
> +EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
> +
> +#define GH_HYPERCALL(fn)	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_64, \
> +						   ARM_SMCCC_OWNER_VENDOR_HYP, \
> +						   fn)
> +
> +#define GH_HYPERCALL_HYP_IDENTIFY		GH_HYPERCALL(0x8000)
> +
> +/**
> + * gh_hypercall_hyp_identify() - Returns build information and feature flags
> + *                               supported by Gunyah.
> + * @hyp_identity: filled by the hypercall with the API info and feature flags.
> + */
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_hvc(GH_HYPERCALL_HYP_IDENTIFY, &res);
> +
> +	hyp_identity->api_info = res.a0;
> +	hyp_identity->flags[0] = res.a1;
> +	hyp_identity->flags[1] = res.a2;
> +	hyp_identity->flags[2] = res.a3;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
> index f79ab13a5c28..85bd6626ffc9 100644
> --- a/drivers/virt/Kconfig
> +++ b/drivers/virt/Kconfig
> @@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig"
>   
>   source "drivers/virt/coco/tdx-guest/Kconfig"
>   
> +source "drivers/virt/gunyah/Kconfig"
> +
>   endif
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> new file mode 100644
> index 000000000000..1a737694c333
> --- /dev/null
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -0,0 +1,13 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +config GUNYAH
> +	tristate "Gunyah Virtualization drivers"
> +	depends on ARM64
> +	depends on MAILBOX
> +	help
> +	  The Gunyah drivers are the helper interfaces that run in a guest VM
> +	  such as basic inter-VM IPC and signaling mechanisms, and higher level
> +	  services such as memory/device sharing, IRQ sharing, and so on.
> +
> +	  Say Y/M here to enable the drivers needed to interact in a Gunyah
> +	  virtual environment.
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 59ef4c735ae8..3fef2854c5e1 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -6,8 +6,10 @@
>   #ifndef _LINUX_GUNYAH_H
>   #define _LINUX_GUNYAH_H
>   
> +#include <linux/bitfield.h>
>   #include <linux/errno.h>
>   #include <linux/limits.h>
> +#include <linux/types.h>
>   
>   /******************************************************************************/
>   /* Common arch-independent definitions for Gunyah hypercalls                  */
> @@ -79,4 +81,35 @@ static inline int gh_remap_error(enum gh_error gh_error)
>   	}
>   }
>   
> +enum gh_api_feature {
> +	GH_API_FEATURE_DOORBELL,
> +	GH_API_FEATURE_MSGQUEUE,
> +	GH_API_FEATURE_VCPU,
> +	GH_API_FEATURE_MEMEXTENT,
> +};
> +
> +bool arch_is_gunyah_guest(void);
> +
> +u16 gh_api_version(void);
> +bool gh_api_has_feature(enum gh_api_feature feature);
gh_api_has_feature or arch_is_gunyah_guest is in this patch, this should 
probably moved to the respecitive patch that implements these functions.

--srini
> +
> +#define GUNYAH_API_V1			1
> +
> +#define GH_API_INFO_API_VERSION_MASK	GENMASK_ULL(13, 0)
> +#define GH_API_INFO_BIG_ENDIAN		BIT_ULL(14)
> +#define GH_API_INFO_IS_64BIT		BIT_ULL(15)
> +#define GH_API_INFO_VARIANT_MASK	GENMASK_ULL(63, 56)
> +
> +#define GH_IDENTIFY_DOORBELL			BIT_ULL(1)
> +#define GH_IDENTIFY_MSGQUEUE			BIT_ULL(2)
> +#define GH_IDENTIFY_VCPU			BIT_ULL(5)
> +#define GH_IDENTIFY_MEMEXTENT			BIT_ULL(6)
> +
> +struct gh_hypercall_hyp_identify_resp {
> +	u64 api_info;
> +	u64 flags[3];
> +};
> +
> +void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
> +
>   #endif

Srinivas Kandagatla Feb. 21, 2023, 7:50 a.m. UTC | #6

On 14/02/2023 21:23, Elliot Berman wrote:
> 
> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   drivers/virt/gunyah/Makefile      |   2 +-
>   drivers/virt/gunyah/rsc_mgr.h     |  45 ++++++
>   drivers/virt/gunyah/rsc_mgr_rpc.c | 226 ++++++++++++++++++++++++++++++
>   include/linux/gunyah_rsc_mgr.h    |  73 ++++++++++
>   4 files changed, 345 insertions(+), 1 deletion(-)
>   create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
> 
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index cc864ff5abbb..de29769f2f3f 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>   
>   obj-$(CONFIG_GUNYAH) += gunyah.o
>   
> -gunyah_rsc_mgr-y += rsc_mgr.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
>   obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index d4e799a7526f..7406237bc66d 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -74,4 +74,49 @@ struct gh_rm;
>   int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
>   		void **resp_buf, size_t *resp_buff_size);
>   
<----------------------------
> +/* Message IDs: VM Management */
> +#define GH_RM_RPC_VM_ALLOC_VMID			0x56000001
> +#define GH_RM_RPC_VM_DEALLOC_VMID		0x56000002
> +#define GH_RM_RPC_VM_START			0x56000004
> +#define GH_RM_RPC_VM_STOP			0x56000005
> +#define GH_RM_RPC_VM_RESET			0x56000006
> +#define GH_RM_RPC_VM_CONFIG_IMAGE		0x56000009
> +#define GH_RM_RPC_VM_INIT			0x5600000B
> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES		0x56000020
> +#define GH_RM_RPC_VM_GET_VMID			0x56000024
> +
> +struct gh_rm_vm_common_vmid_req {
> +	__le16 vmid;
> +	__le16 reserved0;
> +} __packed;
> +
> +/* Call: VM_ALLOC */
> +struct gh_rm_vm_alloc_vmid_resp {
> +	__le16 vmid;
> +	__le16 reserved0;
> +} __packed;
> +
> +/* Call: VM_STOP */
> +struct gh_rm_vm_stop_req {
> +	__le16 vmid;
> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP	BIT(0)
> +	u8 flags;
> +	u8 reserved;
> +#define GH_RM_VM_STOP_REASON_FORCE_STOP		3
> +	__le32 stop_reason;
> +} __packed;
> +
> +/* Call: VM_CONFIG_IMAGE */
> +struct gh_rm_vm_config_image_req {
> +	__le16 vmid;
> +	__le16 auth_mech;
> +	__le32 mem_handle;
> +	__le64 image_offset;
> +	__le64 image_size;
> +	__le64 dtb_offset;
> +	__le64 dtb_size;
> +} __packed;
> +
> +/* Call: GET_HYP_RESOURCES */
> +
-------------------------------->

All the above structures are very much internal to rsc_mgr_rpc.c and 
interface to the rsc_mgr_rpc is already abstracted with function arguments

ex:

int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum 
gh_rm_vm_auth_mechanism auth_mechanism, u32 mem_handle, u64 
image_offset, u64 image_size, u64 dtb_offset, u64 dtb_size)

So why do we need these structs and defines in header file at all?
you should proabably consider moving them to the .c file.


>   #endif
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> new file mode 100644
> index 000000000000..4515cdd80106
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -0,0 +1,226 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +

Why new line here?

> +#include "rsc_mgr.h"
> +
> +/*
...

> +int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
> +		u32 mem_handle, u64 image_offset, u64 image_size, u64 dtb_offset, u64 dtb_size)
> +{
> +	struct gh_rm_vm_config_image_req req_payload = { 0 };
> +	size_t resp_size;
> +	void *resp;
> +
> +	req_payload.vmid = cpu_to_le16(vmid);
> +	req_payload.auth_mech = cpu_to_le16(auth_mechanism);
> +	req_payload.mem_handle = cpu_to_le32(mem_handle);
> +	req_payload.image_offset = cpu_to_le64(image_offset);
> +	req_payload.image_size = cpu_to_le64(image_size);
> +	req_payload.dtb_offset = cpu_to_le64(dtb_offset);
> +	req_payload.dtb_size = cpu_to_le64(dtb_size);
> +
> +	return gh_rm_call(rm, GH_RM_RPC_VM_CONFIG_IMAGE, &req_payload, sizeof(req_payload),
> +			&resp, &resp_size);
> +}
> +

--srini

Srinivas Kandagatla Feb. 21, 2023, 10:46 a.m. UTC | #7

On 14/02/2023 21:23, Elliot Berman wrote:
> 
> Gunyah VM manager is a kernel moduel which exposes an interface to
> Gunyah userspace to load, run, and interact with other Gunyah virtual
> machines. The interface is a character device at /dev/gunyah.
> 
> Add a basic VM manager driver. Upcoming patches will add more ioctls
> into this driver.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   .../userspace-api/ioctl/ioctl-number.rst      |   1 +
>   drivers/virt/gunyah/Makefile                  |   2 +-
>   drivers/virt/gunyah/rsc_mgr.c                 |  37 +++++-
>   drivers/virt/gunyah/vm_mgr.c                  | 118 ++++++++++++++++++
>   drivers/virt/gunyah/vm_mgr.h                  |  22 ++++
>   include/uapi/linux/gunyah.h                   |  23 ++++
>   6 files changed, 201 insertions(+), 2 deletions(-)
>   create mode 100644 drivers/virt/gunyah/vm_mgr.c
>   create mode 100644 drivers/virt/gunyah/vm_mgr.h
>   create mode 100644 include/uapi/linux/gunyah.h
> 
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 0a1882e296ae..2513324ae7be 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -137,6 +137,7 @@ Code  Seq#    Include File                                           Comments
>   'F'   DD     video/sstfb.h                                           conflict!
>   'G'   00-3F  drivers/misc/sgi-gru/grulib.h                           conflict!
>   'G'   00-0F  xen/gntalloc.h, xen/gntdev.h                            conflict!
> +'G'   00-0f  linux/gunyah.h                                          conflict!
>   'H'   00-7F  linux/hiddev.h                                          conflict!
>   'H'   00-0F  linux/hidraw.h                                          conflict!
>   'H'   01     linux/mei.h                                             conflict!
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index de29769f2f3f..03951cf82023 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -2,5 +2,5 @@
>   
>   obj-$(CONFIG_GUNYAH) += gunyah.o
>   
> -gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
>   obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> index 2a47139873a8..73c5a6b7cbbc 100644
> --- a/drivers/virt/gunyah/rsc_mgr.c
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -16,8 +16,10 @@
>   #include <linux/completion.h>
>   #include <linux/gunyah_rsc_mgr.h>
>   #include <linux/platform_device.h>
> +#include <linux/miscdevice.h>
>   
>   #include "rsc_mgr.h"
> +#include "vm_mgr.h"
>   
>   #define RM_RPC_API_VERSION_MASK		GENMASK(3, 0)
>   #define RM_RPC_HEADER_WORDS_MASK	GENMASK(7, 4)
> @@ -103,6 +105,8 @@ struct gh_rm {
>   	struct kmem_cache *cache;
>   	struct mutex send_lock;
>   	struct blocking_notifier_head nh;
> +
> +	struct miscdevice miscdev;
>   };
>   
>   static struct gh_rm_connection *gh_rm_alloc_connection(__le32 msg_id, u8 type)
> @@ -509,6 +513,21 @@ void put_gh_rm(struct gh_rm *rm)
>   }
>   EXPORT_SYMBOL_GPL(put_gh_rm);
>   
> +static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> +	struct miscdevice *miscdev = filp->private_data;
> +	struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
> +
> +	return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
> +}
> +
> +static const struct file_operations gh_dev_fops = {
> +	.owner		= THIS_MODULE,
> +	.unlocked_ioctl	= gh_dev_ioctl,
> +	.compat_ioctl	= compat_ptr_ioctl,
> +	.llseek		= noop_llseek,
> +};
> +
>   static int gh_msgq_platform_probe_direction(struct platform_device *pdev,
>   					bool tx, int idx, struct gunyah_resource *ghrsc)
>   {
> @@ -567,7 +586,22 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
>   	rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
>   	rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>   
> -	return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> +	ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> +	if (ret)
> +		goto err_cache;
> +
> +	rm->miscdev.name = "gunyah";
> +	rm->miscdev.minor = MISC_DYNAMIC_MINOR;
> +	rm->miscdev.fops = &gh_dev_fops;
> +
> +	ret = misc_register(&rm->miscdev);
> +	if (ret)
> +		goto err_msgq;
> +
> +	return 0;
> +err_msgq:
> +	mbox_free_channel(gh_msgq_chan(&rm->msgq));
> +	gh_msgq_remove(&rm->msgq);
>   err_cache:
>   	kmem_cache_destroy(rm->cache);
>   	return ret;
> @@ -577,6 +611,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
>   {
>   	struct gh_rm *rm = platform_get_drvdata(pdev);
>   
> +	misc_deregister(&rm->miscdev);
>   	mbox_free_channel(gh_msgq_chan(&rm->msgq));
>   	gh_msgq_remove(&rm->msgq);
>   	kmem_cache_destroy(rm->cache);
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> new file mode 100644
> index 000000000000..fd890a57172e
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -0,0 +1,118 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/file.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static void gh_vm_free(struct work_struct *work)
> +{
> +	struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
> +	int ret;
> +
> +	ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> +	if (ret)
> +		pr_warn("Failed to deallocate vmid: %d\n", ret);
> +
> +	put_gh_rm(ghvm->rm);
> +	kfree(ghvm);
> +}
> +
> +static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> +{
> +	struct gh_vm *ghvm;
> +	int vmid;
> +
> +	vmid = gh_rm_alloc_vmid(rm, 0);
> +	if (vmid < 0)
> +		return ERR_PTR(vmid);
> +
> +	ghvm = kzalloc(sizeof(*ghvm), GFP_KERNEL);
> +	if (!ghvm) {
> +		gh_rm_dealloc_vmid(rm, vmid);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +	get_gh_rm(rm);
> +
> +	ghvm->vmid = vmid;
> +	ghvm->rm = rm;
> +
> +	INIT_WORK(&ghvm->free_work, gh_vm_free);
> +
> +	return ghvm;
> +}
> +
> +static int gh_vm_release(struct inode *inode, struct file *filp)
> +{
> +	struct gh_vm *ghvm = filp->private_data;
> +
> +	/* VM will be reset and make RM calls which can interruptible sleep.
> +	 * Defer to a work so this thread can receive signal.
> +	 */
> +	schedule_work(&ghvm->free_work);
> +	return 0;
> +}
> +
> +static const struct file_operations gh_vm_fops = {
> +	.release = gh_vm_release,

> +	.compat_ioctl	= compat_ptr_ioctl,

This line should go with the patch that adds real ioctl

> +	.llseek = noop_llseek,
> +};
> +
> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
Not sure what is the gain of this multiple levels of redirection.

How about

long gh_dev_create_vm(struct gh_rm *rm, unsigned long arg)
{
...
}

and rsc_mgr just call it as part of its ioctl call

static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned 
long arg)
{
	struct miscdevice *miscdev = filp->private_data;
	struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);

	switch (cmd) {
	case GH_CREATE_VM:
		return gh_dev_create_vm(rm, arg);
	default:
		return -ENOIOCTLCMD;
	}
}


> +{
> +	struct gh_vm *ghvm;
> +	struct file *file;
> +	int fd, err;
> +
> +	/* arg reserved for future use. */
> +	if (arg)
> +		return -EINVAL;

The only code path I see here is via GH_CREATE_VM ioctl which obviously 
does not take any arguments, so if you are thinking of using the 
argument for architecture-specific VM flags.  Then this needs to be 
properly done by making the ABI aware of this.

As you mentioned zero value arg imply an "unauthenticated VM" type, but 
this was not properly encoded in the userspace ABI. Why not make it 
future compatible. How about adding arguments to GH_CREATE_VM and pass 
the required information correctly.
Note that once the ABI is accepted then you will not be able to change 
it, other than adding a new one.

> +
> +	ghvm = gh_vm_alloc(rm);
> +	if (IS_ERR(ghvm))
> +		return PTR_ERR(ghvm);
> +
> +	fd = get_unused_fd_flags(O_CLOEXEC);
> +	if (fd < 0) {
> +		err = fd;
> +		goto err_destroy_vm;
> +	}
> +
> +	file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
> +	if (IS_ERR(file)) {
> +		err = PTR_ERR(file);
> +		goto err_put_fd;
> +	}
> +
> +	fd_install(fd, file);
> +
> +	return fd;
> +
> +err_put_fd:
> +	put_unused_fd(fd);
> +err_destroy_vm:
> +	kfree(ghvm);
> +	return err;
> +}
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg)
> +{
> +	switch (cmd) {
> +	case GH_CREATE_VM:
> +		return gh_dev_ioctl_create_vm(rm, arg);
> +	default:
> +		return -ENOIOCTLCMD;
> +	}
> +}
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> new file mode 100644
> index 000000000000..76954da706e9
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GH_PRIV_VM_MGR_H
> +#define _GH_PRIV_VM_MGR_H
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
> +
> +struct gh_vm {
> +	u16 vmid;
> +	struct gh_rm *rm;
> +
> +	struct work_struct free_work;
> +};
> +
> +#endif
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> new file mode 100644
> index 000000000000..10ba32d2b0a6
> --- /dev/null
> +++ b/include/uapi/linux/gunyah.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _UAPI_LINUX_GUNYAH
> +#define _UAPI_LINUX_GUNYAH
> +
> +/*
> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
> + */
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +#define GH_IOCTL_TYPE			'G'
> +
> +/*
> + * ioctls for /dev/gunyah fds:
> + */
> +#define GH_CREATE_VM			_IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */

Can HLOS forcefully destroy a VM?
If so should we have a corresponding DESTROY IOCTL?

--srini

> +
> +#endif

Srivatsa Vaddagiri Feb. 21, 2023, 1:06 p.m. UTC | #8

* Elliot Berman <quic_eberman@quicinc.com> [2023-02-14 13:24:26]:

> +static int gh_vm_start(struct gh_vm *ghvm)
> +{
> +	struct gh_vm_mem *mapping;
> +	u64 dtb_offset;
> +	u32 mem_handle;
> +	int ret;
> +
> +	down_write(&ghvm->status_lock);
> +	if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
> +		up_write(&ghvm->status_lock);
> +		return 0;
> +	}
> +
> +	ghvm->vm_status = GH_RM_VM_STATUS_RESET;
> +
> +	list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
> +		switch (mapping->share_type) {
> +		case VM_MEM_LEND:
> +			ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
> +			break;
> +		case VM_MEM_SHARE:
> +			ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
> +			break;
> +		}
> +		if (ret) {
> +			pr_warn("Failed to %s parcel %d: %d\n",
> +				mapping->share_type == VM_MEM_LEND ? "lend" : "share",
> +				mapping->parcel.label,
> +				ret);
> +			goto err;
> +		}
> +	}
> +
> +	mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa, ghvm->dtb_config.size);

It may be some optimization to derive DTB 'mapping' in the first loop you have
above (that lends/shares all mappings)


> +	if (!mapping) {
> +		pr_warn("Failed to find the memory_handle for DTB\n");
> +		ret = -EINVAL;
> +		goto err;
> +	}

Srivatsa Vaddagiri Feb. 21, 2023, 1:06 p.m. UTC | #9

* Elliot Berman <quic_eberman@quicinc.com> [2023-02-14 13:23:54]:

> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
> +{
> +	struct gh_vm *ghvm;
> +	struct file *file;
> +	int fd, err;
> +
> +	/* arg reserved for future use. */
> +	if (arg)
> +		return -EINVAL;
> +
> +	ghvm = gh_vm_alloc(rm);
> +	if (IS_ERR(ghvm))
> +		return PTR_ERR(ghvm);
> +
> +	fd = get_unused_fd_flags(O_CLOEXEC);
> +	if (fd < 0) {
> +		err = fd;
> +		goto err_destroy_vm;
> +	}
> +
> +	file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
> +	if (IS_ERR(file)) {
> +		err = PTR_ERR(file);
> +		goto err_put_fd;
> +	}
> +
> +	fd_install(fd, file);
> +
> +	return fd;
> +
> +err_put_fd:
> +	put_unused_fd(fd);
> +err_destroy_vm:
> +	kfree(ghvm);

Need a put_gh_rm() also in this case

> +	return err;
> +}

Srinivas Kandagatla Feb. 21, 2023, 2:17 p.m. UTC | #10

On 14/02/2023 21:24, Elliot Berman wrote:
> 
> Add remaining ioctls to support non-proxy VM boot:
> 
>   - Gunyah Resource Manager uses the VM's devicetree to configure the
>     virtual machine. The location of the devicetree in the guest's
>     virtual memory can be declared via the SET_DTB_CONFIG ioctl.
>   - Trigger start of the virtual machine with VM_START ioctl.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   drivers/virt/gunyah/vm_mgr.c    | 229 ++++++++++++++++++++++++++++++--
>   drivers/virt/gunyah/vm_mgr.h    |  10 ++
>   drivers/virt/gunyah/vm_mgr_mm.c |  23 ++++
>   include/linux/gunyah_rsc_mgr.h  |   6 +
>   include/uapi/linux/gunyah.h     |  13 ++
>   5 files changed, 268 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index 84102bac03cc..fa324385ade5 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -9,37 +9,114 @@
>   #include <linux/file.h>
>   #include <linux/gunyah_rsc_mgr.h>
>   #include <linux/miscdevice.h>
> +#include <linux/mm.h>
>   #include <linux/module.h>
>   
>   #include <uapi/linux/gunyah.h>
>   
>   #include "vm_mgr.h"
>   
> +static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
> +{
> +	struct gh_rm_vm_status_payload *payload = data;
> +
> +	if (payload->vmid != ghvm->vmid)
> +		return NOTIFY_OK;
Is this even possible? If yes, then this is a bug somewhere, we should 
not be getting notifications for something that does not belong to this vm.
What is the typical case for such behavior? comment would be useful.


> +
> +	/* All other state transitions are synchronous to a corresponding RM call */
> +	if (payload->vm_status == GH_RM_VM_STATUS_RESET) {
> +		down_write(&ghvm->status_lock);
> +		ghvm->vm_status = payload->vm_status;
> +		up_write(&ghvm->status_lock);
> +		wake_up(&ghvm->vm_status_wait);
> +	}
> +
> +	return NOTIFY_DONE;
> +}
> +
> +static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
> +{
> +	struct gh_rm_vm_exited_payload *payload = data;
> +
> +	if (payload->vmid != ghvm->vmid)
> +		return NOTIFY_OK;
same

> +
> +	down_write(&ghvm->status_lock);
> +	ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
> +	up_write(&ghvm->status_lock);
> +
> +	return NOTIFY_DONE;
> +}
> +
> +static int gh_vm_rm_notification(struct notifier_block *nb, unsigned long action, void *data)
> +{
> +	struct gh_vm *ghvm = container_of(nb, struct gh_vm, nb);
> +
> +	switch (action) {
> +	case GH_RM_NOTIFICATION_VM_STATUS:
> +		return gh_vm_rm_notification_status(ghvm, data);
> +	case GH_RM_NOTIFICATION_VM_EXITED:
> +		return gh_vm_rm_notification_exited(ghvm, data);
> +	default:
> +		return NOTIFY_OK;
> +	}
> +}
> +
> +static void gh_vm_stop(struct gh_vm *ghvm)
> +{
> +	int ret;
> +
> +	down_write(&ghvm->status_lock);
> +	if (ghvm->vm_status == GH_RM_VM_STATUS_RUNNING) {
> +		ret = gh_rm_vm_stop(ghvm->rm, ghvm->vmid);
> +		if (ret)
> +			pr_warn("Failed to stop VM: %d\n", ret);
Should we not bail out from this fail path?


> +	}
> +
> +	ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
> +	up_write(&ghvm->status_lock);
> +}
> +
>   static void gh_vm_free(struct work_struct *work)
>   {
>   	struct gh_vm *ghvm = container_of(work, struct gh_vm, free_work);
>   	struct gh_vm_mem *mapping, *tmp;
>   	int ret;
>   
> -	mutex_lock(&ghvm->mm_lock);
> -	list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> -		gh_vm_mem_reclaim(ghvm, mapping);
> -		kfree(mapping);
> +	switch (ghvm->vm_status) {
> +unknown_state:

Never seen this style of using goto from switch to a new label in 
switch case. Am sure this is some kinda trick but its not helping readers.

Can we rewrite this using a normal semantics.

may be a do while could help.


> +	case GH_RM_VM_STATUS_RUNNING:
> +		gh_vm_stop(ghvm);
> +		fallthrough;
> +	case GH_RM_VM_STATUS_INIT_FAILED:
> +	case GH_RM_VM_STATUS_LOAD:
> +	case GH_RM_VM_STATUS_LOAD_FAILED:
> +		mutex_lock(&ghvm->mm_lock);
> +		list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, list) {
> +			gh_vm_mem_reclaim(ghvm, mapping);
> +			kfree(mapping);
> +		}
> +		mutex_unlock(&ghvm->mm_lock);
> +		fallthrough;
> +	case GH_RM_VM_STATUS_NO_STATE:
> +		ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> +		if (ret)
> +			pr_warn("Failed to deallocate vmid: %d\n", ret);
> +
> +		gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
> +		put_gh_rm(ghvm->rm);
> +		kfree(ghvm);
> +		break;
> +	default:
> +		pr_err("VM is unknown state: %d, assuming it's running.\n", ghvm->vm_status);
vm_status did not change do we not endup here again?

> +		goto unknown_state;
>   	}
> -	mutex_unlock(&ghvm->mm_lock);
> -
> -	ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
> -	if (ret)
> -		pr_warn("Failed to deallocate vmid: %d\n", ret);
> -
> -	put_gh_rm(ghvm->rm);
> -	kfree(ghvm);
>   }
>   
>   static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
>   {
>   	struct gh_vm *ghvm;
> -	int vmid;
> +	int vmid, ret;
>   
>   	vmid = gh_rm_alloc_vmid(rm, 0);
>   	if (vmid < 0)
> @@ -56,13 +133,123 @@ static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
>   	ghvm->vmid = vmid;
>   	ghvm->rm = rm;
>   
> +	init_waitqueue_head(&ghvm->vm_status_wait);
> +	ghvm->nb.notifier_call = gh_vm_rm_notification;
> +	ret = gh_rm_notifier_register(rm, &ghvm->nb);
> +	if (ret) {
> +		put_gh_rm(rm);
> +		gh_rm_dealloc_vmid(rm, vmid);
> +		kfree(ghvm);
> +		return ERR_PTR(ret);
> +	}
> +
>   	mutex_init(&ghvm->mm_lock);
>   	INIT_LIST_HEAD(&ghvm->memory_mappings);
> +	init_rwsem(&ghvm->status_lock);
>   	INIT_WORK(&ghvm->free_work, gh_vm_free);
> +	ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>   
>   	return ghvm;
>   }
>   
> +static int gh_vm_start(struct gh_vm *ghvm)
> +{
> +	struct gh_vm_mem *mapping;
> +	u64 dtb_offset;
> +	u32 mem_handle;
> +	int ret;
> +
> +	down_write(&ghvm->status_lock);
> +	if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
> +		up_write(&ghvm->status_lock);
> +		return 0;
> +	}
> +
> +	ghvm->vm_status = GH_RM_VM_STATUS_RESET;
> +

<------
should we not take ghvm->mm_lock here to make sure that list is 
consistent while processing.
> +	list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
> +		switch (mapping->share_type) {
> +		case VM_MEM_LEND:
> +			ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
> +			break;
> +		case VM_MEM_SHARE:
> +			ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
> +			break;
> +		}
> +		if (ret) {
> +			pr_warn("Failed to %s parcel %d: %d\n",
> +				mapping->share_type == VM_MEM_LEND ? "lend" : "share",
> +				mapping->parcel.label,
> +				ret);
> +			goto err;
> +		}
> +	}
--->

> +
> +	mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa, ghvm->dtb_config.size);
> +	if (!mapping) {
> +		pr_warn("Failed to find the memory_handle for DTB\n");

What wil happen to the mappings that are lend or shared?

> +		ret = -EINVAL;
> +		goto err;
> +	}
> +
> +	mem_handle = mapping->parcel.mem_handle;
> +	dtb_offset = ghvm->dtb_config.gpa - mapping->guest_phys_addr;
> +
> +	ret = gh_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth, mem_handle,

where is authentication mechanism (auth) comming from? Who is supposed 
to set this value?

Should it come from userspace? if so I do not see any UAPI facility to 
do that via VM_START ioctl.


> +				0, 0, dtb_offset, ghvm->dtb_config.size);
> +	if (ret) {
> +		pr_warn("Failed to configure VM: %d\n", ret);
> +		goto err;
> +	}
> +
> +	ret = gh_rm_vm_init(ghvm->rm, ghvm->vmid);
> +	if (ret) {
> +		pr_warn("Failed to initialize VM: %d\n", ret);
> +		goto err;
> +	}
> +
> +	ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
> +	if (ret) {
> +		pr_warn("Failed to start VM: %d\n", ret);
> +		goto err;
> +	}
> +
> +	ghvm->vm_status = GH_RM_VM_STATUS_RUNNING;
> +	up_write(&ghvm->status_lock);
> +	return ret;
> +err:
> +	ghvm->vm_status = GH_RM_VM_STATUS_INIT_FAILED;
> +	up_write(&ghvm->status_lock);

Am really not sure if we are doing right thing in the error path, there 
are multiple cases that seems to be not handled or if it was not 
required no comments to clarify this are documented.
ex: if vm start fails then what happes with memory mapping or do we need 
to un-configure vm or un-init vm from hypervisor side?

if none of this is required its useful to add come clear comments.

> +	return ret;
> +}
> +
> +static int gh_vm_ensure_started(struct gh_vm *ghvm)
> +{
> +	int ret;
> +
> +retry:
> +	ret = down_read_interruptible(&ghvm->status_lock);
> +	if (ret)
> +		return ret;
> +
> +	/* Unlikely because VM is typically started */
> +	if (unlikely(ghvm->vm_status == GH_RM_VM_STATUS_LOAD)) {
> +		up_read(&ghvm->status_lock);
> +		ret = gh_vm_start(ghvm);
> +		if (ret)
> +			goto out;
> +		goto retry;
> +	}

do while will do better job here w.r.t to readablity.

> +
> +	/* Unlikely because VM is typically running */
> +	if (unlikely(ghvm->vm_status != GH_RM_VM_STATUS_RUNNING))
> +		ret = -ENODEV;
> +
> +out:
> +	up_read(&ghvm->status_lock);
> +	return ret;
> +}
> +
>   static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>   {
>   	struct gh_vm *ghvm = filp->private_data;
> @@ -88,6 +275,22 @@ static long gh_vm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
>   			r = gh_vm_mem_free(ghvm, region.label);
>   		break;
>   	}
> +	case GH_VM_SET_DTB_CONFIG: {
> +		struct gh_vm_dtb_config dtb_config;
> +
> +		if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
> +			return -EFAULT;
> +
> +		dtb_config.size = PAGE_ALIGN(dtb_config.size);
> +		ghvm->dtb_config = dtb_config;
> +
> +		r = 0;
> +		break;
> +	}
> +	case GH_VM_START: {
> +		r = gh_vm_ensure_started(ghvm);
> +		break;
> +	}
>   	default:
>   		r = -ENOTTY;
>   		break;
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> index 97bc00c34878..e9cf56647cc2 100644
> --- a/drivers/virt/gunyah/vm_mgr.h
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -10,6 +10,8 @@
>   #include <linux/list.h>
>   #include <linux/miscdevice.h>
>   #include <linux/mutex.h>
> +#include <linux/rwsem.h>
> +#include <linux/wait.h>
>   
>   #include <uapi/linux/gunyah.h>
>   
> @@ -33,6 +35,13 @@ struct gh_vm_mem {
>   struct gh_vm {
>   	u16 vmid;
>   	struct gh_rm *rm;
> +	enum gh_rm_vm_auth_mechanism auth;
> +	struct gh_vm_dtb_config dtb_config;
> +
> +	struct notifier_block nb;
> +	enum gh_rm_vm_status vm_status;
> +	wait_queue_head_t vm_status_wait;
> +	struct rw_semaphore status_lock;
>   
>   	struct work_struct free_work;
>   	struct mutex mm_lock;
> @@ -43,5 +52,6 @@ int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *regio
>   void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
>   int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
>   struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
> +struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, u32 size);
>   
>   #endif
> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
> index 03e71a36ea3b..128b90da555a 100644
> --- a/drivers/virt/gunyah/vm_mgr_mm.c
> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
> @@ -52,6 +52,29 @@ void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping)
>   	list_del(&mapping->list);
>   }
>   
> +struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, u32 size)
naming is bit missleading we already have
gh_vm_mem_find/__gh_vm_mem_find which is returning mapping based on label
now with gh_vm_mem_find_mapping() is doing same thing but with address.

Can we rename them clearly
gh_vm_mem_find_mapping_by_label()
gh_vm_mem_find_mapping_by_addr()

> +{

> +	struct gh_vm_mem *mapping = NULL;
> +	int ret;
> +
> +	ret = mutex_lock_interruptible(&ghvm->mm_lock);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	list_for_each_entry(mapping, &ghvm->memory_mappings, list) {
> +		if (gpa >= mapping->guest_phys_addr &&
> +			(gpa + size <= mapping->guest_phys_addr +
> +			(mapping->npages << PAGE_SHIFT))) {
> +			goto unlock;
> +		}
> +	}
> +
> +	mapping = NULL;
> +unlock:
> +	mutex_unlock(&ghvm->mm_lock);
> +	return mapping;
> +}
> +
>   struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label)
>   {
>   	struct gh_vm_mem *mapping;
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 2d8b8b6cc394..9cffee6f9b4e 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -32,6 +32,12 @@ struct gh_rm_vm_exited_payload {
>   #define GH_RM_NOTIFICATION_VM_EXITED		 0x56100001
>   
>   enum gh_rm_vm_status {
> +	/**
> +	 * RM doesn't have a state where load partially failed because
> +	 * only Linux
> +	 */
> +	GH_RM_VM_STATUS_LOAD_FAILED	= -1,
> +
>   	GH_RM_VM_STATUS_NO_STATE	= 0,
>   	GH_RM_VM_STATUS_INIT		= 1,
>   	GH_RM_VM_STATUS_READY		= 2,
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> index d85d12119a48..d899bba6a4c6 100644
> --- a/include/uapi/linux/gunyah.h
> +++ b/include/uapi/linux/gunyah.h
> @@ -53,4 +53,17 @@ struct gh_userspace_memory_region {
>   #define GH_VM_SET_USER_MEM_REGION	_IOW(GH_IOCTL_TYPE, 0x1, \
>   						struct gh_userspace_memory_region)
>   
> +/**
> + * struct gh_vm_dtb_config - Set the location of the VM's devicetree blob
> + * @gpa: Address of the VM's devicetree in guest memory.
> + * @size: Maximum size of the devicetree.
> + */
> +struct gh_vm_dtb_config {
> +	__u64 gpa;
> +	__u64 size;
> +};
> +#define GH_VM_SET_DTB_CONFIG	_IOW(GH_IOCTL_TYPE, 0x2, struct gh_vm_dtb_config)
> +
> +#define GH_VM_START		_IO(GH_IOCTL_TYPE, 0x3)
> +
>   #endif

Srinivas Kandagatla Feb. 21, 2023, 2:51 p.m. UTC | #11

On 14/02/2023 21:24, Elliot Berman wrote:
> 
> On Qualcomm platforms, there is a firmware entity which controls access
> to physical pages. In order to share memory with another VM, this entity
> needs to be informed that the guest VM should have access to the memory.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   drivers/virt/gunyah/Kconfig                 |  4 ++
>   drivers/virt/gunyah/Makefile                |  1 +
>   drivers/virt/gunyah/gunyah_platform_hooks.c | 80 +++++++++++++++++++++
>   drivers/virt/gunyah/rsc_mgr.h               |  3 +
>   drivers/virt/gunyah/rsc_mgr_rpc.c           | 12 +++-
>   include/linux/gunyah_rsc_mgr.h              | 17 +++++
>   6 files changed, 115 insertions(+), 2 deletions(-)
>   create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
> 
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index 1a737694c333..de815189dab6 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -4,6 +4,7 @@ config GUNYAH
>   	tristate "Gunyah Virtualization drivers"
>   	depends on ARM64
>   	depends on MAILBOX
> +	select GUNYAH_PLATFORM_HOOKS
>   	help
>   	  The Gunyah drivers are the helper interfaces that run in a guest VM
>   	  such as basic inter-VM IPC and signaling mechanisms, and higher level
> @@ -11,3 +12,6 @@ config GUNYAH
>   
>   	  Say Y/M here to enable the drivers needed to interact in a Gunyah
>   	  virtual environment.
> +
> +config GUNYAH_PLATFORM_HOOKS
> +	tristate
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index ff8bc4925392..6b8f84dbfe0d 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,6 +1,7 @@
>   # SPDX-License-Identifier: GPL-2.0
>   
>   obj-$(CONFIG_GUNYAH) += gunyah.o
> +obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
>   
>   gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
>   obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
> diff --git a/drivers/virt/gunyah/gunyah_platform_hooks.c b/drivers/virt/gunyah/gunyah_platform_hooks.c
> new file mode 100644
> index 000000000000..e67e2361b592
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_platform_hooks.c
> @@ -0,0 +1,80 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/rwsem.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include "rsc_mgr.h"
> +
> +static struct gunyah_rm_platform_ops *rm_platform_ops;
> +static DECLARE_RWSEM(rm_platform_ops_lock);

Why do we need this read/write lock or this global rm_platform_ops here, 
AFAIU, there will be only one instance of platform_ops per platform.

This should be a core part of the gunyah and its driver early setup, 
that should give us pretty much lock less behaviour.

We should be able to determine by Hypervisor UUID that its on Qualcomm 
platform or not, during early gunyah setup which should help us setup 
the platfrom ops accordingly.

This should also help cleanup some of the gunyah code that was added 
futher down in this patchset.


--srini

> +
> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> +	int ret = 0;
> +
> +	down_read(&rm_platform_ops_lock);
> +	if (rm_platform_ops && rm_platform_ops->pre_mem_share)
> +		ret = rm_platform_ops->pre_mem_share(rm, mem_parcel);
> +	up_read(&rm_platform_ops_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_platform_pre_mem_share);
> +
> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> +	int ret = 0;
> +
> +	down_read(&rm_platform_ops_lock);
> +	if (rm_platform_ops && rm_platform_ops->post_mem_reclaim)
> +		ret = rm_platform_ops->post_mem_reclaim(rm, mem_parcel);
> +	up_read(&rm_platform_ops_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_platform_post_mem_reclaim);
> +
> +int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
> +{
> +	int ret = 0;
> +
> +	down_write(&rm_platform_ops_lock);
> +	if (!rm_platform_ops)
> +		rm_platform_ops = platform_ops;
> +	else
> +		ret = -EEXIST;
> +	up_write(&rm_platform_ops_lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_register_platform_ops);
> +
> +void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
> +{
> +	down_write(&rm_platform_ops_lock);
> +	if (rm_platform_ops == platform_ops)
> +		rm_platform_ops = NULL;
> +	up_write(&rm_platform_ops_lock);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_unregister_platform_ops);
> +
> +static void _devm_gh_rm_unregister_platform_ops(void *data)
> +{
> +	gh_rm_unregister_platform_ops(data);
> +}
> +
> +int devm_gh_rm_register_platform_ops(struct device *dev, struct gunyah_rm_platform_ops *ops)
> +{
> +	int ret;
> +
> +	ret = gh_rm_register_platform_ops(ops);
> +	if (ret)
> +		return ret;
> +
> +	return devm_add_action(dev, _devm_gh_rm_unregister_platform_ops, ops);
> +}
> +EXPORT_SYMBOL_GPL(devm_gh_rm_register_platform_ops);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Platform Hooks");
> diff --git a/drivers/virt/gunyah/rsc_mgr.h b/drivers/virt/gunyah/rsc_mgr.h
> index 9b23cefe02b0..e536169df41e 100644
> --- a/drivers/virt/gunyah/rsc_mgr.h
> +++ b/drivers/virt/gunyah/rsc_mgr.h
> @@ -74,6 +74,9 @@ struct gh_rm;
>   int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void *req_buff, size_t req_buff_size,
>   		void **resp_buf, size_t *resp_buff_size);
>   
> +int gh_rm_platform_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +int gh_rm_platform_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +
>   /* Message IDs: Memory Management */
>   #define GH_RM_RPC_MEM_LEND			0x51000012
>   #define GH_RM_RPC_MEM_SHARE			0x51000013
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> index 0c83b097fec9..0b12696bf069 100644
> --- a/drivers/virt/gunyah/rsc_mgr_rpc.c
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -116,6 +116,12 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
>   	if (!msg)
>   		return -ENOMEM;
>   
> +	ret = gh_rm_platform_pre_mem_share(rm, p);
> +	if (ret) {
> +		kfree(msg);
> +		return ret;
> +	}
> +
>   	req_header = msg;
>   	acl_section = (void *)req_header + sizeof(*req_header);
>   	mem_section = (void *)acl_section + struct_size(acl_section, entries, p->n_acl_entries);
> @@ -139,8 +145,10 @@ static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_
>   	ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
>   	kfree(msg);
>   
> -	if (ret)
> +	if (ret) {
> +		gh_rm_platform_post_mem_reclaim(rm, p);
>   		return ret;
> +	}
>   
>   	p->mem_handle = le32_to_cpu(*resp);
>   
> @@ -204,7 +212,7 @@ int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
>   	if (ret)
>   		return ret;
>   
> -	return ret;
> +	return gh_rm_platform_post_mem_reclaim(rm, parcel);
>   }
>   
>   /**
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 9cffee6f9b4e..dc05d5b1e1a3 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -147,4 +147,21 @@ int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
>   				struct gh_rm_hyp_resources **resources);
>   int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
>   
> +struct gunyah_rm_platform_ops {
> +	int (*pre_mem_share)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +	int (*post_mem_reclaim)(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel);
> +};
> +
> +#if IS_ENABLED(CONFIG_GUNYAH_PLATFORM_HOOKS)
> +int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops);
> +void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops);
> +int devm_gh_rm_register_platform_ops(struct device *dev, struct gunyah_rm_platform_ops *ops);
> +#else
> +static inline int gh_rm_register_platform_ops(struct gunyah_rm_platform_ops *platform_ops)
> +	{ return 0; }
> +static inline void gh_rm_unregister_platform_ops(struct gunyah_rm_platform_ops *platform_ops) { }
> +static inline int devm_gh_rm_register_platform_ops(struct device *dev,
> +	struct gunyah_rm_platform_ops *ops) { return 0; }
> +#endif
> +
>   #endif

Elliot Berman Feb. 21, 2023, 9:22 p.m. UTC | #12

On 2/21/2023 6:51 AM, Srinivas Kandagatla wrote:
> 
> 
> On 14/02/2023 21:24, Elliot Berman wrote:
[snip]
>> +
>> +static struct gunyah_rm_platform_ops *rm_platform_ops;
>> +static DECLARE_RWSEM(rm_platform_ops_lock);
> 
> Why do we need this read/write lock or this global rm_platform_ops here, 
> AFAIU, there will be only one instance of platform_ops per platform.
> 
> This should be a core part of the gunyah and its driver early setup, 
> that should give us pretty much lock less behaviour.
> 
> We should be able to determine by Hypervisor UUID that its on Qualcomm 
> platform or not, during early gunyah setup which should help us setup 
> the platfrom ops accordingly.
> 
> This should also help cleanup some of the gunyah code that was added 
> futher down in this patchset.

I'm guessing the direction to take is:

   config GUNYAH
     select QCOM_SCM if ARCH_QCOM

and have vm_mgr call directly into qcom_scm driver if the UID matches?

We have an Android requirement to enable CONFIG_GUNYAH=y and 
CONFIG_QCOM_SCM=m, but it wouldn't be possible with this design. The 
platform hooks implementation allows GUNYAH and QCOM_SCM to be enabled 
without setting lower bound of the other.

- Elliot

Srinivas Kandagatla Feb. 22, 2023, 10:21 a.m. UTC | #13

On 21/02/2023 21:22, Elliot Berman wrote:
> 
> 
> On 2/21/2023 6:51 AM, Srinivas Kandagatla wrote:
>>
>>
>> On 14/02/2023 21:24, Elliot Berman wrote:
> [snip]
>>> +
>>> +static struct gunyah_rm_platform_ops *rm_platform_ops;
>>> +static DECLARE_RWSEM(rm_platform_ops_lock);
>>
>> Why do we need this read/write lock or this global rm_platform_ops 
>> here, AFAIU, there will be only one instance of platform_ops per 
>> platform.
>>
>> This should be a core part of the gunyah and its driver early setup, 
>> that should give us pretty much lock less behaviour.
>>
>> We should be able to determine by Hypervisor UUID that its on Qualcomm 
>> platform or not, during early gunyah setup which should help us setup 
>> the platfrom ops accordingly.
>>
>> This should also help cleanup some of the gunyah code that was added 
>> futher down in this patchset.
> 
> I'm guessing the direction to take is:
> 
>    config GUNYAH
>      select QCOM_SCM if ARCH_QCOM

This is how other kernel drivers use SCM.

> 
> and have vm_mgr call directly into qcom_scm driver if the UID matches?

Yes that is the plan, we could have these callbacks as part key data 
structure like struct gh_rm and update it at very early in setup stage 
based on UUID match.

> 
> We have an Android requirement to enable CONFIG_GUNYAH=y and 
> CONFIG_QCOM_SCM=m, but it wouldn't be possible with this design. The 

Am not sure how this will work, if gunyah for QCOM Platform is depended 
on SCM then there is no way that gunyah could be a inbuilt and make scm 
a module.

On the other hand with the existing design gunyah will not be functional 
until scm driver is loaded and platform hooks are registered. This 
runtime dependency design does not express the dependency correctly and 
the only way to know if gunyah is functional is keep trying which can 
only work after scm driver is probed.

This also raises the design question on how much of platform hooks 
dependency is captured at gunyah core and api level, with state of 
current code /dev/gunyah will be created even without platform hooks and 
let the userspace use it which then only fail at hyp call level.

Other issue with current design is, scm module can be unloaded under the 
hood leaving gunyah with NULL pointers to those platform hook functions. 
This is the kind of issues we could see if the dependency is not 
expressed from bottom up.

The current design is not really capturing the depended components 
accurately.

Considering platform hooks as a core resource to gunyah on Qualcomm 
platform is something that needs attention. If we can fix that then it 
might be doable to have QCOM_SCM=m and CONFIG_GUNYAH=y.

--srini
> platform hooks implementation allows GUNYAH and QCOM_SCM to be enabled 
> without setting lower bound of the other.
> 
> - Elliot

Elliot Berman Feb. 23, 2023, 12:50 a.m. UTC | #14

On 2/21/2023 6:17 AM, Srinivas Kandagatla wrote:
> 
> 
> On 14/02/2023 21:24, Elliot Berman wrote:
>>
>> Add remaining ioctls to support non-proxy VM boot:
>>
>>   - Gunyah Resource Manager uses the VM's devicetree to configure the
>>     virtual machine. The location of the devicetree in the guest's
>>     virtual memory can be declared via the SET_DTB_CONFIGioctl.
>>   - Trigger start of the virtual machine with VM_START ioctl.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
>> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>> ---
>>   drivers/virt/gunyah/vm_mgr.c    | 229 ++++++++++++++++++++++++++++++--
>>   drivers/virt/gunyah/vm_mgr.h    |  10 ++
>>   drivers/virt/gunyah/vm_mgr_mm.c |  23 ++++
>>   include/linux/gunyah_rsc_mgr.h  |   6 +
>>   include/uapi/linux/gunyah.h     |  13 ++
>>   5 files changed, 268 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
>> index 84102bac03cc..fa324385ade5 100644
>> --- a/drivers/virt/gunyah/vm_mgr.c
>> +++ b/drivers/virt/gunyah/vm_mgr.c
>> @@ -9,37 +9,114 @@
>>   #include <linux/file.h>
>>   #include <linux/gunyah_rsc_mgr.h>
>>   #include <linux/miscdevice.h>
>> +#include <linux/mm.h>
>>   #include <linux/module.h>
>>   #include <uapi/linux/gunyah.h>
>>   #include "vm_mgr.h"
>> +static int gh_vm_rm_notification_status(struct gh_vm *ghvm, void *data)
>> +{
>> +    struct gh_rm_vm_status_payload *payload = data;
>> +
>> +    if (payload->vmid != ghvm->vmid)
>> +        return NOTIFY_OK;
> Is this even possible? If yes, then this is a bug somewhere, we should 
> not be getting notifications for something that does not belong to this vm.
> What is the typical case for such behavior? comment would be useful.
> 

VM manager has reigstered to receive all notifications. If there are 
multiple VMs running, then the notifier callback receives notifications 
about all VMs. I've not yet implemented any filtering at resource 
manager level because it added lot of processing code in the resource 
manager that is easily done in the notifier callback.

> 
>> +
>> +    /* All other state transitions are synchronous to a corresponding 
>> RM call */
>> +    if (payload->vm_status == GH_RM_VM_STATUS_RESET){
>> +        down_write(&ghvm->status_lock);
>> +        ghvm->vm_status = payload->vm_status;
>> +        up_write(&ghvm->status_lock);
>> +        wake_up(&ghvm->vm_status_wait);
>> +    }
>> +
>> +    return NOTIFY_DONE;
>> +}
>> +
>> +static int gh_vm_rm_notification_exited(struct gh_vm *ghvm, void *data)
>> +{
>> +    struct gh_rm_vm_exited_payload *payload = data;
>> +
>> +    if (payload->vmid != ghvm->vmid)
>> +        return NOTIFY_OK;
> same
> 
>> +
>> +    down_write(&ghvm->status_lock);
>> +    ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
>> +    up_write(&ghvm->status_lock);
>> +
>> +    return NOTIFY_DONE;
>> +}
>> +
>> +static int gh_vm_rm_notification(struct notifier_block *nb, unsigned 
>> long action, void *data)
>> +{
>> +    struct gh_vm *ghvm = container_of(nb, struct gh_vm, nb);
>> +
>> +    switch (action) {
>> +    case GH_RM_NOTIFICATION_VM_STATUS:
>> +        return gh_vm_rm_notification_status(ghvm, data);
>> +    case GH_RM_NOTIFICATION_VM_EXITED:
>> +        return gh_vm_rm_notification_exited(ghvm, data);
>> +    default:
>> +        return NOTIFY_OK;
>> +    }
>> +}
>> +
>> +static void gh_vm_stop(struct gh_vm *ghvm)
>> +{
>> +    int ret;
>> +
>> +    down_write(&ghvm->status_lock);
>> +    if (ghvm->vm_status == GH_RM_VM_STATUS_RUNNING) {
>> +        ret = gh_rm_vm_stop(ghvm->rm, ghvm->vmid);
>> +        if (ret)
>> +            pr_warn("Failed to stop VM: %d\n", ret);
> Should we not bail out from this fail path?
> 

This is called in the gh_vm_free path and we have some options here when 
we get some error while stopping a VM. So far, my strategy has been to 
ignore error as best we can and continue. We might get further errors, 
but we can also continue to clean up some more resources.

If there's an error, I'm not sure if there is a proper strategy to get 
someone to retry later: userspace is closing all its references to the 
VM and we need to stop the VM and clean up all our resources. Nitro 
Enclaves and ACRN suffer similar

> 
>> +    }
>> +
>> +    ghvm->vm_status = GH_RM_VM_STATUS_EXITED;
>> +    up_write(&ghvm->status_lock);
>> +}
>> +
>>   static void gh_vm_free(struct work_struct *work)
>>   {
>>       struct gh_vm *ghvm = container_of(work,struct gh_vm, free_work);
>>       struct gh_vm_mem *mapping, *tmp;
>>       int ret;
>> -    mutex_lock(&ghvm->mm_lock);
>> -    list_for_each_entry_safe(mapping, tmp, &ghvm->memory_mappings, 
>> list) {
>> -        gh_vm_mem_reclaim(ghvm, mapping);
>> -        kfree(mapping);
>> +    switch (ghvm->vm_status) {
>> +unknown_state:
> 
> Never seen this style of using goto from switch to a new label in switch 
> case. Am sure this is some kinda trick but its not helping readers.
> 
> Can we rewrite this using a normal semantics.
> 
> may be a do while could help.
> 

Srivatsa suggested dropping the goto, I can do that.
> 
>> +    case GH_RM_VM_STATUS_RUNNING:
>> +        gh_vm_stop(ghvm);
>> +        fallthrough;
>> +    case GH_RM_VM_STATUS_INIT_FAILED:
>> +    case GH_RM_VM_STATUS_LOAD:
>> +    case GH_RM_VM_STATUS_LOAD_FAILED:
>> +        mutex_lock(&ghvm->mm_lock);
>> +        list_for_each_entry_safe(mapping, tmp, 
>> &ghvm->memory_mappings, list) {
>> +            gh_vm_mem_reclaim(ghvm, mapping);
>> +            kfree(mapping);
>> +        }
>> +        mutex_unlock(&ghvm->mm_lock);
>> +        fallthrough;
>> +    case GH_RM_VM_STATUS_NO_STATE:
>> +        ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
>> +        if (ret)
>> +            pr_warn("Failed to deallocate vmid: %d\n", ret);
>> +
>> +        gh_rm_notifier_unregister(ghvm->rm, &ghvm->nb);
>> +        put_gh_rm(ghvm->rm);
>> +        kfree(ghvm);
>> +        break;
>> +    default:
>> +        pr_err("VM is unknown state:%d, assuming it's running.\n", 
>> ghvm->vm_status);
> vm_status did not change do we not endup here again?
> 
>> +        goto unknown_state;
>>       }
>> -    mutex_unlock(&ghvm->mm_lock);
>> -
>> -    ret = gh_rm_dealloc_vmid(ghvm->rm, ghvm->vmid);
>> -    if (ret)
>> -        pr_warn("Failed to deallocate vmid: %d\n", ret);
>> -
>> -    put_gh_rm(ghvm->rm);
>> -    kfree(ghvm);
>>   }
>>   static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
>>   {
>>       struct gh_vm *ghvm;
>> -    int vmid;
>> +    int vmid, ret;
>>       vmid = gh_rm_alloc_vmid(rm, 0);
>>       if (vmid < 0)
>> @@ -56,13 +133,123 @@ static __must_check struct gh_vm 
>> *gh_vm_alloc(struct gh_rm *rm)
>>       ghvm->vmid = vmid;
>>       ghvm->rm = rm;
>> +    init_waitqueue_head(&ghvm->vm_status_wait);
>> +    ghvm->nb.notifier_call = gh_vm_rm_notification;
>> +    ret = gh_rm_notifier_register(rm, &ghvm->nb);
>> +    if (ret) {
>> +        put_gh_rm(rm);
>> +        gh_rm_dealloc_vmid(rm, vmid);
>> +        kfree(ghvm);
>> +        return ERR_PTR(ret);
>> +    }
>> +
>>       mutex_init(&ghvm->mm_lock);
>>       INIT_LIST_HEAD(&ghvm->memory_mappings);
>> +    init_rwsem(&ghvm->status_lock);
>>       INIT_WORK(&ghvm->free_work, gh_vm_free);
>> +    ghvm->vm_status = GH_RM_VM_STATUS_LOAD;
>>       return ghvm;
>>   }
>> +static int gh_vm_start(struct gh_vm *ghvm)
>> +{
>> +    struct gh_vm_mem *mapping;
>> +    u64 dtb_offset;
>> +    u32 mem_handle;
>> +    int ret;
>> +
>> +    down_write(&ghvm->status_lock);
>> +    if (ghvm->vm_status != GH_RM_VM_STATUS_LOAD) {
>> +        up_write(&ghvm->status_lock);
>> +        return 0;
>> +    }
>> +
>> +    ghvm->vm_status = GH_RM_VM_STATUS_RESET;
>> +
> 
> <------
> should we not take ghvm->mm_lock here to make sure that list is 
> consistent while processing.

Done.

>> +    list_for_each_entry(mapping, &ghvm->memory_mappings,list) {
>> +        switch (mapping->share_type){
>> +        case VM_MEM_LEND:
>> +            ret = gh_rm_mem_lend(ghvm->rm, &mapping->parcel);
>> +            break;
>> +        case VM_MEM_SHARE:
>> +            ret = gh_rm_mem_share(ghvm->rm, &mapping->parcel);
>> +            break;
>> +        }
>> +        if (ret) {
>> +            pr_warn("Failed to %s parcel %d: %d\n",
>> +                mapping->share_type == VM_MEM_LEND ? "lend" : "share",
>> +                mapping->parcel.label,
>> +                ret);
>> +            gotoerr;
>> +        }
>> +    }
> --->
> 
>> +
>> +    mapping = gh_vm_mem_find_mapping(ghvm, ghvm->dtb_config.gpa, 
>> ghvm->dtb_config.size);
>> +    if (!mapping) {
>> +        pr_warn("Failed to find the memory_handle for DTB\n");
> 
> What wil happen to the mappings that are lend or shared?
> 

When the VM is cleaned up (on final destruction), the mappings are 
reclaimed.

>> +        ret = -EINVAL;
>> +        goto err;
>> +    }
>> +
>> +    mem_handle = mapping->parcel.mem_handle;
>> +    dtb_offset = ghvm->dtb_config.gpa - mapping->guest_phys_addr;
>> +
>> +    ret = gh_rm_vm_configure(ghvm->rm, ghvm->vmid, ghvm->auth, 
>> mem_handle,
> 
> where is authentication mechanism (auth) comming from? Who is supposed 
> to set this value?
> 
> Should it come from userspace? if so I do not see any UAPI facility to 
> do that via VM_START ioctl.
> 

Right, we are only adding the support for unauthenticated VMs for now. 
There would be further UAPI facilities to set the authentication type.

> 
>> +                0, 0, dtb_offset, ghvm->dtb_config.size);
>> +    if (ret) {
>> +        pr_warn("Failed to configureVM: %d\n", ret);
>> +        goto err;
>> +    }
>> +
>> +    ret = gh_rm_vm_init(ghvm->rm, ghvm->vmid);
>> +    if (ret) {
>> +        pr_warn("Failed to initialize VM: %d\n", ret);
>> +        goto err;
>> +    }
>> +
>> +    ret = gh_rm_vm_start(ghvm->rm, ghvm->vmid);
>> +    if (ret) {
>> +        pr_warn("Failed to start VM:%d\n", ret);
>> +        goto err;
>> +    }
>> +
>> +    ghvm->vm_status = GH_RM_VM_STATUS_RUNNING;
>> +    up_write(&ghvm->status_lock);
>> +    return ret;
>> +err:
>> +    ghvm->vm_status = GH_RM_VM_STATUS_INIT_FAILED;
>> +    up_write(&ghvm->status_lock);
> 
> Am really not sure if we are doing right thing in the error path, there 
> are multiple cases that seems to be not handled or if it was not 
> required no comments to clarify this are documented.
> ex: if vm start fails then what happes with memory mapping or do we need 
> to un-configure vm or un-init vm from hypervisor side?
> 
> if none of this is required its useful to add come clear comments.
> 

It is required and done in the VM cleanup path. I'll add comment with 
this info.

>> +    return ret;
>> +}
>> +
>> +static int gh_vm_ensure_started(struct gh_vm *ghvm)
>> +{
>> +    int ret;
>> +
>> +retry:
>> +    ret = down_read_interruptible(&ghvm->status_lock);
>> +    if (ret)
>> +        return ret;
>> +
>> +    /* Unlikely because VM is typically started */
>> +    if (unlikely(ghvm->vm_status == GH_RM_VM_STATUS_LOAD)) {
>> +        up_read(&ghvm->status_lock);
>> +        ret = gh_vm_start(ghvm);
>> +        if (ret)
>> +            gotoout;
>> +        goto retry;
>> +    }
> 
> do while will do better job here w.r.t to readablity.
> 

I think do while and my current "goto retry" imply a long loop is 
possible. The "goto retry" or while loop is guaranteed to run only once 
because gh_vm_start will always bring VM out of GH_RM_VM_STATUS_LOAD.

How about this?

-               goto retry;
+               /** gh_vm_start() is guaranteed to bring status out of
+                * GH_RM_VM_STATUS_LOAD, thus inifitely recursive call 
is not
+                * possible
+                */
+               return gh_vm_ensure_started(ghvm);



>> +
>> +    /* Unlikely because VM is typically running */
>> +    if (unlikely(ghvm->vm_status != GH_RM_VM_STATUS_RUNNING))
>> +        ret = -ENODEV;
>> +
>> +out:
>> +    up_read(&ghvm->status_lock);
>> +    return ret;
>> +}
>> +
>>   static long gh_vm_ioctl(struct file *filp, unsigned int cmd, 
>> unsigned long arg)
>>   {
>>       struct gh_vm *ghvm = filp->private_data;
>> @@ -88,6 +275,22 @@ static long gh_vm_ioctl(struct file *filp, 
>> unsigned int cmd, unsigned long arg)
>>               r = gh_vm_mem_free(ghvm, region.label);
>>           break;
>>       }
>> +    case GH_VM_SET_DTB_CONFIG: {
>> +        struct gh_vm_dtb_config dtb_config;
>> +
>> +        if (copy_from_user(&dtb_config, argp, sizeof(dtb_config)))
>> +            return -EFAULT;
>> +
>> +        dtb_config.size = PAGE_ALIGN(dtb_config.size);
>> +        ghvm->dtb_config = dtb_config;
>> +
>> +        r = 0;
>> +        break;
>> +    }
>> +    case GH_VM_START: {
>> +        r = gh_vm_ensure_started(ghvm);
>> +        break;
>> +    }
>>       default:
>>           r = -ENOTTY;
>>           break;
>> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
>> index 97bc00c34878..e9cf56647cc2 100644
>> --- a/drivers/virt/gunyah/vm_mgr.h
>> +++ b/drivers/virt/gunyah/vm_mgr.h
>> @@ -10,6 +10,8 @@
>>   #include <linux/list.h>
>>   #include <linux/miscdevice.h>
>>   #include <linux/mutex.h>
>> +#include <linux/rwsem.h>
>> +#include <linux/wait.h>
>>   #include <uapi/linux/gunyah.h>
>> @@ -33,6 +35,13 @@ struct gh_vm_mem {
>>   struct gh_vm {
>>       u16 vmid;
>>       struct gh_rm *rm;
>> +    enum gh_rm_vm_auth_mechanism auth;
>> +    struct gh_vm_dtb_config dtb_config;
>> +
>> +    struct notifier_block nb;
>> +    enum gh_rm_vm_status vm_status;
>> +    wait_queue_head_t vm_status_wait;
>> +    struct rw_semaphore status_lock;
>>       struct work_struct free_work;
>>       struct mutex mm_lock;
>> @@ -43,5 +52,6 @@ int gh_vm_mem_alloc(struct gh_vm *ghvm, struct 
>> gh_userspace_memory_region *regio
>>   void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct gh_vm_mem *mapping);
>>   int gh_vm_mem_free(struct gh_vm *ghvm, u32 label);
>>   struct gh_vm_mem *gh_vm_mem_find(struct gh_vm *ghvm, u32 label);
>> +struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, 
>> u32 size);
>>   #endif
>> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c 
>> b/drivers/virt/gunyah/vm_mgr_mm.c
>> index 03e71a36ea3b..128b90da555a 100644
>> --- a/drivers/virt/gunyah/vm_mgr_mm.c
>> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
>> @@ -52,6 +52,29 @@ void gh_vm_mem_reclaim(struct gh_vm *ghvm, struct 
>> gh_vm_mem *mapping)
>>       list_del(&mapping->list);
>>   }
>> +struct gh_vm_mem *gh_vm_mem_find_mapping(struct gh_vm *ghvm, u64 gpa, 
>> u32 size)
> naming is bit missleading we already have
> gh_vm_mem_find/__gh_vm_mem_find which is returning mapping based on label
> now with gh_vm_mem_find_mapping() is doing same thing but with address.
> 
> Can we rename them clearly
> gh_vm_mem_find_mapping_by_label()
> gh_vm_mem_find_mapping_by_addr()
> 

Done.

- Elliot

Srinivas Kandagatla Feb. 23, 2023, 10:08 a.m. UTC | #15

On 22/02/2023 00:27, Elliot Berman wrote:
> 
>>> +    .llseek = noop_llseek,
>>> +};
>>> +
>>> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
>> Not sure what is the gain of this multiple levels of redirection.
>>
>> How about
>>
>> long gh_dev_create_vm(struct gh_rm *rm, unsigned long arg)
>> {
>> ...
>> }
>>
>> and rsc_mgr just call it as part of its ioctl call
>>
>> static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned 
>> long arg)
>> {
>>      struct miscdevice *miscdev = filp->private_data;
>>      struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
>>
>>      switch (cmd) {
>>      case GH_CREATE_VM:
>>          return gh_dev_create_vm(rm, arg);
>>      default:
>>          return -ENOIOCTLCMD;
>>      }
>> }
>>
> 
> I'm anticipating we will add further /dev/gunyah ioctls and I thought it 
> would be cleaner to have all that in vm_mgr.c itself.
> 
>>
>>> +{
>>> +    struct gh_vm *ghvm;
>>> +    struct file *file;
>>> +    int fd, err;
>>> +
>>> +    /* arg reserved for future use. */
>>> +    if (arg)
>>> +        return -EINVAL;
>>
>> The only code path I see here is via GH_CREATE_VM ioctl which 
>> obviously does not take any arguments, so if you are thinking of using 
>> the argument for architecture-specific VM flags.  Then this needs to 
>> be properly done by making the ABI aware of this.
> 
> It is documented in Patch 17 (Document Gunyah VM Manager)
> 
> +GH_CREATE_VM
> +~~~~~~~~~~~~
> +
> +Creates a Gunyah VM. The argument is reserved for future use and must 
> be 0.
> 
But this conficts with the UAPIs that have been defined. GH_CREATE_VM 
itself is defined to take no parameters.

#define GH_CREATE_VM                    _IO(GH_IOCTL_TYPE, 0x0)

so where are you expecting the argument to come from?

>>
>> As you mentioned zero value arg imply an "unauthenticated VM" type, 
>> but this was not properly encoded in the userspace ABI. Why not make 
>> it future compatible. How about adding arguments to GH_CREATE_VM and 
>> pass the required information correctly.
>> Note that once the ABI is accepted then you will not be able to change 
>> it, other than adding a new one.
>>
> 
> Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure yet 
> what arguments to add here.
> 
> The ABI can add new "long" values to GH_CREATE_VM and that wouldn't 

Sorry, that is exactly what we want to avoid, we can not change the UAPI 
its going to break the userspace.

> break compatibility with old kernels; old kernels reject it as -EINVAL.

If you have userspace built with older kernel headers then that will 
break. Am not sure about old-kernels.

What exactly is the argument that you want to add to GH_CREATE_VM?

If you want to keep GH_CREATE_VM with no arguments that is fine but 
remove the conflicting comments in the code and document so that its not 
misleading readers/reviewers that the UAPI is going to be modified in 
near future.


> 
>>> +
>>> +    ghvm = gh_vm_alloc(rm);
>>> +    if (IS_ERR(ghvm))
>>> +        return PTR_ERR(ghvm);
>>> +
>>> +    fd = get_unused_fd_flags(O_CLOEXEC);
>>> +    if (fd < 0) {
>>> +        err = fd;
>>> +        goto err_destroy_vm;
>>> +    }
>>> +
>>> +    file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
>>> +    if (IS_ERR(file)) {
>>> +        err = PTR_ERR(file);
>>> +        goto err_put_fd;
>>> +    }
>>> +
>>> +    fd_install(fd, file);
>>> +
>>> +    return fd;
>>> +
>>> +err_put_fd:
>>> +    put_unused_fd(fd);
>>> +err_destroy_vm:
>>> +    kfree(ghvm);
>>> +    return err;
>>> +}
>>> +
>>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, 
>>> unsigned long arg)
>>> +{
>>> +    switch (cmd) {
>>> +    case GH_CREATE_VM:
>>> +        return gh_dev_ioctl_create_vm(rm, arg);
>>> +    default:
>>> +        return -ENOIOCTLCMD;
>>> +    }
>>> +}
>>> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
>>> new file mode 100644
>>> index 000000000000..76954da706e9
>>> --- /dev/null
>>> +++ b/drivers/virt/gunyah/vm_mgr.h
>>> @@ -0,0 +1,22 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All 
>>> rights reserved.
>>> + */
>>> +
>>> +#ifndef _GH_PRIV_VM_MGR_H
>>> +#define _GH_PRIV_VM_MGR_H
>>> +
>>> +#include <linux/gunyah_rsc_mgr.h>
>>> +
>>> +#include <uapi/linux/gunyah.h>
>>> +
>>> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, 
>>> unsigned long arg);
>>> +
>>> +struct gh_vm {
>>> +    u16 vmid;
>>> +    struct gh_rm *rm;
>>> +
>>> +    struct work_struct free_work;
>>> +};
>>> +
>>> +#endif
>>> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
>>> new file mode 100644
>>> index 000000000000..10ba32d2b0a6
>>> --- /dev/null
>>> +++ b/include/uapi/linux/gunyah.h
>>> @@ -0,0 +1,23 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
>>> +/*
>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All 
>>> rights reserved.
>>> + */
>>> +
>>> +#ifndef _UAPI_LINUX_GUNYAH
>>> +#define _UAPI_LINUX_GUNYAH
>>> +
>>> +/*
>>> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
>>> + */
>>> +
>>> +#include <linux/types.h>
>>> +#include <linux/ioctl.h>
>>> +
>>> +#define GH_IOCTL_TYPE            'G'
>>> +
>>> +/*
>>> + * ioctls for /dev/gunyah fds:
>>> + */
>>> +#define GH_CREATE_VM            _IO(GH_IOCTL_TYPE, 0x0) /* Returns a 
>>> Gunyah VM fd */
>>
>> Can HLOS forcefully destroy a VM?
>> If so should we have a corresponding DESTROY IOCTL?
> 
> It can forcefully destroy unauthenticated and protected virtual 
> machines. I don't have a userspace usecase for a DESTROY ioctl yet, 
> maybe this can be added later? By the way, the VM is forcefully 
that should be fine, but its also nice to add it for completeness, but 
not a compulsory atm

> destroyed when VM refcount is dropped to 0 (close(vm_fd) and any other 
> relevant file descriptors).
I have noticed that path.

--srini
> 
> - Elliot

Srinivas Kandagatla Feb. 23, 2023, 10:25 a.m. UTC | #16

On 23/02/2023 00:15, Elliot Berman wrote:
> 
> 
> On 2/20/2023 5:59 AM, Srinivas Kandagatla wrote:
>>
>>
>> On 14/02/2023 21:23, Elliot Berman wrote:
>>> Gunyah message queues are a unidirectional inter-VM pipe for messages up
>>> to 1024 bytes. This driver supports pairing a receiver message queue and
>>> a transmitter message queue to expose a single mailbox channel.
>>>
>>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>>> ---
>>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>>   drivers/mailbox/Makefile                   |   2 +
>>>   drivers/mailbox/gunyah-msgq.c               | 214 ++++++++++++++++++++
>>>   include/linux/gunyah.h                      |  56 +++++
>>>   4 files changed, 280 insertions(+)
>>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>>
>>> diff --git a/Documentation/virt/gunyah/message-queue.rst 
>>> b/Documentation/virt/gunyah/message-queue.rst
>>> index 0667b3eb1ff9..082085e981e0 100644
>>> --- a/Documentation/virt/gunyah/message-queue.rst
>>> +++ b/Documentation/virt/gunyah/message-queue.rst
>>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs 
>>> (and two capability IDs).
>>>         |               |         |                 | |               |
>>>         |               |         |                 | |               |
>>>         +---------------+         +-----------------+ +---------------+
>>> +
>>> +Gunyah message queues are exposed as mailboxes. To create the 
>>> mailbox, create
>>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY 
>>> interrupt,
>>> +all messages in the RX message queue are read and pushed via the 
>>> `rx_callback`
>>> +of the registered mbox_client.
>>> +
>>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>>> +   :identifiers: gh_msgq_init
>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>> index fc9376117111..5f929bb55e9a 100644
>>> --- a/drivers/mailbox/Makefile
>>> +++ b/drivers/mailbox/Makefile
>>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>>
>> Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not 
>> CONFIG_GUNYAH_MBOX?
>>
> 
> There was some previous discussion about this:
> 
> https://lore.kernel.org/all/2a7bb5f2-1286-b661-659a-a5037150eae8@quicinc.com/
> 
>>> +
>>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>>   obj-$(CONFIG_SPRD_MBOX)       += sprd-mailbox.o
>>> diff --git a/drivers/mailbox/gunyah-msgq.c 
>>> b/drivers/mailbox/gunyah-msgq.c
>>> new file mode 100644
>>> index 000000000000..03ffaa30ce9b
>>> --- /dev/null
>>> +++ b/drivers/mailbox/gunyah-msgq.c
>>> @@ -0,0 +1,214 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +/*
>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All 
>>> rights reserved.
>>> + */
>>> +
>>> +#include <linux/mailbox_controller.h>
>>> +#include <linux/module.h>
>>> +#include <linux/interrupt.h>
>>> +#include <linux/gunyah.h>
>>> +#include <linux/printk.h>
>>> +#include <linux/init.h>
>>> +#include <linux/slab.h>
>>> +#include <linux/wait.h>
>>
>> ...
>>
>>> +/* Fired when message queue transitions from "full" to "space 
>>> available" to send messages */
>>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>>> +{
>>> +    struct gh_msgq *msgq = data;
>>> +
>>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>>> +
>>> +    return IRQ_HANDLED;
>>> +}
>>> +
>>> +/* Fired after sending message and hypercall told us there was more 
>>> space available. */
>>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>>
>> Tasklets have been long deprecated, consider using workqueues in this 
>> particular case.
>>
> 
> Workqueues have higher latency and tasklets came as recommendation from 
> Jassi. drivers/mailbox/imx-mailbox.c uses tasklets in the same way.
> 
> I did some quick unscientific measurements of ~1000x samples. The median 
> latency for resource manager went from 25.5 us (tasklet) to 26 us 
> (workqueue) (2% slower). The mean went from 28.7 us to 32.5 us (13% 
> slower). Obviously, the outliers for workqueues were much more extreme.

TBH, this is expected because we are only testing resource manager, Note 
  the advantage that you will see shifting from tasket to workqueues is 
on overall system latencies and some drivers performance that need to 
react to events.

please take some time to read this nice article about this 
https://lwn.net/Articles/830964/


--srini
> 
>>
>>> +{
>>> +    struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, 
>>> txdone_tasklet);
>>> +
>>> +    mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
>>> +}
>>> +
>>> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
>>> +{
>> ..
>>
>>> +    tasklet_schedule(&msgq->txdone_tasklet);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static struct mbox_chan_ops gh_msgq_ops = {
>>> +    .send_data = gh_msgq_send_data,
>>> +};
>>> +
>>> +/**
>>> + * gh_msgq_init() - Initialize a Gunyah message queue with an 
>>> mbox_client
>>> + * @parent: optional, device parent used for the mailbox controller
>>> + * @msgq: Pointer to the gh_msgq to initialize
>>> + * @cl: A mailbox client to bind to the mailbox channel that the 
>>> message queue creates
>>> + * @tx_ghrsc: optional, the transmission side of the message queue
>>> + * @rx_ghrsc: optional, the receiving side of the message queue
>>> + *
>>> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most 
>>> message queue use cases come with
>>> + * a pair of message queues to facilitate bidirectional 
>>> communication. When tx_ghrsc is set,
>>> + * the client can send messages with 
>>> mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
>>> + * is set, the mbox_client should register an .rx_callback() and the 
>>> message queue driver will
>>> + * push all available messages upon receiving the RX ready 
>>> interrupt. The messages should be
>>> + * consumed or copied by the client right away as the 
>>> gh_msgq_rx_data will be replaced/destroyed
>>> + * after the callback.
>>> + *
>>> + * Returns - 0 on success, negative otherwise
>>> + */
>>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct 
>>> mbox_client *cl,
>>> +             struct gunyah_resource *tx_ghrsc, struct 
>>> gunyah_resource *rx_ghrsc)
>>> +{
>>> +    int ret;
>>> +
>>> +    /* Must have at least a tx_ghrsc or rx_ghrsc and that they are 
>>> the right device types */
>>> +    if ((!tx_ghrsc && !rx_ghrsc) ||
>>> +        (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
>>> +        (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
>>> +        return -EINVAL;
>>> +
>>> +    if (gh_api_version() != GUNYAH_API_V1) {
>>> +        pr_err("Unrecognized gunyah version: %u. Currently 
>>> supported: %d\n",
>> dev_err(parent
>>
>> would make this more useful
>>
> 
> Done.
> 
> - Elliot

Alex Elder Feb. 23, 2023, 9:11 p.m. UTC | #17

On 2/14/23 3:23 PM, Elliot Berman wrote:
> Gunyah message queues are a unidirectional inter-VM pipe for messages up
> to 1024 bytes. This driver supports pairing a receiver message queue and
> a transmitter message queue to expose a single mailbox channel.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   Documentation/virt/gunyah/message-queue.rst |   8 +
>   drivers/mailbox/Makefile                    |   2 +
>   drivers/mailbox/gunyah-msgq.c               | 214 ++++++++++++++++++++
>   include/linux/gunyah.h                      |  56 +++++
>   4 files changed, 280 insertions(+)
>   create mode 100644 drivers/mailbox/gunyah-msgq.c
> 
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> index 0667b3eb1ff9..082085e981e0 100644
> --- a/Documentation/virt/gunyah/message-queue.rst
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
>         |               |         |                 |         |               |
>         |               |         |                 |         |               |
>         +---------------+         +-----------------+         +---------------+
> +
> +Gunyah message queues are exposed as mailboxes. To create the mailbox, create
> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY interrupt,
> +all messages in the RX message queue are read and pushed via the `rx_callback`
> +of the registered mbox_client.
> +
> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
> +   :identifiers: gh_msgq_init
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index fc9376117111..5f929bb55e9a 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)	+= mtk-cmdq-mailbox.o
>   
>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)	+= zynqmp-ipi-mailbox.o
>   
> +obj-$(CONFIG_GUNYAH)		+= gunyah-msgq.o
> +
>   obj-$(CONFIG_SUN6I_MSGBOX)	+= sun6i-msgbox.o
>   
>   obj-$(CONFIG_SPRD_MBOX)		+= sprd-mailbox.o
> diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
> new file mode 100644
> index 000000000000..03ffaa30ce9b
> --- /dev/null
> +++ b/drivers/mailbox/gunyah-msgq.c

You use a dash in this source file name, but an underscore
everywhere else.  Unless there's a good reason to do this,
please be consistent (use "gunyah_msgq.c").

> @@ -0,0 +1,214 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/gunyah.h>
> +#include <linux/printk.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/wait.h>
> +
> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct gh_msgq, mbox))
> +
> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
> +{
> +	struct gh_msgq *msgq = data;
> +	struct gh_msgq_rx_data rx_data;
> +	enum gh_error err;
> +	bool ready = true;
> +
> +	while (ready) {
> +		err = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
> +				(uintptr_t)&rx_data.data, sizeof(rx_data.data),
> +				&rx_data.length, &ready);
> +		if (err != GH_ERROR_OK) {
> +			if (err != GH_ERROR_MSGQUEUE_EMPTY)

Srini mentioned something about this too.  In many
(all?) cases, there is a device pointer available,
so you should use dev_*() functions rather than pr_*().

In this particular case, I'm not sure why/when the
mbox.dev pointer would be null.  Also, dev_*() handles
the case of a null device pointer, and it reports the
device name (just as you do here).

> +				pr_warn("Failed to receive data from msgq for %s: %d\n",
> +					msgq->mbox.dev ? dev_name(msgq->mbox.dev) : "", err);
> +			break;
> +		}
> +		mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +/* Fired when message queue transitions from "full" to "space available" to send messages */
> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
> +{
> +	struct gh_msgq *msgq = data;
> +
> +	mbox_chan_txdone(gh_msgq_chan(msgq), 0);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +/* Fired after sending message and hypercall told us there was more space available. */
> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
> +{
> +	struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
> +
> +	mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
> +}
> +
> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
> +{
> +	struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
> +	struct gh_msgq_tx_data *msgq_data = data;
> +	u64 tx_flags = 0;
> +	enum gh_error gh_error;

Above you named the variable "err".  It helps readability
if you use a very consistent naming convention for variables
of a certain type when they are used a lot.

> +	bool ready;
> +
> +	if (msgq_data->push)
> +		tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
> +
> +	gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, msgq_data->length,
> +					(uintptr_t)msgq_data->data, tx_flags, &ready);
> +
> +	/**
> +	 * unlikely because Linux tracks state of msgq and should not try to
> +	 * send message when msgq is full.
> +	 */
> +	if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
> +		return -EAGAIN;
> +
> +	/**
> +	 * Propagate all other errors to client. If we return error to mailbox
> +	 * framework, then no other messages can be sent and nobody will know
> +	 * to retry this message.
> +	 */
> +	msgq->last_ret = gh_remap_error(gh_error);
> +
> +	/**
> +	 * This message was successfully sent, but message queue isn't ready to
> +	 * receive more messages because it's now full. Mailbox framework

Maybe:  s/receive/accept/

> +	 * requires that we only report that message was transmitted when
> +	 * we're ready to transmit another message. We'll get that in the form
> +	 * of tx IRQ once the other side starts to drain the msgq.
> +	 */
> +	if (gh_error == GH_ERROR_OK && !ready)
> +		return 0;
> +
> +	/**
> +	 * We can send more messages. Mailbox framework requires that tx done
> +	 * happens asynchronously to sending the message. Gunyah message queues
> +	 * tell us right away on the hypercall return whether we can send more
> +	 * messages. To work around this, defer the txdone to a tasklet.
> +	 */
> +	tasklet_schedule(&msgq->txdone_tasklet);
> +
> +	return 0;
> +}
> +
> +static struct mbox_chan_ops gh_msgq_ops = {
> +	.send_data = gh_msgq_send_data,
> +};
> +
> +/**
> + * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
> + * @parent: optional, device parent used for the mailbox controller
> + * @msgq: Pointer to the gh_msgq to initialize
> + * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
> + * @tx_ghrsc: optional, the transmission side of the message queue
> + * @rx_ghrsc: optional, the receiving side of the message queue
> + *
> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most message queue use cases come with

s/should be/must be/

> + * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
> + * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
> + * is set, the mbox_client should register an .rx_callback() and the message queue driver will

s/should register/must register/

A general comment on this code is that you sort of half define
a Gunyah message queue API.  You define an initialization
function and an exit function, but you also expose the fact
that you use the mailbox framework in implementation.  This
despite avoiding defining it as an mbox in the DTS file.

It might be hard to avoid that I guess.  But to me it would be
nice if there were a more distinct Gunyah message queue API,
which would provide a send_message() function, for example.
And in that case, perhaps you would pass in the tx_done and/or
rx_data callbacks to this function (since they're required).

All that said, this is (currently?) only used by the resource
manager, so making a beautiful API might not be that important.
Do you envision this being used to communicate with other VMs
in the future?

> + * push all available messages upon receiving the RX ready interrupt. The messages should be

Maybe: s/push/deliver/

> + * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
> + * after the callback.
> + *
> + * Returns - 0 on success, negative otherwise
> + */
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> +		     struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc)
> +{
> +	int ret;
> +
> +	/* Must have at least a tx_ghrsc or rx_ghrsc and that they are the right device types */
> +	if ((!tx_ghrsc && !rx_ghrsc) ||
> +	    (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
> +	    (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
> +		return -EINVAL;
> +
> +	if (gh_api_version() != GUNYAH_API_V1) {
> +		pr_err("Unrecognized gunyah version: %u. Currently supported: %d\n",
> +			gh_api_version(), GUNYAH_API_V1);
> +		return -EOPNOTSUPP;
> +	}
> +
> +	if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
> +		return -EOPNOTSUPP;

Can Gunyah even function if it doesn't have the MSGQUEUE feature?
Will there ever be a Gunyah implementation that does not support
it?  Perhaps this test could be done in gunyah_init() instead.

For that matter, you could verify the result of gh_api_version()
at that time also.

> +
> +	msgq->tx_ghrsc = tx_ghrsc;
> +	msgq->rx_ghrsc = rx_ghrsc;
> +
> +	msgq->mbox.dev = parent;
> +	msgq->mbox.ops = &gh_msgq_ops;
> +	msgq->mbox.num_chans = 1;
> +	msgq->mbox.txdone_irq = true;
> +	msgq->mbox.chans = kcalloc(msgq->mbox.num_chans, sizeof(*msgq->mbox.chans), GFP_KERNEL);

 From what I can tell, you will always use exactly one mailbox channel.
So you could just do kzalloc(sizeof()...).

> +	if (!msgq->mbox.chans)
> +		return -ENOMEM;
> +
> +	if (msgq->tx_ghrsc) {

	if (tx_ghrsc) {

The irq field is assumed to be valid.  Are there any
sanity checks you could perform?  Again this is only
used for the resource manager right now, so maybe
it's OK.

> +		ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",

		ret = request_irq(tx_ghrsc->irq, ...


> +				msgq);
> +		if (ret)
> +			goto err_chans;
> +	}
> +
> +	if (msgq->rx_ghrsc) {
> +		ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
> +						IRQF_ONESHOT, "gh_msgq_rx", msgq);
> +		if (ret)
> +			goto err_tx_irq;
> +	}
> +
> +	tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
> +
> +	ret = mbox_controller_register(&msgq->mbox);
> +	if (ret)
> +		goto err_rx_irq;
> +
> +	ret = mbox_bind_client(gh_msgq_chan(msgq), cl);


> +	if (ret)
> +		goto err_mbox;
> +
> +	return 0;
> +err_mbox:
> +	mbox_controller_unregister(&msgq->mbox);
> +err_rx_irq:
> +	if (msgq->rx_ghrsc)
> +		free_irq(msgq->rx_ghrsc->irq, msgq);
> +err_tx_irq:
> +	if (msgq->tx_ghrsc)
> +		free_irq(msgq->tx_ghrsc->irq, msgq);
> +err_chans:
> +	kfree(msgq->mbox.chans);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_init);
> +
> +void gh_msgq_remove(struct gh_msgq *msgq)
> +{

Is there any need to un-bind the client?

> +	mbox_controller_unregister(&msgq->mbox);
> +
> +	if (msgq->rx_ghrsc)
> +		free_irq(msgq->rx_ghrsc->irq, msgq);
> +
> +	if (msgq->tx_ghrsc)
> +		free_irq(msgq->tx_ghrsc->irq, msgq);
> +
> +	kfree(msgq->mbox.chans);
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index cb6df4eec5c2..2e13669c6363 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -8,11 +8,67 @@
>   
>   #include <linux/bitfield.h>
>   #include <linux/errno.h>
> +#include <linux/interrupt.h>
>   #include <linux/limits.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/mailbox_client.h>
>   #include <linux/types.h>
>   
> +/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
> +enum gunyah_resource_type {
> +	GUNYAH_RESOURCE_TYPE_BELL_TX	= 0,
> +	GUNYAH_RESOURCE_TYPE_BELL_RX	= 1,
> +	GUNYAH_RESOURCE_TYPE_MSGQ_TX	= 2,
> +	GUNYAH_RESOURCE_TYPE_MSGQ_RX	= 3,
> +	GUNYAH_RESOURCE_TYPE_VCPU	= 4,

The maximum value here must fit in 8 bits.  I guess
there's no risk right now of using that up, but you
use negative values in some cases elsewhere.

> +};
> +
> +struct gunyah_resource {
> +	enum gunyah_resource_type type;
> +	u64 capid;
> +	int irq;

request_irq() defines the IRQ value to be an unsigned int.

> +};
> +
> +/**
> + * Gunyah Message Queues
> + */
> +
> +#define GH_MSGQ_MAX_MSG_SIZE	240
> +
> +struct gh_msgq_tx_data {
> +	size_t length;
> +	bool push;
> +	char data[];
> +};
> +
> +struct gh_msgq_rx_data {
> +	size_t length;
> +	char data[GH_MSGQ_MAX_MSG_SIZE];
> +};
> +
> +struct gh_msgq {
> +	struct gunyah_resource *tx_ghrsc;
> +	struct gunyah_resource *rx_ghrsc;
> +
> +	/* msgq private */
> +	int last_ret; /* Linux error, not GH_STATUS_* */
> +	struct mbox_controller mbox;
> +	struct tasklet_struct txdone_tasklet;

Can the msgq_client be embedded here too?  (I don't really
know whether msgq and msgq_client are one-to one.)

> +};
> +
> +
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> +		     struct gunyah_resource *tx_ghrsc, struct gunyah_resource *rx_ghrsc);
> +void gh_msgq_remove(struct gh_msgq *msgq);

I suggested:

int gh_msgq_send(struct gh_msgq, struct gh_msgq_tx_data *data);

					-Alex

> +
> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
> +{
> +	return &msgq->mbox.chans[0];
> +}
> +
>   /******************************************************************************/
>   /* Common arch-independent definitions for Gunyah hypercalls                  */
> +
>   #define GH_CAPID_INVAL	U64_MAX
>   #define GH_VMID_ROOT_VM	0xff
>

Alex Elder Feb. 23, 2023, 9:58 p.m. UTC | #18

On 2/14/23 3:12 PM, Elliot Berman wrote:
> Add architecture-independent standard error codes, types, and macros for
> Gunyah hypercalls.
> 
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   include/linux/gunyah.h | 82 ++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 82 insertions(+)
>   create mode 100644 include/linux/gunyah.h
> 
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> new file mode 100644
> index 000000000000..59ef4c735ae8
> --- /dev/null
> +++ b/include/linux/gunyah.h
> @@ -0,0 +1,82 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _LINUX_GUNYAH_H
> +#define _LINUX_GUNYAH_H
> +
> +#include <linux/errno.h>
> +#include <linux/limits.h>
> +
> +/******************************************************************************/
> +/* Common arch-independent definitions for Gunyah hypercalls                  */
> +#define GH_CAPID_INVAL	U64_MAX
> +#define GH_VMID_ROOT_VM	0xff
> +
> +enum gh_error {
> +	GH_ERROR_OK			= 0,
> +	GH_ERROR_UNIMPLEMENTED		= -1,
> +	GH_ERROR_RETRY			= -2,

Do you expect this type to have a particular size?
Since you specify negative values, it matters, and
it's possible that this forces it to be a 4-byte value
(though I'm not sure what the rules are).  In other
words, UNIMPLEMENTED could conceivably have value 0xff
or 0xffffffff.  I'm not even sure you can tell whether
an enum is interpreted as signed or unsigned.

It's not usually a good thing to do, but this *could*
be a case where you do a typedef to represent this as
a signed value of a certain bit width.  (But don't do
that unless someone else says that's worth doing.)

					-Alex

> +
> +	GH_ERROR_ARG_INVAL		= 1,
> +	GH_ERROR_ARG_SIZE		= 2,
> +	GH_ERROR_ARG_ALIGN		= 3,
> +
> +	GH_ERROR_NOMEM			= 10,
> +
> +	GH_ERROR_ADDR_OVFL		= 20,
> +	GH_ERROR_ADDR_UNFL		= 21,
> +	GH_ERROR_ADDR_INVAL		= 22,
> +
> +	GH_ERROR_DENIED			= 30,
> +	GH_ERROR_BUSY			= 31,
> +	GH_ERROR_IDLE			= 32,
> +
> +	GH_ERROR_IRQ_BOUND		= 40,
> +	GH_ERROR_IRQ_UNBOUND		= 41,
> +
> +	GH_ERROR_CSPACE_CAP_NULL	= 50,
> +	GH_ERROR_CSPACE_CAP_REVOKED	= 51,
> +	GH_ERROR_CSPACE_WRONG_OBJ_TYPE	= 52,
> +	GH_ERROR_CSPACE_INSUF_RIGHTS	= 53,
> +	GH_ERROR_CSPACE_FULL		= 54,
> +
> +	GH_ERROR_MSGQUEUE_EMPTY		= 60,
> +	GH_ERROR_MSGQUEUE_FULL		= 61,
> +};
> +
> +/**
> + * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux error code
> + * @gh_error: Gunyah hypercall return value
> + */
> +static inline int gh_remap_error(enum gh_error gh_error)
> +{
> +	switch (gh_error) {
> +	case GH_ERROR_OK:
> +		return 0;
> +	case GH_ERROR_NOMEM:
> +		return -ENOMEM;
> +	case GH_ERROR_DENIED:
> +	case GH_ERROR_CSPACE_CAP_NULL:
> +	case GH_ERROR_CSPACE_CAP_REVOKED:
> +	case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
> +	case GH_ERROR_CSPACE_INSUF_RIGHTS:
> +	case GH_ERROR_CSPACE_FULL:
> +		return -EACCES;
> +	case GH_ERROR_BUSY:
> +	case GH_ERROR_IDLE:
> +	case GH_ERROR_IRQ_BOUND:
> +	case GH_ERROR_IRQ_UNBOUND:
> +	case GH_ERROR_MSGQUEUE_FULL:
> +	case GH_ERROR_MSGQUEUE_EMPTY:

Is an empty message queue really busy?

> +		return -EBUSY;
> +	case GH_ERROR_UNIMPLEMENTED:
> +	case GH_ERROR_RETRY:
> +		return -EOPNOTSUPP;
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +#endif

Elliot Berman Feb. 23, 2023, 11:10 p.m. UTC | #19

On 2/23/2023 1:36 PM, Alex Elder wrote:
> On 2/14/23 3:23 PM, Elliot Berman wrote:
>>
>> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
>>
>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>> ---
>>   drivers/virt/gunyah/Makefile      |   2 +-
>>   drivers/virt/gunyah/rsc_mgr.h     |  45 ++++++
>>   drivers/virt/gunyah/rsc_mgr_rpc.c | 226 ++++++++++++++++++++++++++++++
>>   include/linux/gunyah_rsc_mgr.h    |  73 ++++++++++
>>   4 files changed, 345 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
>>
>> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
>> index cc864ff5abbb..de29769f2f3f 100644
>> --- a/drivers/virt/gunyah/Makefile
>> +++ b/drivers/virt/gunyah/Makefile
>> @@ -2,5 +2,5 @@
>>   obj-$(CONFIG_GUNYAH) += gunyah.o
>> -gunyah_rsc_mgr-y += rsc_mgr.o
>> +gunyah_rsc_mgr-y += rsc_mgr.o rsc_mgr_rpc.o
>>   obj-$(CONFIG_GUNYAH) += gunyah_rsc_mgr.o
>> diff --git a/drivers/virt/gunyah/rsc_mgr.h 
>> b/drivers/virt/gunyah/rsc_mgr.h
>> index d4e799a7526f..7406237bc66d 100644
>> --- a/drivers/virt/gunyah/rsc_mgr.h
>> +++ b/drivers/virt/gunyah/rsc_mgr.h
>> @@ -74,4 +74,49 @@ struct gh_rm;
>>   int gh_rm_call(struct gh_rm *rsc_mgr, u32 message_id, void 
>> *req_buff, size_t req_buff_size,
>>           void **resp_buf, size_t *resp_buff_size);
>> +/* Message IDs: VM Management */
>> +#define GH_RM_RPC_VM_ALLOC_VMID            0x56000001
>> +#define GH_RM_RPC_VM_DEALLOC_VMID        0x56000002
>> +#define GH_RM_RPC_VM_START            0x56000004
>> +#define GH_RM_RPC_VM_STOP            0x56000005
>> +#define GH_RM_RPC_VM_RESET            0x56000006
>> +#define GH_RM_RPC_VM_CONFIG_IMAGE        0x56000009
>> +#define GH_RM_RPC_VM_INIT            0x5600000B
>> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES        0x56000020
>> +#define GH_RM_RPC_VM_GET_VMID            0x56000024
>> +
>> +struct gh_rm_vm_common_vmid_req {
>> +    __le16 vmid;
>> +    __le16 reserved0;
>> +} __packed;
>> +
>> +/* Call: VM_ALLOC */
>> +struct gh_rm_vm_alloc_vmid_resp {
>> +    __le16 vmid;
>> +    __le16 reserved0;
>> +} __packed;
>> +
>> +/* Call: VM_STOP */
>> +struct gh_rm_vm_stop_req {
>> +    __le16 vmid;
>> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP    BIT(0)
>> +    u8 flags;
>> +    u8 reserved;
>> +#define GH_RM_VM_STOP_REASON_FORCE_STOP        3
> 
> I suggested this before and you honored it.  Now I'll suggest
> it again, and ask you to do it throughout the driver.
> 
> Please separate the definitions of constant values that
> certain fields can take on from the structure definition.
> I think doing it the way you have here makes it harder to
> understand the structure definition.
> 
> You could define an anonymous enumerated type to hold
> the values meant to be held by each field.
> 

Done.

>> +    __le32 stop_reason;
>> +} __packed;
>> +
>> +/* Call: VM_CONFIG_IMAGE */
>> +struct gh_rm_vm_config_image_req {
>> +    __le16 vmid;
>> +    __le16 auth_mech;
>> +    __le32 mem_handle;
>> +    __le64 image_offset;
>> +    __le64 image_size;
>> +    __le64 dtb_offset;
>> +    __le64 dtb_size;
>> +} __packed;
>> +
>> +/* Call: GET_HYP_RESOURCES */
>> +
>>   #endif
>> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c 
>> b/drivers/virt/gunyah/rsc_mgr_rpc.c
>> new file mode 100644
>> index 000000000000..4515cdd80106
>> --- /dev/null
>> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
>> @@ -0,0 +1,226 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All 
>> rights reserved.
>> + */
>> +
>> +#include <linux/gunyah_rsc_mgr.h>
>> +
>> +#include "rsc_mgr.h"
>> +
>> +/*
>> + * Several RM calls take only a VMID as a parameter and give only 
>> standard
>> + * response back. Deduplicate boilerplate code by using this common 
>> call.
>> + */
>> +static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, 
>> u16 vmid)
>> +{
>> +    struct gh_rm_vm_common_vmid_req req_payload = {
>> +        .vmid = cpu_to_le16(vmid),
>> +    };
>> +    size_t resp_size;
>> +    void *resp;
>> +
>> +    return gh_rm_call(rm, message_id, &req_payload, 
>> sizeof(req_payload), &resp, &resp_size);
>> +}
>> +
>> +/**
>> + * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM 
>> identifier.
>> + * @rm: Handle to a Gunyah resource manager
>> + * @vmid: Use GH_VMID_INVAL or 0 to dynamically allocate a VM. A 
>> reserved VMID can
>> + *        be supplied to request allocation of a platform-defined VM.
> 
> Honestly, I'd rather just see 0 (and *not* GH_VMID_INVAL) be the
> special value to mean "dynamically allocate the VMID."  It seems
> 0 is a reserved VMID anyway, and GH_VMID_INVAL might as well be
> treated here as an invalid parameter.

Done.

> 
> Is there any definitition of which VMIDs are reserved?  Like,
> anything under 1024?

It's platform dependent. On Qualcomm platforms, VMIDs <= 63 
(QCOM_SCM_MAX_MANAGED_VMID) are reserved. Of those reserved VMIDs, 
Gunyah only allows us to allocate the "special VMs" (today: TUIVM, 
CPUSYSVM, OEMVM). Passing any value except 0, tuivm_vmid, cpusysvm_vmid, 
or oemvm_vmid returns an error.

On current non-Qualcomm platforms, there aren't any reserved VMIDs so 
passing anything but 0 returns an error.

Thanks,
Elliot

> 
> That's it on this patch for now.
> 
>                      -Alex
> 
>> + *
>> + * Returns - the allocated VMID or negative value on error
>> + */
>> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid)
>> +{
>> +    struct gh_rm_vm_common_vmid_req req_payload = { 0 };
>> +    struct gh_rm_vm_alloc_vmid_resp *resp_payload;
>> +    size_t resp_size;
>> +    void *resp;
>> +    int ret;
> 
> . . .
>

Elliot Berman Feb. 23, 2023, 11:15 p.m. UTC | #20

On 2/23/2023 2:25 AM, Srinivas Kandagatla wrote:
> 
> 
> On 23/02/2023 00:15, Elliot Berman wrote:
>>
>>
>> On 2/20/2023 5:59 AM, Srinivas Kandagatla wrote:
>>>
>>>
>>> On 14/02/2023 21:23, Elliot Berman wrote:
>>>> Gunyah message queues are a unidirectional inter-VM pipe for 
>>>> messages up
>>>> to 1024 bytes. This driver supports pairing a receiver message queue 
>>>> and
>>>> a transmitter message queue to expose a single mailbox channel.
>>>>
>>>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>>>> ---
>>>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>>>   drivers/mailbox/Makefile                   |   2 +
>>>>   drivers/mailbox/gunyah-msgq.c               | 214 
>>>> ++++++++++++++++++++
>>>>   include/linux/gunyah.h                      |  56 +++++
>>>>   4 files changed, 280 insertions(+)
>>>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>>>
>>>> diff --git a/Documentation/virt/gunyah/message-queue.rst 
>>>> b/Documentation/virt/gunyah/message-queue.rst
>>>> index 0667b3eb1ff9..082085e981e0 100644
>>>> --- a/Documentation/virt/gunyah/message-queue.rst
>>>> +++ b/Documentation/virt/gunyah/message-queue.rst
>>>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs 
>>>> (and two capability IDs).
>>>>         |               |         |                 | |               |
>>>>         |               |         |                 | |               |
>>>>         +---------------+         +-----------------+ +---------------+
>>>> +
>>>> +Gunyah message queues are exposed as mailboxes. To create the 
>>>> mailbox, create
>>>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY 
>>>> interrupt,
>>>> +all messages in the RX message queue are read and pushed via the 
>>>> `rx_callback`
>>>> +of the registered mbox_client.
>>>> +
>>>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>>>> +   :identifiers: gh_msgq_init
>>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>>> index fc9376117111..5f929bb55e9a 100644
>>>> --- a/drivers/mailbox/Makefile
>>>> +++ b/drivers/mailbox/Makefile
>>>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>>>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>>>
>>> Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not 
>>> CONFIG_GUNYAH_MBOX?
>>>
>>
>> There was some previous discussion about this:
>>
>> https://lore.kernel.org/all/2a7bb5f2-1286-b661-659a-a5037150eae8@quicinc.com/
>>
>>>> +
>>>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>>>   obj-$(CONFIG_SPRD_MBOX)       += sprd-mailbox.o
>>>> diff --git a/drivers/mailbox/gunyah-msgq.c 
>>>> b/drivers/mailbox/gunyah-msgq.c
>>>> new file mode 100644
>>>> index 000000000000..03ffaa30ce9b
>>>> --- /dev/null
>>>> +++ b/drivers/mailbox/gunyah-msgq.c
>>>> @@ -0,0 +1,214 @@
>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>> +/*
>>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All 
>>>> rights reserved.
>>>> + */
>>>> +
>>>> +#include <linux/mailbox_controller.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/interrupt.h>
>>>> +#include <linux/gunyah.h>
>>>> +#include <linux/printk.h>
>>>> +#include <linux/init.h>
>>>> +#include <linux/slab.h>
>>>> +#include <linux/wait.h>
>>>
>>> ...
>>>
>>>> +/* Fired when message queue transitions from "full" to "space 
>>>> available" to send messages */
>>>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>>>> +{
>>>> +    struct gh_msgq *msgq = data;
>>>> +
>>>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>>>> +
>>>> +    return IRQ_HANDLED;
>>>> +}
>>>> +
>>>> +/* Fired after sending message and hypercall told us there was more 
>>>> space available. */
>>>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>>>
>>> Tasklets have been long deprecated, consider using workqueues in this 
>>> particular case.
>>>
>>
>> Workqueues have higher latency and tasklets came as recommendation 
>> from Jassi. drivers/mailbox/imx-mailbox.c uses tasklets in the same way.
>>
>> I did some quick unscientific measurements of ~1000x samples. The 
>> median latency for resource manager went from 25.5 us (tasklet) to 26 
>> us (workqueue) (2% slower). The mean went from 28.7 us to 32.5 us (13% 
>> slower). Obviously, the outliers for workqueues were much more extreme.
> 
> TBH, this is expected because we are only testing resource manager, Note 
>   the advantage that you will see shifting from tasket to workqueues is 
> on overall system latencies and some drivers performance that need to 
> react to events.
> 
> please take some time to read this nice article about this 
> https://lwn.net/Articles/830964/
> 

Hmm, this article is from 2020 and there was another effort in 2007. 
Neither seems to have succeeded. I'd like to stick to same mechanisms as 
other mailbox controllers.

Jassi, do you have any preferences?

Thanks,
Elliot

Alex Elder Feb. 23, 2023, 11:55 p.m. UTC | #21

On 2/14/23 3:25 PM, Elliot Berman wrote:
> 
> Document the ioctls and usage of Gunyah VM Manager driver.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   Documentation/virt/gunyah/index.rst      |   1 +
>   Documentation/virt/gunyah/vm-manager.rst | 106 +++++++++++++++++++++++
>   2 files changed, 107 insertions(+)
>   create mode 100644 Documentation/virt/gunyah/vm-manager.rst
> 
> diff --git a/Documentation/virt/gunyah/index.rst b/Documentation/virt/gunyah/index.rst
> index 45adbbc311db..b204b85e86db 100644
> --- a/Documentation/virt/gunyah/index.rst
> +++ b/Documentation/virt/gunyah/index.rst
> @@ -7,6 +7,7 @@ Gunyah Hypervisor
>   .. toctree::
>      :maxdepth: 1
>   
> +   vm-manager
>      message-queue
>   
>   Gunyah is a Type-1 hypervisor which is independent of any OS kernel, and runs in
> diff --git a/Documentation/virt/gunyah/vm-manager.rst b/Documentation/virt/gunyah/vm-manager.rst
> new file mode 100644
> index 000000000000..c0126cfeadc7
> --- /dev/null
> +++ b/Documentation/virt/gunyah/vm-manager.rst
> @@ -0,0 +1,106 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=======================
> +Virtual Machine Manager
> +=======================
> +
> +The Gunyah Virtual Machine Manager is a Linux driver to support launching
> +virtual machines using Gunyah. It presently supports launching non-proxy
> +scheduled Linux-like virtual machines.
> +
> +Except for some basic information about the location of initial binaries,
> +most of the configuration about a Gunyah virtual machine is described in the
> +VM's devicetree. The devicetree is generated by userspace. Interacting with the
> +virtual machine is still done via the kernel and VM configuration requires some
> +of the corresponding functionality to be set up in the kernel. For instance,
> +sharing userspace memory with a VM is done via the GH_VM_SET_USER_MEM_REGION
> +ioctl. The VM itself is configured to use the memory region via the
> +devicetree.
> +
> +Sample Userspace VMM
> +====================
> +
> +A sample userspace VMM is included in samples/gunyah/ along with a minimal
> +devicetree that can be used to launch a VM. To build this sample, enable
> +CONFIG_SAMPLE_GUNYAH.
> +
> +IOCTLs and userspace VMM flows
> +==============================
> +
> +The kernel exposes a char device interface at /dev/gunyah.
> +
> +To create a VM, use the GH_CREATE_VM ioctl. A successful call will return a
> +"Gunyah VM" file descriptor.
> +
> +/dev/gunyah API Descriptions
> +----------------------------
> +
> +GH_CREATE_VM
> +~~~~~~~~~~~~
> +
> +Creates a Gunyah VM. The argument is reserved for future use and must be 0.

I wouldn't say it "must be zero".  Instead maybe say it is
is currently ignored.

> +
> +Gunyah VM API Descriptions
> +--------------------------
> +
> +GH_VM_SET_USER_MEM_REGION
> +~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +::
> +
> +  struct gh_userspace_memory_region {
> +	__u32 label;
> +	__u32 flags;
> +	__u64 guest_phys_addr;
> +	__u64 memory_size;
> +	__u64 userspace_addr;
> +  };
> +
> +This ioctl allows the user to create or delete a memory parcel for a guest
> +virtual machine. Each memory region is uniquely identified by a label;
> +attempting to create two regions with the same label is not allowed.

Must the label be unique across a single instance of Gunyah (on
a single physical machine)?  Or is it unique within a VM?  Or
something else?  (It's not universally unique, right?)

> +
> +While VMM is guest-agnostic and allows runtime addition of memory regions,
> +Linux guest virtual machines do not support accepting memory regions at runtime.
> +Thus, memory regions should be provided before starting the VM and the VM must
> +be configured to accept these at boot-up.
> +
> +The guest physical address is used by Linux kernel to check that the requested
> +user regions do not overlap and to help find the corresponding memory region
> +for calls like GH_VM_SET_DTB_CONFIG. It should be page aligned.

The physical address must be page aligned.  Can a page be anything
other than 4096 bytes?

> +
> +memory_size and userspace_addr should be page-aligned.

Not should, must, right?  (The address isn't rounded down to a
page boundary, for example.)

> +
> +The flags field of gh_userspace_memory_region accepts the following bits. All
> +other bits must be 0 and are reserved for future use. The ioctl will return
> +-EINVAL if an unsupported bit is detected.
> +
> +  - GH_MEM_ALLOW_READ/GH_MEM_ALLOW_WRITE/GH_MEM_ALLOW_EXEC sets read/write/exec
> +    permissions for the guest, respectively.
> +  - GH_MEM_LENT means that the memory will be unmapped from the host and be
> +    unaccessible by the host while the guest has the region.
> +
> +To add a memory region, call GH_VM_SET_USER_MEM_REGION with fields set as
> +described above.
> +
> +To delete a memory region, call GH_VM_SET_USER_MEM_REGION with label set to the
> +desired region and memory_size set to 0.
> +
> +GH_VM_SET_DTB_CONFIG
> +~~~~~~~~~~~~~~~~~~~~
> +
> +::
> +
> +  struct gh_vm_dtb_config {
> +	__u64 gpa;

What is "gpa"?  Guest physical address?  Gunyah pseudo address?
Can this have a longer and more descriptive name please?

> +	__u64 size;
> +  };
> +
> +This ioctl sets the location of the VM's devicetree blob and is used by Gunyah
> +Resource Manager to allocate resources. The guest physical memory should be part
> +of the primary memory parcel provided to the VM prior to GH_VM_START.

Any alignment constraints?  (If not, you could say "there are no
alignment constraints on the address or size.")

> +
> +GH_VM_START
> +~~~~~~~~~~~
> +
> +This ioctl starts the VM.

Is there anything you can say about what gets returned for
these (at least for significant cases, like permission
problems or something)?

Are IOCTLs the normal way for virtual machine mechanisms
to set up things like this?  (Noob question.)

					-Alex

Alex Elder Feb. 24, 2023, 12:15 a.m. UTC | #22

On 2/14/23 3:23 PM, Elliot Berman wrote:
> Add hypercalls to send and receive messages on a Gunyah message queue.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>   arch/arm64/gunyah/gunyah_hypercall.c | 32 ++++++++++++++++++++++++++++
>   include/linux/gunyah.h               |  7 ++++++
>   2 files changed, 39 insertions(+)
> 
> diff --git a/arch/arm64/gunyah/gunyah_hypercall.c b/arch/arm64/gunyah/gunyah_hypercall.c
> index f30d06ee80cf..2ca9ab098ff6 100644
> --- a/arch/arm64/gunyah/gunyah_hypercall.c
> +++ b/arch/arm64/gunyah/gunyah_hypercall.c
> @@ -38,6 +38,8 @@ EXPORT_SYMBOL_GPL(arch_is_gunyah_guest);
>   						   fn)
>   
>   #define GH_HYPERCALL_HYP_IDENTIFY		GH_HYPERCALL(0x8000)
> +#define GH_HYPERCALL_MSGQ_SEND			GH_HYPERCALL(0x801B)
> +#define GH_HYPERCALL_MSGQ_RECV			GH_HYPERCALL(0x801C)
>   
>   /**
>    * gh_hypercall_hyp_identify() - Returns build information and feature flags
> @@ -57,5 +59,35 @@ void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identi
>   }
>   EXPORT_SYMBOL_GPL(gh_hypercall_hyp_identify);
>   
> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int tx_flags,
> +					bool *ready)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_SEND, capid, size, buff, tx_flags, 0, &res);
> +
> +	if (res.a0 == GH_ERROR_OK)
> +		*ready = res.a1;
> +
> +	return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_send);
> +
> +enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff, size_t size, size_t *recv_size,
> +					bool *ready)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_hvc(GH_HYPERCALL_MSGQ_RECV, capid, buff, size, 0, &res);
> +
> +	if (res.a0 == GH_ERROR_OK) {
> +		*recv_size = res.a1;

Is there any chance the 64-bit size is incompatible
with size_t?  (Too big?)

> +		*ready = res.a2;

		*ready = !!res.a2;

> +	}
> +
> +	return res.a0;
> +}
> +EXPORT_SYMBOL_GPL(gh_hypercall_msgq_recv);
> +
>   MODULE_LICENSE("GPL");
>   MODULE_DESCRIPTION("Gunyah Hypervisor Hypercalls");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 3fef2854c5e1..cb6df4eec5c2 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -112,4 +112,11 @@ struct gh_hypercall_hyp_identify_resp {
>   
>   void gh_hypercall_hyp_identify(struct gh_hypercall_hyp_identify_resp *hyp_identity);
>   
> +#define GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH		BIT(0)
> +
> +enum gh_error gh_hypercall_msgq_send(u64 capid, size_t size, uintptr_t buff, int tx_flags,
> +					bool *ready);

Why uintptr_t?  Why not just pass a host pointer (void *)
and do whatever conversion is necessary inside the function?

					-Alex

> +enum gh_error gh_hypercall_msgq_recv(u64 capid, uintptr_t buff, size_t size, size_t *recv_size,
> +					bool *ready);
> +
>   #endif

Srinivas Kandagatla Feb. 24, 2023, 7:49 a.m. UTC | #23

On 23/02/2023 23:15, Elliot Berman wrote:
> 
> 
> On 2/23/2023 2:25 AM, Srinivas Kandagatla wrote:
>>
>>
>> On 23/02/2023 00:15, Elliot Berman wrote:
>>>
>>>
>>> On 2/20/2023 5:59 AM, Srinivas Kandagatla wrote:
>>>>
>>>>
>>>> On 14/02/2023 21:23, Elliot Berman wrote:
>>>>> Gunyah message queues are a unidirectional inter-VM pipe for 
>>>>> messages up
>>>>> to 1024 bytes. This driver supports pairing a receiver message 
>>>>> queue and
>>>>> a transmitter message queue to expose a single mailbox channel.
>>>>>
>>>>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>>>>> ---
>>>>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>>>>   drivers/mailbox/Makefile                   |   2 +
>>>>>   drivers/mailbox/gunyah-msgq.c               | 214 
>>>>> ++++++++++++++++++++
>>>>>   include/linux/gunyah.h                      |  56 +++++
>>>>>   4 files changed, 280 insertions(+)
>>>>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>>>>
>>>>> diff --git a/Documentation/virt/gunyah/message-queue.rst 
>>>>> b/Documentation/virt/gunyah/message-queue.rst
>>>>> index 0667b3eb1ff9..082085e981e0 100644
>>>>> --- a/Documentation/virt/gunyah/message-queue.rst
>>>>> +++ b/Documentation/virt/gunyah/message-queue.rst
>>>>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs 
>>>>> (and two capability IDs).
>>>>>         |               |         |                 | 
>>>>> |               |
>>>>>         |               |         |                 | 
>>>>> |               |
>>>>>         +---------------+         +-----------------+ 
>>>>> +---------------+
>>>>> +
>>>>> +Gunyah message queues are exposed as mailboxes. To create the 
>>>>> mailbox, create
>>>>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY 
>>>>> interrupt,
>>>>> +all messages in the RX message queue are read and pushed via the 
>>>>> `rx_callback`
>>>>> +of the registered mbox_client.
>>>>> +
>>>>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>>>>> +   :identifiers: gh_msgq_init
>>>>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>>>>> index fc9376117111..5f929bb55e9a 100644
>>>>> --- a/drivers/mailbox/Makefile
>>>>> +++ b/drivers/mailbox/Makefile
>>>>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>>>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>>>>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>>>>
>>>> Why are we reusing CONFIG_GUNYAH Kconfig symbol for mailbox, why not 
>>>> CONFIG_GUNYAH_MBOX?
>>>>
>>>
>>> There was some previous discussion about this:
>>>
>>> https://lore.kernel.org/all/2a7bb5f2-1286-b661-659a-a5037150eae8@quicinc.com/
>>>
>>>>> +
>>>>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>>>>   obj-$(CONFIG_SPRD_MBOX)       += sprd-mailbox.o
>>>>> diff --git a/drivers/mailbox/gunyah-msgq.c 
>>>>> b/drivers/mailbox/gunyah-msgq.c
>>>>> new file mode 100644
>>>>> index 000000000000..03ffaa30ce9b
>>>>> --- /dev/null
>>>>> +++ b/drivers/mailbox/gunyah-msgq.c
>>>>> @@ -0,0 +1,214 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>>> +/*
>>>>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All 
>>>>> rights reserved.
>>>>> + */
>>>>> +
>>>>> +#include <linux/mailbox_controller.h>
>>>>> +#include <linux/module.h>
>>>>> +#include <linux/interrupt.h>
>>>>> +#include <linux/gunyah.h>
>>>>> +#include <linux/printk.h>
>>>>> +#include <linux/init.h>
>>>>> +#include <linux/slab.h>
>>>>> +#include <linux/wait.h>
>>>>
>>>> ...
>>>>
>>>>> +/* Fired when message queue transitions from "full" to "space 
>>>>> available" to send messages */
>>>>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>>>>> +{
>>>>> +    struct gh_msgq *msgq = data;
>>>>> +
>>>>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>>>>> +
>>>>> +    return IRQ_HANDLED;
>>>>> +}
>>>>> +
>>>>> +/* Fired after sending message and hypercall told us there was 
>>>>> more space available. */
>>>>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>>>>
>>>> Tasklets have been long deprecated, consider using workqueues in 
>>>> this particular case.
>>>>
>>>
>>> Workqueues have higher latency and tasklets came as recommendation 
>>> from Jassi. drivers/mailbox/imx-mailbox.c uses tasklets in the same way.
>>>
>>> I did some quick unscientific measurements of ~1000x samples. The 
>>> median latency for resource manager went from 25.5 us (tasklet) to 26 
>>> us (workqueue) (2% slower). The mean went from 28.7 us to 32.5 us 
>>> (13% slower). Obviously, the outliers for workqueues were much more 
>>> extreme.
>>
>> TBH, this is expected because we are only testing resource manager, 
>> Note   the advantage that you will see shifting from tasket to 
>> workqueues is on overall system latencies and some drivers performance 
>> that need to react to events.
>>
>> please take some time to read this nice article about this 
>> https://lwn.net/Articles/830964/
>>
> 
> Hmm, this article is from 2020 and there was another effort in 2007. 
> Neither seems to have succeeded. I'd like to stick to same mechanisms as 
> other mailbox controllers.

I don't want to block this series because of this. We will have more 
opportunity to improve this once some system wide profiling is done.

AFAIU, In this system we will have atleast 2 tasklets between VM and RM 
and 2 per inter-vm, so if the number of tasklets increase in the system 
will be potentially spending more time in soft irq handling it.

At somepoint in time its good to get some profiling done using 
bcc/softirqs to see how much time is spent on softirqs.


--srini

> 
> Jassi, do you have any preferences?
> 
> Thanks,
> Elliot
> 
>

Arnd Bergmann Feb. 24, 2023, 1:20 p.m. UTC | #24

On Fri, Feb 24, 2023, at 11:29, Srinivas Kandagatla wrote:
> On 23/02/2023 22:40, Elliot Berman wrote:

>>>> Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure 
>>>> yet what arguments to add here.
>>>>
>>>> The ABI can add new "long" values to GH_CREATE_VM and that wouldn't 
>>>
>>> Sorry, that is exactly what we want to avoid, we can not change the 
>>> UAPI its going to break the userspace.
>>>
>>>> break compatibility with old kernels; old kernels reject it as -EINVAL.
>>>
>>> If you have userspace built with older kernel headers then that will 
>>> break. Am not sure about old-kernels.
>>>
>>> What exactly is the argument that you want to add to GH_CREATE_VM?
>>>
>>> If you want to keep GH_CREATE_VM with no arguments that is fine but 
>>> remove the conflicting comments in the code and document so that its 
>>> not misleading readers/reviewers that the UAPI is going to be modified 
>>> in near future.
>>>
>>>
>> 
>> The convention followed here comes from KVM_CREATE_VM. Is this ioctl 
>> considered bad example?
>> 
>
> It is recommended to only use _IO for commands without arguments, and 
> use pointers for passing data. Even though _IO can indicate either 
> commands with no argument or passing an integer value instead of a 
> pointer. Am really not sure how this works in compat case.
>
> Am sure there are tricks that can be done with just using _IO() macro 
> (ex vfio), but this does not mean that we should not use _IOW to be more 
> explicit on the type and size of argument that we are expecting.
>
> On the other hand If its really not possible to change this IOCTL to 
> _IOW and argument that you are referring would be with in integer range, 
> then what you have with _IO macro should work.

Passing an 'unsigned long' value instead of a pointer is fine for compat
mode, as a 32-bit compat_ulong_t always fits inside of the 64-bit
unsigned long. The downside is that portable code cannot have a
single ioctl handler function that takes both commands with pointers
and other commands with integer arguments, as some architectures
(i.e. s390, possibly arm64+morello in the future) need to mangle
pointer arguments using compat_ptr() but must not do that on integer
arguments.

     Arnd

Elliot Berman Feb. 24, 2023, 9:57 p.m. UTC | #25

On 2/23/2023 1:11 PM, Alex Elder wrote:
> On 2/14/23 3:23 PM, Elliot Berman wrote:
>> Gunyah message queues are a unidirectional inter-VM pipe for messages up
>> to 1024 bytes. This driver supports pairing a receiver message queue and
>> a transmitter message queue to expose a single mailbox channel.
>>
>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>> ---
>>   Documentation/virt/gunyah/message-queue.rst |   8 +
>>   drivers/mailbox/Makefile                    |   2 +
>>   drivers/mailbox/gunyah-msgq.c               | 214 ++++++++++++++++++++
>>   include/linux/gunyah.h                      |  56 +++++
>>   4 files changed, 280 insertions(+)
>>   create mode 100644 drivers/mailbox/gunyah-msgq.c
>>
>> diff --git a/Documentation/virt/gunyah/message-queue.rst 
>> b/Documentation/virt/gunyah/message-queue.rst
>> index 0667b3eb1ff9..082085e981e0 100644
>> --- a/Documentation/virt/gunyah/message-queue.rst
>> +++ b/Documentation/virt/gunyah/message-queue.rst
>> @@ -59,3 +59,11 @@ vIRQ: two TX message queues will have two vIRQs 
>> (and two capability IDs).
>>         |               |         |                 |         
>> |               |
>>         |               |         |                 |         
>> |               |
>>         +---------------+         +-----------------+         
>> +---------------+
>> +
>> +Gunyah message queues are exposed as mailboxes. To create the 
>> mailbox, create
>> +a mbox_client and call `gh_msgq_init`. On receipt of the RX_READY 
>> interrupt,
>> +all messages in the RX message queue are read and pushed via the 
>> `rx_callback`
>> +of the registered mbox_client.
>> +
>> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
>> +   :identifiers: gh_msgq_init
>> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
>> index fc9376117111..5f929bb55e9a 100644
>> --- a/drivers/mailbox/Makefile
>> +++ b/drivers/mailbox/Makefile
>> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)    += mtk-cmdq-mailbox.o
>>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)    += zynqmp-ipi-mailbox.o
>> +obj-$(CONFIG_GUNYAH)        += gunyah-msgq.o
>> +
>>   obj-$(CONFIG_SUN6I_MSGBOX)    += sun6i-msgbox.o
>>   obj-$(CONFIG_SPRD_MBOX)        += sprd-mailbox.o
>> diff --git a/drivers/mailbox/gunyah-msgq.c 
>> b/drivers/mailbox/gunyah-msgq.c
>> new file mode 100644
>> index 000000000000..03ffaa30ce9b
>> --- /dev/null
>> +++ b/drivers/mailbox/gunyah-msgq.c
> 
> You use a dash in this source file name, but an underscore
> everywhere else.  Unless there's a good reason to do this,
> please be consistent (use "gunyah_msgq.c").
> 
>> @@ -0,0 +1,214 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All 
>> rights reserved.
>> + */
>> +
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/module.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/gunyah.h>
>> +#include <linux/printk.h>
>> +#include <linux/init.h>
>> +#include <linux/slab.h>
>> +#include <linux/wait.h>
>> +
>> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct 
>> gh_msgq, mbox))
>> +
>> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
>> +{
>> +    struct gh_msgq *msgq = data;
>> +    struct gh_msgq_rx_data rx_data;
>> +    enum gh_error err;
>> +    bool ready = true;
>> +
>> +    while (ready) {
>> +        err = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
>> +                (uintptr_t)&rx_data.data, sizeof(rx_data.data),
>> +                &rx_data.length, &ready);
>> +        if (err != GH_ERROR_OK) {
>> +            if (err != GH_ERROR_MSGQUEUE_EMPTY)
> 
> Srini mentioned something about this too.  In many
> (all?) cases, there is a device pointer available,
> so you should use dev_*() functions rather than pr_*().
> 
> In this particular case, I'm not sure why/when the
> mbox.dev pointer would be null.  Also, dev_*() handles
> the case of a null device pointer, and it reports the
> device name (just as you do here).
> 
>> +                pr_warn("Failed to receive data from msgq for %s: %d\n",
>> +                    msgq->mbox.dev ? dev_name(msgq->mbox.dev) : "", 
>> err);
>> +            break;
>> +        }
>> +        mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
>> +    }
>> +
>> +    return IRQ_HANDLED;
>> +}
>> +
>> +/* Fired when message queue transitions from "full" to "space 
>> available" to send messages */
>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>> +{
>> +    struct gh_msgq *msgq = data;
>> +
>> +    mbox_chan_txdone(gh_msgq_chan(msgq), 0);
>> +
>> +    return IRQ_HANDLED;
>> +}
>> +
>> +/* Fired after sending message and hypercall told us there was more 
>> space available. */
>> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
>> +{
>> +    struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, 
>> txdone_tasklet);
>> +
>> +    mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
>> +}
>> +
>> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
>> +{
>> +    struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
>> +    struct gh_msgq_tx_data *msgq_data = data;
>> +    u64 tx_flags = 0;
>> +    enum gh_error gh_error;
> 
> Above you named the variable "err".  It helps readability
> if you use a very consistent naming convention for variables
> of a certain type when they are used a lot.
> 
>> +    bool ready;
>> +
>> +    if (msgq_data->push)
>> +        tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
>> +
>> +    gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, 
>> msgq_data->length,
>> +                    (uintptr_t)msgq_data->data, tx_flags, &ready);
>> +
>> +    /**
>> +     * unlikely because Linux tracks state of msgq and should not try to
>> +     * send message when msgq is full.
>> +     */
>> +    if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
>> +        return -EAGAIN;
>> +
>> +    /**
>> +     * Propagate all other errors to client. If we return error to 
>> mailbox
>> +     * framework, then no other messages can be sent and nobody will 
>> know
>> +     * to retry this message.
>> +     */
>> +    msgq->last_ret = gh_remap_error(gh_error);
>> +
>> +    /**
>> +     * This message was successfully sent, but message queue isn't 
>> ready to
>> +     * receive more messages because it's now full. Mailbox framework
> 
> Maybe:  s/receive/accept/
> 
>> +     * requires that we only report that message was transmitted when
>> +     * we're ready to transmit another message. We'll get that in the 
>> form
>> +     * of tx IRQ once the other side starts to drain the msgq.
>> +     */
>> +    if (gh_error == GH_ERROR_OK && !ready)
>> +        return 0;
>> +
>> +    /**
>> +     * We can send more messages. Mailbox framework requires that tx 
>> done
>> +     * happens asynchronously to sending the message. Gunyah message 
>> queues
>> +     * tell us right away on the hypercall return whether we can send 
>> more
>> +     * messages. To work around this, defer the txdone to a tasklet.
>> +     */
>> +    tasklet_schedule(&msgq->txdone_tasklet);
>> +
>> +    return 0;
>> +}
>> +
>> +static struct mbox_chan_ops gh_msgq_ops = {
>> +    .send_data = gh_msgq_send_data,
>> +};
>> +
>> +/**
>> + * gh_msgq_init() - Initialize a Gunyah message queue with an 
>> mbox_client
>> + * @parent: optional, device parent used for the mailbox controller
>> + * @msgq: Pointer to the gh_msgq to initialize
>> + * @cl: A mailbox client to bind to the mailbox channel that the 
>> message queue creates
>> + * @tx_ghrsc: optional, the transmission side of the message queue
>> + * @rx_ghrsc: optional, the receiving side of the message queue
>> + *
>> + * At least one of tx_ghrsc and rx_ghrsc should be not NULL. Most 
>> message queue use cases come with
> 
> s/should be/must be/
> 
>> + * a pair of message queues to facilitate bidirectional 
>> communication. When tx_ghrsc is set,
>> + * the client can send messages with 
>> mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
>> + * is set, the mbox_client should register an .rx_callback() and the 
>> message queue driver will
> 
> s/should register/must register/
> 
> A general comment on this code is that you sort of half define
> a Gunyah message queue API.  You define an initialization
> function and an exit function, but you also expose the fact
> that you use the mailbox framework in implementation.  This
> despite avoiding defining it as an mbox in the DTS file.
> 
> It might be hard to avoid that I guess.  But to me it would be
> nice if there were a more distinct Gunyah message queue API,
> which would provide a send_message() function, for example.
> And in that case, perhaps you would pass in the tx_done and/or
> rx_data callbacks to this function (since they're required).

I can write a wrapper for send_message, but I think it limits the code 
re-use of mailbox framework.

> 
> All that said, this is (currently?) only used by the resource
> manager, so making a beautiful API might not be that important.
> Do you envision this being used to communicate with other VMs
> in the future?
> 
>> + * push all available messages upon receiving the RX ready interrupt. 
>> The messages should be
> 
> Maybe: s/push/deliver/
> 
>> + * consumed or copied by the client right away as the gh_msgq_rx_data 
>> will be replaced/destroyed
>> + * after the callback.
>> + *
>> + * Returns - 0 on success, negative otherwise
>> + */
>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct 
>> mbox_client *cl,
>> +             struct gunyah_resource *tx_ghrsc, struct gunyah_resource 
>> *rx_ghrsc)
>> +{
>> +    int ret;
>> +
>> +    /* Must have at least a tx_ghrsc or rx_ghrsc and that they are 
>> the right device types */
>> +    if ((!tx_ghrsc && !rx_ghrsc) ||
>> +        (tx_ghrsc && tx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_TX) ||
>> +        (rx_ghrsc && rx_ghrsc->type != GUNYAH_RESOURCE_TYPE_MSGQ_RX))
>> +        return -EINVAL;
>> +
>> +    if (gh_api_version() != GUNYAH_API_V1) {
>> +        pr_err("Unrecognized gunyah version: %u. Currently supported: 
>> %d\n",
>> +            gh_api_version(), GUNYAH_API_V1);
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    if (!gh_api_has_feature(GH_API_FEATURE_MSGQUEUE))
>> +        return -EOPNOTSUPP;
> 
> Can Gunyah even function if it doesn't have the MSGQUEUE feature?
> Will there ever be a Gunyah implementation that does not support
> it?  Perhaps this test could be done in gunyah_init() instead.

I don't think we will ever have a Gunyah implementation that doesn't 
support message queues. Perhaps some long distant Gunyah will use IPC 
mechanism X instead of message queues and the message queue support is 
dropped.

> 
> For that matter, you could verify the result of gh_api_version()
> at that time also.
> 

Moved the gh_api_version() check to gunyah_init()

>> +
>> +    msgq->tx_ghrsc = tx_ghrsc;
>> +    msgq->rx_ghrsc = rx_ghrsc;
>> +
>> +    msgq->mbox.dev = parent;
>> +    msgq->mbox.ops = &gh_msgq_ops;
>> +    msgq->mbox.num_chans = 1;
>> +    msgq->mbox.txdone_irq = true;
>> +    msgq->mbox.chans = kcalloc(msgq->mbox.num_chans, 
>> sizeof(*msgq->mbox.chans), GFP_KERNEL);
> 
>  From what I can tell, you will always use exactly one mailbox channel.
> So you could just do kzalloc(sizeof()...).
> 

If it's all the same, I'd like to keep it as kcalloc because chans is 
expected to be an array with num_chans size. It seems more correct to 
use kcalloc.

>> +    if (!msgq->mbox.chans)
>> +        return -ENOMEM;
>> +
>> +    if (msgq->tx_ghrsc) {
> 
>      if (tx_ghrsc) {
> 
> The irq field is assumed to be valid.  Are there any
> sanity checks you could perform?  Again this is only
> used for the resource manager right now, so maybe
> it's OK.
> 

We should safely assume irq field is valid. If we need to be skeptical 
of irq, we'd also need to be skeptical of capid and there's not validity 
check to perform there. struct gunyah_resource's are either filled from 
DT (in this case) or would be created by resource manager which does 
validity checks.

>> +        ret = request_irq(msgq->tx_ghrsc->irq, 
>> gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
> 
>          ret = request_irq(tx_ghrsc->irq, ...
> 
> 
>> +                msgq);
>> +        if (ret)
>> +            goto err_chans;
>> +    }
>> +
>> +    if (msgq->rx_ghrsc) {
>> +        ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, 
>> gh_msgq_rx_irq_handler,
>> +                        IRQF_ONESHOT, "gh_msgq_rx", msgq);
>> +        if (ret)
>> +            goto err_tx_irq;
>> +    }
>> +
>> +    tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
>> +
>> +    ret = mbox_controller_register(&msgq->mbox);
>> +    if (ret)
>> +        goto err_rx_irq;
>> +
>> +    ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
> 
> 
>> +    if (ret)
>> +        goto err_mbox;
>> +
>> +    return 0;
>> +err_mbox:
>> +    mbox_controller_unregister(&msgq->mbox);
>> +err_rx_irq:
>> +    if (msgq->rx_ghrsc)
>> +        free_irq(msgq->rx_ghrsc->irq, msgq);
>> +err_tx_irq:
>> +    if (msgq->tx_ghrsc)
>> +        free_irq(msgq->tx_ghrsc->irq, msgq);
>> +err_chans:
>> +    kfree(msgq->mbox.chans);
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(gh_msgq_init);
>> +
>> +void gh_msgq_remove(struct gh_msgq *msgq)
>> +{
> 
> Is there any need to un-bind the client?
> 

I was leaving un-binding the client to the client (RM).

>> +    mbox_controller_unregister(&msgq->mbox);
>> +
>> +    if (msgq->rx_ghrsc)
>> +        free_irq(msgq->rx_ghrsc->irq, msgq);
>> +
>> +    if (msgq->tx_ghrsc)
>> +        free_irq(msgq->tx_ghrsc->irq, msgq);
>> +
>> +    kfree(msgq->mbox.chans);
>> +}
>> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
>> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
>> index cb6df4eec5c2..2e13669c6363 100644
>> --- a/include/linux/gunyah.h
>> +++ b/include/linux/gunyah.h
>> @@ -8,11 +8,67 @@
>>   #include <linux/bitfield.h>
>>   #include <linux/errno.h>
>> +#include <linux/interrupt.h>
>>   #include <linux/limits.h>
>> +#include <linux/mailbox_controller.h>
>> +#include <linux/mailbox_client.h>
>>   #include <linux/types.h>
>> +/* Follows resource manager's resource types for VM_GET_HYP_RESOURCES */
>> +enum gunyah_resource_type {
>> +    GUNYAH_RESOURCE_TYPE_BELL_TX    = 0,
>> +    GUNYAH_RESOURCE_TYPE_BELL_RX    = 1,
>> +    GUNYAH_RESOURCE_TYPE_MSGQ_TX    = 2,
>> +    GUNYAH_RESOURCE_TYPE_MSGQ_RX    = 3,
>> +    GUNYAH_RESOURCE_TYPE_VCPU    = 4,
> 
> The maximum value here must fit in 8 bits.  I guess
> there's no risk right now of using that up, but you
> use negative values in some cases elsewhere.
> 
>> +};
>> +
>> +struct gunyah_resource {
>> +    enum gunyah_resource_type type;
>> +    u64 capid;
>> +    int irq;
> 
> request_irq() defines the IRQ value to be an unsigned int.
> 

Done.

>> +};
>> +
>> +/**
>> + * Gunyah Message Queues
>> + */
>> +
>> +#define GH_MSGQ_MAX_MSG_SIZE    240
>> +
>> +struct gh_msgq_tx_data {
>> +    size_t length;
>> +    bool push;
>> +    char data[];
>> +};
>> +
>> +struct gh_msgq_rx_data {
>> +    size_t length;
>> +    char data[GH_MSGQ_MAX_MSG_SIZE];
>> +};
>> +
>> +struct gh_msgq {
>> +    struct gunyah_resource *tx_ghrsc;
>> +    struct gunyah_resource *rx_ghrsc;
>> +
>> +    /* msgq private */
>> +    int last_ret; /* Linux error, not GH_STATUS_* */
>> +    struct mbox_controller mbox;
>> +    struct tasklet_struct txdone_tasklet;
> 
> Can the msgq_client be embedded here too?  (I don't really
> know whether msgq and msgq_client are one-to one.)
> 

They are one-to-one. I can embed the struct in the struct gh_msgq and 
drop the kcalloc.

Thanks,
Elliot

>> +};
>> +
>> +
>> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct 
>> mbox_client *cl,
>> +             struct gunyah_resource *tx_ghrsc, struct gunyah_resource 
>> *rx_ghrsc);
>> +void gh_msgq_remove(struct gh_msgq *msgq);
> 
> I suggested:
> 
> int gh_msgq_send(struct gh_msgq, struct gh_msgq_tx_data *data);
> 
>                      -Alex
> 
>> +
>> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
>> +{
>> +    return &msgq->mbox.chans[0];
>> +}
>> +
>>   
>> /******************************************************************************/
>>   /* Common arch-independent definitions for Gunyah 
>> hypercalls                  */
>> +
>>   #define GH_CAPID_INVAL    U64_MAX
>>   #define GH_VMID_ROOT_VM    0xff
>

Elliot Berman Feb. 24, 2023, 10:48 p.m. UTC | #26

On 2/24/2023 5:20 AM, Arnd Bergmann wrote:
> On Fri, Feb 24, 2023, at 11:29, Srinivas Kandagatla wrote:
>> On 23/02/2023 22:40, Elliot Berman wrote:
> 
>>>>> Does this means adding #define GH_VM_DEFAULT_ARG 0 ? I am not sure
>>>>> yet what arguments to add here.
>>>>>
>>>>> The ABI can add new "long" values to GH_CREATE_VM and that wouldn't
>>>>
>>>> Sorry, that is exactly what we want to avoid, we can not change the
>>>> UAPI its going to break the userspace.
>>>>
>>>>> break compatibility with old kernels; old kernels reject it as -EINVAL.
>>>>
>>>> If you have userspace built with older kernel headers then that will
>>>> break. Am not sure about old-kernels.
>>>>
>>>> What exactly is the argument that you want to add to GH_CREATE_VM?
>>>>
>>>> If you want to keep GH_CREATE_VM with no arguments that is fine but
>>>> remove the conflicting comments in the code and document so that its
>>>> not misleading readers/reviewers that the UAPI is going to be modified
>>>> in near future.
>>>>
>>>>
>>>
>>> The convention followed here comes from KVM_CREATE_VM. Is this ioctl
>>> considered bad example?
>>>
>>
>> It is recommended to only use _IO for commands without arguments, and
>> use pointers for passing data. Even though _IO can indicate either
>> commands with no argument or passing an integer value instead of a
>> pointer. Am really not sure how this works in compat case.
>>
>> Am sure there are tricks that can be done with just using _IO() macro
>> (ex vfio), but this does not mean that we should not use _IOW to be more
>> explicit on the type and size of argument that we are expecting.
>>
>> On the other hand If its really not possible to change this IOCTL to
>> _IOW and argument that you are referring would be with in integer range,
>> then what you have with _IO macro should work.
> 
> Passing an 'unsigned long' value instead of a pointer is fine for compat
> mode, as a 32-bit compat_ulong_t always fits inside of the 64-bit
> unsigned long. The downside is that portable code cannot have a
> single ioctl handler function that takes both commands with pointers
> and other commands with integer arguments, as some architectures
> (i.e. s390, possibly arm64+morello in the future) need to mangle
> pointer arguments using compat_ptr() but must not do that on integer
> arguments.

Thanks Arnd for helping clarify here!

I'd be open to making GH_CREATE_VM take a struct argument today, but I 
really don't know what size or what needs to be in that struct. My hope 
is that we can get away with just an integer for future needs. If 
integer doesn't suit, then new ioctl would need to be created. I think 
there's same problem if I pick some struct today (the struct may not 
suit tomorrow and we need to create new ioctl for the new struct).

Arnd Bergmann Feb. 28, 2023, 9:19 a.m. UTC | #27

On Tue, Feb 28, 2023, at 02:06, Alex Elder wrote:
> On 2/24/23 4:48 PM, Elliot Berman wrote:
>> I'd be open to making GH_CREATE_VM take a struct argument today, but I 
>> really don't know what size or what needs to be in that struct. My hope 
>> is that we can get away with just an integer for future needs. If 
>> integer doesn't suit, then new ioctl would need to be created. I think 
>> there's same problem if I pick some struct today (the struct may not 
>> suit tomorrow and we need to create new ioctl for the new struct).
>
> I'd like someone to back me up (or tell me I'm wrong), but...
>
> I think you can still pass a void in/out pointer, which can
> be interpreted in an IOCTL-specific way, as long as it can
> be unambiguously processed.
>
> So if you passed a non-null pointer, what it referred to
> could contain a key that defines the way to interpret it.
>
> You can't take away a behavior you've once supported, but I
> *think* you can add a new behavior (with a new structure
> that identifies itself).
>
> So if that is correct, you can extend a single IOCTL.  But
> sadly I can't tell you I'm sure this is correct.

In general you are correct that the behavior of an ioctl
command can be changed by reusing a combination of inputs that
was previously prohibited. I can't think of a case where that
would be a good idea though, as this just adds more complexity
than defining a new ioctl command code.

Interface versions and multiplexed ioctl commands are
all discouraged for the same reason.

      Arnd

Elliot Berman March 1, 2023, midnight UTC | #28

On 2/23/2023 3:41 PM, Alex Elder wrote:
> On 2/14/23 3:12 PM, Elliot Berman wrote:
>> Gunyah is an open-source Type-1 hypervisor developed by Qualcomm. It
>> does not depend on any lower-privileged OS/kernel code for its core
>> functionality. This increases its security and can support a smaller
>> trusted computing based when compared to Type-2 hypervisors.
>>
>> Add documentation describing the Gunyah hypervisor and the main
>> components of the Gunyah hypervisor which are of interest to Linux
>> virtualization development.
>>
>> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>> ---
>>   Documentation/virt/gunyah/index.rst         | 113 ++++++++++++++++++++
>>   Documentation/virt/gunyah/message-queue.rst |  61 +++++++++++
>>   Documentation/virt/index.rst                |   1 +
>>   3 files changed, 175 insertions(+)
>>   create mode 100644 Documentation/virt/gunyah/index.rst
>>   create mode 100644 Documentation/virt/gunyah/message-queue.rst
>>
>> diff --git a/Documentation/virt/gunyah/index.rst 
>> b/Documentation/virt/gunyah/index.rst
>> new file mode 100644
>> index 000000000000..45adbbc311db
>> --- /dev/null
>> +++ b/Documentation/virt/gunyah/index.rst
>> @@ -0,0 +1,113 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +=================
>> +Gunyah Hypervisor
>> +=================
>> +
>> +.. toctree::
>> +   :maxdepth: 1
>> +
>> +   message-queue
>> +
>> +Gunyah is a Type-1 hypervisor which is independent of any OS kernel, 
>> and runs in
>> +a higher CPU privilege level. It does not depend on any 
>> lower-privileged operating system
>> +for its core functionality. This increases its security and can 
>> support a much smaller
>> +trusted computing base than a Type-2 hypervisor.
>> +
>> +Gunyah is an open source hypervisor. The source repo is available at
>> +https://github.com/quic/gunyah-hypervisor.
>> +
>> +Gunyah provides these following features.
>> +
>> +- Scheduling:
>> +
>> +  A scheduler for virtual CPUs (vCPUs) on physical CPUs enables 
>> time-sharing
>> +  of the CPUs. Gunyah supports two models of scheduling:
>> +
>> +    1. "Behind the back" scheduling in which Gunyah hypervisor 
>> schedules vCPUS on its own.
>> +    2. "Proxy" scheduling in which a delegated VM can donate part of 
>> one of its vCPU slice
>> +       to another VM's vCPU via a hypercall.
>> +
>> +- Memory Management:
>> +
>> +  APIs handling memory, abstracted as objects, limiting direct use of 
>> physical
>> +  addresses. Memory ownership and usage tracking of all memory under 
>> its control.
>> +  Memory partitioning between VMs is a fundamental security feature.
>> +
>> +- Interrupt Virtualization:
>> +
>> +  Uses CPU hardware interrupt virtualization capabilities. Interrupts 
>> are handled
>> +  in the hypervisor and routed to the assigned VM.
>> +
>> +- Inter-VM Communication:
>> +
>> +  There are several different mechanisms provided for communicating 
>> between VMs.
>> +
>> +- Virtual platform:
>> +
>> +  Architectural devices such as interrupt controllers and CPU timers 
>> are directly provided
>> +  by the hypervisor as well as core virtual platform devices and 
>> system APIs such as ARM PSCI.
>> +
>> +- Device Virtualization:
>> +
>> +  Para-virtualization of devices is supported using inter-VM 
>> communication.
>> +
>> +Architectures supported
>> +=======================
>> +AArch64 with a GIC
>> +
>> +Resources and Capabilities
>> +==========================
>> +
>> +Some services or resources provided by the Gunyah hypervisor are 
>> described to a virtual machine by
>> +capability IDs. For instance, inter-VM communication is performed 
>> with doorbells and message queues.
>> +Gunyah allows access to manipulate that doorbell via the capability 
>> ID. These resources are
>> +described in Linux as a struct gunyah_resource.
>> +
>> +High level management of these resources is performed by the resource 
>> manager VM. RM informs a
>> +guest VM about resources it can access through either the device tree 
>> or via guest-initiated RPC.
>> +
>> +For each virtual machine, Gunyah maintains a table of resources which 
>> can be accessed by that VM.
>> +An entry in this table is called a "capability" and VMs can only 
>> access resources via this
>> +capability table. Hence, virtual Gunyah resources are referenced by a 
>> "capability IDs" and not
>> +"resource IDs". If 2 VMs have access to the same resource, they might 
>> not be using the same
>> +capability ID to access that resource since the capability tables are 
>> independent per VM.
>> +
>> +Resource Manager
>> +================
>> +
>> +The resource manager (RM) is a privileged application VM supporting 
>> the Gunyah Hypervisor.
>> +It provides policy enforcement aspects of the virtualization system. 
>> The resource manager can
>> +be treated as an extension of the Hypervisor but is separated to its 
>> own partition to ensure
>> +that the hypervisor layer itself remains small and secure and to 
>> maintain a separation of policy
>> +and mechanism in the platform. RM runs at arm64 NS-EL1 similar to 
>> other virtual machines.
>> +
>> +Communication with the resource manager from each guest VM happens 
>> with message-queue.rst. Details
>> +about the specific messages can be found in 
>> drivers/virt/gunyah/rsc_mgr.c
>> +
>> +::
>> +
>> +  +-------+   +--------+   +--------+
>> +  |  RM   |   |  VM_A  |   |  VM_B  |
>> +  +-.-.-.-+   +---.----+   +---.----+
>> +    | |           |            |
>> +  +-.-.-----------.------------.----+
>> +  | | \==========/             |    |
>> +  |  \========================/     |
>> +  |            Gunyah               |
>> +  +---------------------------------+
>> +
>> +The source for the resource manager is available at 
>> https://github.com/quic/gunyah-resource-manager.
>> +
>> +The resource manager provides the following features:
>> +
>> +- VM lifecycle management: allocating a VM, starting VMs, destruction 
>> of VMs
>> +- VM access control policy, including memory sharing and lending
>> +- Interrupt routing configuration
>> +- Forwarding of system-level events (e.g. VM shutdown) to owner VM
>> +
>> +When booting a virtual machine which uses a devicetree such as Linux, 
>> resource manager overlays a
>> +/hypervisor node. This node can let Linux know it is running as a 
>> Gunyah guest VM,
>> +how to communicate with resource manager, and basic description and 
>> capabilities of
>> +this VM. See 
>> Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml for 
>> a description
>> +of this node.
>> diff --git a/Documentation/virt/gunyah/message-queue.rst 
>> b/Documentation/virt/gunyah/message-queue.rst
>> new file mode 100644
>> index 000000000000..0667b3eb1ff9
>> --- /dev/null
>> +++ b/Documentation/virt/gunyah/message-queue.rst
>> @@ -0,0 +1,61 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +Message Queues
>> +==============
>> +Message queue is a simple low-capacity IPC channel between two VMs. 
>> It is
>> +intended for sending small control and configuration messages. Each 
>> message
>> +queue is unidirectional, so a full-duplex IPC channel requires a pair 
>> of queues.
>> +
>> +Messages can be up to 240 bytes in length. Longer messages require a 
>> further
>> +protocol on top of the message queue messages themselves. For 
>> instance, communication
>> +with the resource manager adds a header field for sending longer 
>> messages via multiple
>> +message fragments.
>> +
>> +The diagram below shows how message queue works. A typical 
>> configuration involves
>> +2 message queues. Message queue 1 allows VM_A to send messages to 
>> VM_B. Message
>> +queue 2 allows VM_B to send messages to VM_A.
>> +
>> +1. VM_A sends a message of up to 240 bytes in length. It raises a 
>> hypercall
> 
> Can you clarify that the message being sent is in the VM's *own*
> memory/  Maybe this is clear, but the message doesn't have to (for
> example) be located in shared memory.  The original message is
> copied into message queue buffers in order to be transferred.
> 
>> +   with the message to inform the hypervisor to add the message to
>> +   message queue 1's queue.
>> +
>> +2. Gunyah raises the corresponding interrupt for VM_B (Rx vIRQ) when 
>> any of
>> +   these happens:
>> +
>> +   a. gh_msgq_send has PUSH flag. Queue is immediately flushed. This 
>> is the typical case.
> 
> Below you use gh_msgq_send() (with parentheses).  I prefer that,
> but whatever you do, do it consistently.
> 
>> +   b. Explicility with gh_msgq_push command from VM_A.
>> +   c. Message queue has reached a threshold depth.
>> +
>> +3. VM_B calls gh_msgq_recv and Gunyah copies message to requested 
>> buffer.
>> +
>> +4. Gunyah buffers messages in the queue. If the queue became full 
>> when VM_A added a message,
>> +   the return values for gh_msgq_send() include a flag that indicates 
>> the queue is full.
>> +   Once VM_B receives the message and, thus, there is space in the 
>> queue, Gunyah
>> +   will raise the Tx vIRQ on VM_A to indicate it can continue sending 
>> messages.
>> +
>> +For VM_B to send a message to VM_A, the process is identical, except 
>> that hypercalls
>> +reference message queue 2's capability ID. Each message queue has its 
>> own independent
>> +vIRQ: two TX message queues will have two vIRQs (and two capability 
>> IDs).
> 
> Can a sender determine when a message has been delivered?

Sender cannot determine when the receiving VM has processed the message.

> Does the TX vIRQ indicate only that the messaging system
> has processed the message (taken it and queued it), but
> says nothing about it being delivered/accepted/received?

That's the correct interpretation.

Thanks,
Elliot

Elliot Berman March 2, 2023, 1:40 a.m. UTC | #29

On 2/23/2023 1:58 PM, Alex Elder wrote:
> On 2/14/23 3:12 PM, Elliot Berman wrote:
>> Add architecture-independent standard error codes, types, and macros for
>> Gunyah hypercalls.
>>
>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>> ---
>>   include/linux/gunyah.h | 82 ++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 82 insertions(+)
>>   create mode 100644 include/linux/gunyah.h
>>
>> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
>> new file mode 100644
>> index 000000000000..59ef4c735ae8
>> --- /dev/null
>> +++ b/include/linux/gunyah.h
>> @@ -0,0 +1,82 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All 
>> rights reserved.
>> + */
>> +
>> +#ifndef _LINUX_GUNYAH_H
>> +#define _LINUX_GUNYAH_H
>> +
>> +#include <linux/errno.h>
>> +#include <linux/limits.h>
>> +
>> +/******************************************************************************/
>> +/* Common arch-independent definitions for Gunyah 
>> hypercalls                  */
>> +#define GH_CAPID_INVAL    U64_MAX
>> +#define GH_VMID_ROOT_VM    0xff
>> +
>> +enum gh_error {
>> +    GH_ERROR_OK            = 0,
>> +    GH_ERROR_UNIMPLEMENTED        = -1,
>> +    GH_ERROR_RETRY            = -2,
> 
> Do you expect this type to have a particular size?
> Since you specify negative values, it matters, and
> it's possible that this forces it to be a 4-byte value
> (though I'm not sure what the rules are).  In other
> words, UNIMPLEMENTED could conceivably have value 0xff
> or 0xffffffff.  I'm not even sure you can tell whether
> an enum is interpreted as signed or unsigned.

I'm not a C expert, but my understanding is that enums are signed. 
Gunyah will be returning a signed 64-bit register, however there's no 
intention to go beyond 32 bits of error codes since we want to work on 
32-bit architectures.

> 
> It's not usually a good thing to do, but this *could*
> be a case where you do a typedef to represent this as
> a signed value of a certain bit width.  (But don't do
> that unless someone else says that's worth doing.)
> 
>                      -Alex
> 
>> +
>> +    GH_ERROR_ARG_INVAL        = 1,
>> +    GH_ERROR_ARG_SIZE        = 2,
>> +    GH_ERROR_ARG_ALIGN        = 3,
>> +
>> +    GH_ERROR_NOMEM            = 10,
>> +
>> +    GH_ERROR_ADDR_OVFL        = 20,
>> +    GH_ERROR_ADDR_UNFL        = 21,
>> +    GH_ERROR_ADDR_INVAL        = 22,
>> +
>> +    GH_ERROR_DENIED            = 30,
>> +    GH_ERROR_BUSY            = 31,
>> +    GH_ERROR_IDLE            = 32,
>> +
>> +    GH_ERROR_IRQ_BOUND        = 40,
>> +    GH_ERROR_IRQ_UNBOUND        = 41,
>> +
>> +    GH_ERROR_CSPACE_CAP_NULL    = 50,
>> +    GH_ERROR_CSPACE_CAP_REVOKED    = 51,
>> +    GH_ERROR_CSPACE_WRONG_OBJ_TYPE    = 52,
>> +    GH_ERROR_CSPACE_INSUF_RIGHTS    = 53,
>> +    GH_ERROR_CSPACE_FULL        = 54,
>> +
>> +    GH_ERROR_MSGQUEUE_EMPTY        = 60,
>> +    GH_ERROR_MSGQUEUE_FULL        = 61,
>> +};
>> +
>> +/**
>> + * gh_remap_error() - Remap Gunyah hypervisor errors into a Linux 
>> error code
>> + * @gh_error: Gunyah hypercall return value
>> + */
>> +static inline int gh_remap_error(enum gh_error gh_error)
>> +{
>> +    switch (gh_error) {
>> +    case GH_ERROR_OK:
>> +        return 0;
>> +    case GH_ERROR_NOMEM:
>> +        return -ENOMEM;
>> +    case GH_ERROR_DENIED:
>> +    case GH_ERROR_CSPACE_CAP_NULL:
>> +    case GH_ERROR_CSPACE_CAP_REVOKED:
>> +    case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
>> +    case GH_ERROR_CSPACE_INSUF_RIGHTS:
>> +    case GH_ERROR_CSPACE_FULL:
>> +        return -EACCES;
>> +    case GH_ERROR_BUSY:
>> +    case GH_ERROR_IDLE:
>> +    case GH_ERROR_IRQ_BOUND:
>> +    case GH_ERROR_IRQ_UNBOUND:
>> +    case GH_ERROR_MSGQUEUE_FULL:
>> +    case GH_ERROR_MSGQUEUE_EMPTY:
> 
> Is an empty message queue really busy?
> 

Changed to -EIO.

>> +        return -EBUSY;
>> +    case GH_ERROR_UNIMPLEMENTED:
>> +    case GH_ERROR_RETRY:
>> +        return -EOPNOTSUPP;
>> +    default:
>> +        return -EINVAL;
>> +    }
>> +}
>> +
>> +#endif
>

Arnd Bergmann March 2, 2023, 7:18 a.m. UTC | #30

On Thu, Mar 2, 2023, at 02:40, Elliot Berman wrote:
> On 2/23/2023 1:58 PM, Alex Elder wrote:

>>> +enum gh_error {
>>> +    GH_ERROR_OK            = 0,
>>> +    GH_ERROR_UNIMPLEMENTED        = -1,
>>> +    GH_ERROR_RETRY            = -2,
>> 
>> Do you expect this type to have a particular size?
>> Since you specify negative values, it matters, and
>> it's possible that this forces it to be a 4-byte value
>> (though I'm not sure what the rules are).  In other
>> words, UNIMPLEMENTED could conceivably have value 0xff
>> or 0xffffffff.  I'm not even sure you can tell whether
>> an enum is interpreted as signed or unsigned.
>
> I'm not a C expert, but my understanding is that enums are signed. 
> Gunyah will be returning a signed 64-bit register, however there's no 
> intention to go beyond 32 bits of error codes since we want to work on 
> 32-bit architectures.

This came up recently because gcc-13 changes the rules.

In GNU C, the enum type will have the smallest type that fits all
values, so if it contains a negative number it ends up as a signed
type (int, long or long long), but if all values are positive and at
least one of them exceeds the signed range (e.g. UINT_MAX), it is
an unsigned type. If it contains both UINT_MAX and -1, the enum
type gets changed to a signed 64-bit type in order to fit both.

Before gcc-13, the individual constants have the smallest type
(at least 'int') that fits their value, but in gcc-13 they have
the same type as the enum type itself.

     Arnd