[v13,00/24] Drivers for Gunyah hypervisor

Message ID 20230509204801.2824351-1-quic_eberman@quicinc.com

Elliot Berman May 9, 2023, 8:47 p.m. UTC
Gunyah is a Type-1 hypervisor: it is independent of any high-level OS
kernel and runs at a higher CPU privilege level. It does not depend on
any lower-privileged OS kernel or code for its core functionality. This
improves security and allows a much smaller trusted computing base than
a Type-2 hypervisor.

Gunyah is an open source hypervisor. The source repo is available at
https://github.com/quic/gunyah-hypervisor.

The diagram below shows the architecture.

::

         VM A                    VM B
     +-----+ +-----+  | +-----+ +-----+ +-----+
     |     | |     |  | |     | |     | |     |
 EL0 | APP | | APP |  | | APP | | APP | | APP |
     |     | |     |  | |     | |     | |     |
     +-----+ +-----+  | +-----+ +-----+ +-----+
 ---------------------|-------------------------
     +--------------+ | +----------------------+
     |              | | |                      |
 EL1 | Linux Kernel | | |Linux kernel/Other OS |   ...
     |              | | |                      |
     +--------------+ | +----------------------+
 --------hvc/smc------|------hvc/smc------------
     +----------------------------------------+
     |                                        |
 EL2 |            Gunyah Hypervisor           |
     |                                        |
     +----------------------------------------+

Gunyah provides the following features.

- Threads and Scheduling: The scheduler schedules virtual CPUs (VCPUs) on
  physical CPUs and enables time-sharing of the CPUs.
- Memory Management: Gunyah tracks memory ownership and use of all memory
  under its control. Memory partitioning between VMs is a fundamental
  security feature.
- Interrupt Virtualization: All interrupts are handled in the hypervisor
  and routed to the assigned VM.
- Inter-VM Communication: There are several different mechanisms provided
  for communicating between VMs.
- Device Virtualization: Para-virtualization of devices is supported using
  inter-VM communication. Low level system features and devices such as
  interrupt controllers are supported with emulation where required.

This series adds the basic framework for detecting that Linux is running
under Gunyah as a virtual machine and for communicating with the Gunyah
Resource Manager, plus a sample virtual machine manager capable of
launching virtual machines.

The series relies on two other patches posted separately:
 - https://lore.kernel.org/all/20230213181832.3489174-1-quic_eberman@quicinc.com/
 - https://lore.kernel.org/all/20230213232537.2040976-2-quic_eberman@quicinc.com/

Changes in v13:
 - Tweaks to message queue driver to address race condition between IRQ and mailbox registration
 - Allow removal of VM functions by function-specific comparison -- specifically to allow
   removing irqfd by label only and not requiring original FD to be provided.

Changes in v12: https://lore.kernel.org/all/20230424231558.70911-1-quic_eberman@quicinc.com/
 - Stylistic/cosmetic tweaks suggested by Alex
 - Remove patch "virt: gunyah: Identify hypervisor version" and squash the
   check that we're running under a reasonable Gunyah hypervisor into RM driver
 - Refactor platform hooks into a separate module per suggestion from Srini
 - GFP_KERNEL_ACCOUNT and account_locked_vm() for page pinning
 - enum-ify related constants

Changes in v11: https://lore.kernel.org/all/20230304010632.2127470-1-quic_eberman@quicinc.com/
 - Rename struct gh_vm_dtb_config:gpa -> guest_phys_addr & overflow checks for this
 - More docstrings throughout
 - Make resp_buf and resp_buf_size optional
 - Replace deprecated idr with xarray
 - Refcounting on misc device instead of RM's platform device
 - Renaming variables, structs, etc. from gunyah_ -> gh_
 - Drop removal of user mem regions
 - Drop mem_lend functionality; to converge with restricted_memfd later

Changes in v10: https://lore.kernel.org/all/20230214211229.3239350-1-quic_eberman@quicinc.com/
 - Fix bisectability (end result of series is same, --fixups applied to wrong commits)
 - Convert GH_ERROR_* and GH_RM_ERROR_* to enums
 - Correct race condition between allocating/freeing user memory
 - Replace offsetof with struct_size
 - Series-wide renaming of functions to be more consistent
 - VM shutdown & restart support added in vCPU and VM Manager patches
 - Convert VM function name (string) to type (number)
 - Convert VM function argument to value (which could be a pointer) to remove memory wastage for arguments
 - Remove defensive checks of hypervisor correctness
 - Clean ups to ioeventfd as suggested by Srivatsa

Changes in v9: https://lore.kernel.org/all/20230120224627.4053418-1-quic_eberman@quicinc.com/
 - Refactor Gunyah API flags to be exposed as feature flags at kernel level
 - Move mbox client cleanup into gunyah_msgq_remove()
 - Simplify gh_rm_call return value and response payload
 - Missing clean-up/error handling/little endian fixes as suggested by Srivatsa and Alex in v8 series

Changes in v8: https://lore.kernel.org/all/20221219225850.2397345-1-quic_eberman@quicinc.com/
 - Treat VM manager as a library of RM
 - Add patches 21-28 as RFC to support proxy-scheduled vCPUs and necessary bits to support virtio
   from Gunyah userspace

Changes in v7: https://lore.kernel.org/all/20221121140009.2353512-1-quic_eberman@quicinc.com/
 - Refactor to remove gunyah RM bus
 - Refactor to allow multiple RM device instances
 - Bump UAPI to start at 0x0
 - Refactor QCOM SCM's platform hooks to allow CONFIG_QCOM_SCM=Y/CONFIG_GUNYAH=M combinations

Changes in v6: https://lore.kernel.org/all/20221026185846.3983888-1-quic_eberman@quicinc.com/
 - *Replace gunyah-console with gunyah VM Manager*
 - Move include/asm-generic/gunyah.h into include/linux/gunyah.h
 - s/gunyah_msgq/gh_msgq/
 - Minor tweaks and documentation tidying based on comments from Jiri, Greg, Arnd, Dmitry, and Bagas.

Changes in v5: https://lore.kernel.org/all/20221011000840.289033-1-quic_eberman@quicinc.com/
 - Dropped sysfs nodes
 - Switch from aux bus to Gunyah RM bus for the subdevices
 - Cleaning up RM console

Changes in v4: https://lore.kernel.org/all/20220928195633.2348848-1-quic_eberman@quicinc.com/
 - Tidied up documentation throughout based on questions/feedback received
 - Switched message queue implementation to use mailboxes
 - Renamed "gunyah_device" as "gunyah_resource"

Changes in v3: https://lore.kernel.org/all/20220811214107.1074343-1-quic_eberman@quicinc.com/
 - s/Maintained/Supported/ in MAINTAINERS
 - Tidied up documentation throughout based on questions/feedback received
 - Moved hypercalls into arch/arm64/gunyah/; following hyper-v's implementation
 - Drop opaque typedefs
 - Move sysfs nodes under /sys/hypervisor/gunyah/
 - Moved Gunyah console driver to drivers/tty/
 - Reworked gh_device design to drop the Gunyah bus.

Changes in v2: https://lore.kernel.org/all/20220801211240.597859-1-quic_eberman@quicinc.com/
 - DT bindings clean up
 - Switch hypercalls to follow SMCCC 

v1: https://lore.kernel.org/all/20220223233729.1571114-1-quic_eberman@quicinc.com/

Elliot Berman (24):
  dt-bindings: Add binding for gunyah hypervisor
  gunyah: Common types and error codes for Gunyah hypercalls
  virt: gunyah: Add hypercalls to identify Gunyah
  virt: gunyah: msgq: Add hypercalls to send and receive messages
  mailbox: Add Gunyah message queue mailbox
  gunyah: rsc_mgr: Add resource manager RPC core
  gunyah: rsc_mgr: Add VM lifecycle RPC
  gunyah: vm_mgr: Introduce basic VM Manager
  gunyah: rsc_mgr: Add RPC for sharing memory
  gunyah: vm_mgr: Add/remove user memory regions
  gunyah: vm_mgr: Add ioctls to support basic non-proxy VM boot
  samples: Add sample userspace Gunyah VM Manager
  gunyah: rsc_mgr: Add platform ops on mem_lend/mem_reclaim
  virt: gunyah: Add Qualcomm Gunyah platform ops
  docs: gunyah: Document Gunyah VM Manager
  virt: gunyah: Translate gh_rm_hyp_resource into gunyah_resource
  gunyah: vm_mgr: Add framework for VM Functions
  virt: gunyah: Add resource tickets
  virt: gunyah: Add IO handlers
  virt: gunyah: Add proxy-scheduled vCPUs
  virt: gunyah: Add hypercalls for sending doorbell
  virt: gunyah: Add irqfd interface
  virt: gunyah: Add ioeventfd
  MAINTAINERS: Add Gunyah hypervisor drivers section

 .../bindings/firmware/gunyah-hypervisor.yaml  |  82 ++
 .../userspace-api/ioctl/ioctl-number.rst      |   1 +
 Documentation/virt/gunyah/index.rst           |   1 +
 Documentation/virt/gunyah/message-queue.rst   |   8 +
 Documentation/virt/gunyah/vm-manager.rst      | 142 +++
 MAINTAINERS                                   |  13 +
 arch/arm64/Kbuild                             |   1 +
 arch/arm64/gunyah/Makefile                    |   3 +
 arch/arm64/gunyah/gunyah_hypercall.c          | 140 +++
 arch/arm64/include/asm/gunyah.h               |  24 +
 drivers/mailbox/Makefile                      |   2 +
 drivers/mailbox/gunyah-msgq.c                 | 212 ++++
 drivers/virt/Kconfig                          |   2 +
 drivers/virt/Makefile                         |   1 +
 drivers/virt/gunyah/Kconfig                   |  59 ++
 drivers/virt/gunyah/Makefile                  |  11 +
 drivers/virt/gunyah/gunyah_ioeventfd.c        | 130 +++
 drivers/virt/gunyah/gunyah_irqfd.c            | 180 ++++
 drivers/virt/gunyah/gunyah_platform_hooks.c   |  80 ++
 drivers/virt/gunyah/gunyah_qcom.c             | 147 +++
 drivers/virt/gunyah/gunyah_vcpu.c             | 468 +++++++++
 drivers/virt/gunyah/rsc_mgr.c                 | 910 ++++++++++++++++++
 drivers/virt/gunyah/rsc_mgr.h                 |  19 +
 drivers/virt/gunyah/rsc_mgr_rpc.c             | 500 ++++++++++
 drivers/virt/gunyah/vm_mgr.c                  | 794 +++++++++++++++
 drivers/virt/gunyah/vm_mgr.h                  |  70 ++
 drivers/virt/gunyah/vm_mgr_mm.c               | 256 +++++
 include/linux/gunyah.h                        | 207 ++++
 include/linux/gunyah_rsc_mgr.h                | 162 ++++
 include/linux/gunyah_vm_mgr.h                 | 126 +++
 include/uapi/linux/gunyah.h                   | 293 ++++++
 samples/Kconfig                               |  10 +
 samples/Makefile                              |   1 +
 samples/gunyah/.gitignore                     |   2 +
 samples/gunyah/Makefile                       |   6 +
 samples/gunyah/gunyah_vmm.c                   | 270 ++++++
 samples/gunyah/sample_vm.dts                  |  68 ++
 37 files changed, 5401 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
 create mode 100644 Documentation/virt/gunyah/vm-manager.rst
 create mode 100644 arch/arm64/gunyah/Makefile
 create mode 100644 arch/arm64/gunyah/gunyah_hypercall.c
 create mode 100644 arch/arm64/include/asm/gunyah.h
 create mode 100644 drivers/mailbox/gunyah-msgq.c
 create mode 100644 drivers/virt/gunyah/Kconfig
 create mode 100644 drivers/virt/gunyah/Makefile
 create mode 100644 drivers/virt/gunyah/gunyah_ioeventfd.c
 create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c
 create mode 100644 drivers/virt/gunyah/gunyah_platform_hooks.c
 create mode 100644 drivers/virt/gunyah/gunyah_qcom.c
 create mode 100644 drivers/virt/gunyah/gunyah_vcpu.c
 create mode 100644 drivers/virt/gunyah/rsc_mgr.c
 create mode 100644 drivers/virt/gunyah/rsc_mgr.h
 create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
 create mode 100644 drivers/virt/gunyah/vm_mgr.c
 create mode 100644 drivers/virt/gunyah/vm_mgr.h
 create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
 create mode 100644 include/linux/gunyah.h
 create mode 100644 include/linux/gunyah_rsc_mgr.h
 create mode 100644 include/linux/gunyah_vm_mgr.h
 create mode 100644 include/uapi/linux/gunyah.h
 create mode 100644 samples/gunyah/.gitignore
 create mode 100644 samples/gunyah/Makefile
 create mode 100644 samples/gunyah/gunyah_vmm.c
 create mode 100644 samples/gunyah/sample_vm.dts


base-commit: c8c655c34e33544aec9d64b660872ab33c29b5f1
prerequisite-patch-id: b48c45acdec06adf37e09fe35e6a9412c5784800
prerequisite-patch-id: bc27499c7652385c584424529edbc5781c074d68

Comments

Elliot Berman May 19, 2023, 5:02 p.m. UTC | #1
On 5/19/2023 4:59 AM, Will Deacon wrote:
> Hi Elliot,
> 
> On Tue, May 09, 2023 at 01:47:47PM -0700, Elliot Berman wrote:
>> When launching a virtual machine, Gunyah userspace allocates memory for
>> the guest and informs Gunyah about these memory regions through
>> SET_USER_MEMORY_REGION ioctl.
>>
>> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
>> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
>> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
>> ---
>>   drivers/virt/gunyah/Makefile    |   2 +-
>>   drivers/virt/gunyah/vm_mgr.c    |  59 +++++++-
>>   drivers/virt/gunyah/vm_mgr.h    |  26 ++++
>>   drivers/virt/gunyah/vm_mgr_mm.c | 236 ++++++++++++++++++++++++++++++++
>>   include/uapi/linux/gunyah.h     |  37 +++++
>>   5 files changed, 356 insertions(+), 4 deletions(-)
>>   create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
> 
> [...]
> 
>> +int gh_vm_mem_alloc(struct gh_vm *ghvm, struct gh_userspace_memory_region *region)
>> +{
>> +	struct gh_vm_mem *mapping, *tmp_mapping;
>> +	struct page *curr_page, *prev_page;
>> +	struct gh_rm_mem_parcel *parcel;
>> +	int i, j, pinned, ret = 0;
>> +	unsigned int gup_flags;
>> +	size_t entry_size;
>> +	u16 vmid;
>> +
>> +	if (!region->memory_size || !PAGE_ALIGNED(region->memory_size) ||
>> +		!PAGE_ALIGNED(region->userspace_addr) ||
>> +		!PAGE_ALIGNED(region->guest_phys_addr))
>> +		return -EINVAL;
>> +
>> +	if (overflows_type(region->guest_phys_addr + region->memory_size, u64))
>> +		return -EOVERFLOW;
>> +
>> +	ret = mutex_lock_interruptible(&ghvm->mm_lock);
>> +	if (ret)
>> +		return ret;
>> +
>> +	mapping = __gh_vm_mem_find_by_label(ghvm, region->label);
>> +	if (mapping) {
>> +		ret = -EEXIST;
>> +		goto unlock;
>> +	}
>> +
>> +	list_for_each_entry(tmp_mapping, &ghvm->memory_mappings, list) {
>> +		if (gh_vm_mem_overlap(tmp_mapping, region->guest_phys_addr,
>> +					region->memory_size)) {
>> +			ret = -EEXIST;
>> +			goto unlock;
>> +		}
>> +	}
>> +
>> +	mapping = kzalloc(sizeof(*mapping), GFP_KERNEL_ACCOUNT);
>> +	if (!mapping) {
>> +		ret = -ENOMEM;
>> +		goto unlock;
>> +	}
>> +
>> +	mapping->guest_phys_addr = region->guest_phys_addr;
>> +	mapping->npages = region->memory_size >> PAGE_SHIFT;
>> +	parcel = &mapping->parcel;
>> +	parcel->label = region->label;
>> +	parcel->mem_handle = GH_MEM_HANDLE_INVAL; /* to be filled later by mem_share/mem_lend */
>> +	parcel->mem_type = GH_RM_MEM_TYPE_NORMAL;
>> +
>> +	ret = account_locked_vm(ghvm->mm, mapping->npages, true);
>> +	if (ret)
>> +		goto free_mapping;
>> +
>> +	mapping->pages = kcalloc(mapping->npages, sizeof(*mapping->pages), GFP_KERNEL_ACCOUNT);
>> +	if (!mapping->pages) {
>> +		ret = -ENOMEM;
>> +		mapping->npages = 0; /* update npages for reclaim */
>> +		goto unlock_pages;
>> +	}
>> +
>> +	gup_flags = FOLL_LONGTERM;
>> +	if (region->flags & GH_MEM_ALLOW_WRITE)
>> +		gup_flags |= FOLL_WRITE;
>> +
>> +	pinned = pin_user_pages_fast(region->userspace_addr, mapping->npages,
>> +					gup_flags, mapping->pages);
>> +	if (pinned < 0) {
>> +		ret = pinned;
>> +		goto free_pages;
>> +	} else if (pinned != mapping->npages) {
>> +		ret = -EFAULT;
>> +		mapping->npages = pinned; /* update npages for reclaim */
>> +		goto unpin_pages;
>> +	}
> 
> Sorry if I missed it, but I still don't see where you reject file mappings
> here.
> 

Sure, I can reject file mappings. I didn't catch that was the ask 
previously and thought it was only a comment about behavior of file 
mappings.
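
For readers following along, the argument checks at the top of the quoted gh_vm_mem_alloc() (non-zero size, page alignment, no wrap-around of the guest physical range) and the overlap test in its loop can be modeled in plain userspace C. This is only a sketch under stated assumptions, not the series' code: struct model_region, model_region_check() and model_regions_overlap() are hypothetical names, the page size is assumed to be 4 KiB, the GCC/Clang builtin __builtin_add_overflow() stands in for the kernel's overflow helpers, and the half-open interval test is one common way to write an overlap check rather than the actual body of gh_vm_mem_overlap().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel uses PAGE_SIZE/PAGE_ALIGNED(). */
#define MODEL_PAGE_SIZE 4096ULL

/*
 * Hypothetical stand-in holding only the fields of
 * struct gh_userspace_memory_region that the quoted code reads.
 */
struct model_region {
	uint32_t label;
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
};

static bool model_page_aligned(uint64_t v)
{
	return (v & (MODEL_PAGE_SIZE - 1)) == 0;
}

/*
 * Mirrors the shape of the checks in the quoted function:
 * -EINVAL for an empty or unaligned region, -EOVERFLOW when
 * guest_phys_addr + memory_size does not fit in 64 bits.
 */
static int model_region_check(const struct model_region *r)
{
	uint64_t end;

	if (!r->memory_size || !model_page_aligned(r->memory_size) ||
	    !model_page_aligned(r->userspace_addr) ||
	    !model_page_aligned(r->guest_phys_addr))
		return -EINVAL;

	/* The builtin reports wrap-around instead of silently truncating. */
	if (__builtin_add_overflow(r->guest_phys_addr, r->memory_size, &end))
		return -EOVERFLOW;

	return 0;
}

/*
 * Half-open interval test: [a, a + a_len) and [b, b + b_len) overlap.
 * Both regions must already have passed model_region_check(), so the
 * additions below cannot wrap.
 */
static bool model_regions_overlap(const struct model_region *a,
				  const struct model_region *b)
{
	return a->guest_phys_addr < b->guest_phys_addr + b->memory_size &&
	       b->guest_phys_addr < a->guest_phys_addr + a->memory_size;
}
```

In this model, two regions that merely touch (one ends exactly where the next begins) do not count as overlapping, and a region whose end would wrap past 2^64 is rejected before any overlap comparison is made.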

> This is also the wrong interface for upstream. Please get involved with
> the fd-based guest memory discussions [1] and port your series to that.
> 

The user interface design for *shared* memory aligns with 
KVM_SET_USER_MEMORY_REGION.

I understood we want to use restricted memfd for giving guest-private 
memory (Gunyah calls this "lending memory"). When I went through the 
changes, I gathered KVM is using restricted memfd only for guest-private 
memory and not for shared memory. Thus, I dropped support for lending 
memory to the guest VM and only retained the shared memory support in 
this series. I'd like to merge what we can today and introduce the 
guest-private memory support in tandem with the restricted memfd; I 
don't see much reason to delay the series.

I briefly evaluated and picked the arm64/pKVM support that Fuad shared 
[2] and found it should be fine for Gunyah. I did build-only at the 
time. I don't have any comments on the base restricted_memfd support and 
Fuad has not posted [2] on mailing lists yet as far as I can tell.

> This patch cannot be merged in its current form.
> 

I am a little confused why the implementation to share memory with the 
VM is being rejected. Besides rejecting file mappings, any other changes 
needed to be accepted?

- Elliot

> Will
> 
> [1] https://lore.kernel.org/kvm/20221202061347.1070246-1-chao.p.peng@linux.intel.com/
[2]: https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/fdmem-v10-core
Alex Bennée May 24, 2023, 5:13 p.m. UTC | #2
Elliot Berman <quic_eberman@quicinc.com> writes:

> Gunyah is a Type-1 hypervisor independent of any
> high-level OS kernel, and runs in a higher CPU privilege level. It does
> not depend on any lower-privileged OS kernel/code for its core
> functionality. This increases its security and can support a much smaller
> trusted computing base than a Type-2 hypervisor.
>
<snip>
>
> The series relies on two other patches posted separately:
>  - https://lore.kernel.org/all/20230213181832.3489174-1-quic_eberman@quicinc.com/
>  - https://lore.kernel.org/all/20230213232537.2040976-2-quic_eberman@quicinc.com/

I couldn't find this one, but is this what it should have been:

  b4 am -S -t 20230213232537.2040976-1-quic_eberman@quicinc.com
  Grabbing thread from lore.kernel.org/all/20230213232537.2040976-1-quic_eberman%40quicinc.com/t.mbox.gz
  Analyzing 9 messages in the thread
  Checking attestation on all messages, may take a moment...
  ---
    ✓ [PATCH 1/3] mailbox: Allow direct registration to a channel
      + Tested-by: Sudeep Holla <sudeep.holla@arm.com>
    ✓ [PATCH 2/3] mailbox: omap: Use mbox_bind_client
      + Tested-by: Sudeep Holla <sudeep.holla@arm.com>
    ✓ [PATCH 3/3] mailbox: pcc: Use mbox_bind_client
      + Tested-by: Sudeep Holla <sudeep.holla@arm.com>
    ---
    ✓ Signed: DKIM/quicinc.com
  ---
  Total patches: 3
  ---
  Cover: ./20230213_quic_eberman_mailbox_allow_direct_registration_to_a_channel.cover
   Link: https://lore.kernel.org/r/20230213232537.2040976-1-quic_eberman@quicinc.com
   Base: base-commit 09e41676e35ab06e4bce8870ea3bf1f191c3cb90 not known, ignoring
   Base: applies clean to current tree
         git checkout -b 20230213_quic_eberman_quicinc_com HEAD
         git am ./20230213_quic_eberman_mailbox_allow_direct_registration_to_a_channel.mbx
  🕙18:10:45 alex@zen:linux.git  on  review/gunyah-v12 [$?] 
  ➜  git am 20230213_quic_eberman_mailbox_allow_direct_registration_to_a_channel.mbx 
  Applying: mailbox: Allow direct registration to a channel
  Applying: mailbox: omap: Use mbox_bind_client
  Applying: mailbox: pcc: Use mbox_bind_client


<snip>
>
> Elliot Berman (24):
<snip>

>   mailbox: Add Gunyah message queue mailbox

This patch touches a file that isn't in mainline which makes me wonder
if I've missed another pre-requisite patch?

<snip>
>  Documentation/virt/gunyah/message-queue.rst   |   8 +
<snip>
Alex Elder June 5, 2023, 7:47 p.m. UTC | #3
On 5/9/23 3:47 PM, Elliot Berman wrote:
> Gunyah is a Type-1 hypervisor independent of any
> high-level OS kernel, and runs in a higher CPU privilege level. It does
> not depend on any lower-privileged OS kernel/code for its core
> functionality. This increases its security and can support a much smaller
> trusted computing base than a Type-2 hypervisor.
> 
> Gunyah is an open source hypervisor. The source repo is available at
> https://github.com/quic/gunyah-hypervisor.

Does the kernel code in this patch series function with the
hypervisor code at the link above?  It looks like it has not
been updated in 2 years.  If not, where can one find the open
source hypervisor code that this kernel driver *does* function
with?  If it is not available now, *when* will it be published?

It's OK for the hypervisor to be closed source, but if that's
the case, the statement about it being open source should not
be made.

In addition, the Gunyah resource manager is a fundamental
component of Gunyah.  Its code appears to be here:
     https://github.com/quic/gunyah-resource-manager/
I haven't looked further on in the kernel documentation
yet but if there is a permanent place where the open source
hypervisor and resource manager code will reside, you should
link to both repositories (and anything else that might be
required) there.

Previously I reviewed the net result of all applied patches,
and did a pretty detailed review of the code.  I'm comfortable
I've previously pointed out things I thought were significant,
so this time around I'm doing less detailed review, looking at
each individual patch.  For the most part, it looks fine to me
(and in most cases I've provided a Reviewed-by tag).

I'll state once more that my review is oriented toward correct
code and good practices.  For "virtualization issues" you should
rely on others (like Will) to provide informed feedback.

					-Alex

> The series relies on two other patches posted separately:
>   - https://lore.kernel.org/all/20230213181832.3489174-1-quic_eberman@quicinc.com/

The above patch has been applied in the Qualcomm tree.

>   - https://lore.kernel.org/all/20230213232537.2040976-2-quic_eberman@quicinc.com/

And this link doesn't lead to a patch.  Has it too been
applied?  (Otherwise, it should be corrected.)

					-Alex

Alex Elder June 5, 2023, 7:47 p.m. UTC | #4
On 5/9/23 3:47 PM, Elliot Berman wrote:
> Add architecture-independent standard error codes, types, and macros for
> Gunyah hypercalls.
> 
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>

Looks OK to me.

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>   include/linux/gunyah.h | 83 ++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 83 insertions(+)
>   create mode 100644 include/linux/gunyah.h
> 
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> new file mode 100644
> index 000000000000..a4e8ec91961d
> --- /dev/null
> +++ b/include/linux/gunyah.h
> @@ -0,0 +1,83 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _LINUX_GUNYAH_H
> +#define _LINUX_GUNYAH_H
> +
> +#include <linux/errno.h>
> +#include <linux/limits.h>
> +
> +/******************************************************************************/
> +/* Common arch-independent definitions for Gunyah hypercalls                  */
> +#define GH_CAPID_INVAL	U64_MAX
> +#define GH_VMID_ROOT_VM	0xff
> +
> +enum gh_error {
> +	GH_ERROR_OK			= 0,
> +	GH_ERROR_UNIMPLEMENTED		= -1,
> +	GH_ERROR_RETRY			= -2,

I know you explained it "should be OK" to use a negative
value (with unspecified bit width) here.  I continue to
feel it's not well-enough specified for an external API,
but I'm going to try to just let it go.
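
To make the width concern concrete, here is a minimal userspace sketch. It assumes, purely for illustration, that an error value arrives in a 64-bit register; the helper names are hypothetical and nothing here is taken from the Gunyah ABI. The same register contents decode to different values depending on whether the error is treated as a 32-bit or a 64-bit signed quantity:

```c
#include <stdint.h>

/*
 * Hypothetical helper: treat only the low 32 bits of the register as a
 * signed value (assumes the usual two's-complement conversion).
 */
static inline int64_t decode_as_s32(uint64_t reg)
{
	return (int32_t)(uint32_t)reg;
}

/* Hypothetical helper: treat the whole 64-bit register as a signed value. */
static inline int64_t decode_as_s64(uint64_t reg)
{
	return (int64_t)reg;
}
```

A register holding 0x00000000ffffffff reads as -1 through the first helper and as 4294967295 through the second, so a match against GH_ERROR_UNIMPLEMENTED depends on which width both sides agree on.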

> +	GH_ERROR_ARG_INVAL		= 1,
> +	GH_ERROR_ARG_SIZE		= 2,
> +	GH_ERROR_ARG_ALIGN		= 3,
> +
> +	GH_ERROR_NOMEM			= 10,
> +
> +	GH_ERROR_ADDR_OVFL		= 20,
> +	GH_ERROR_ADDR_UNFL		= 21,
> +	GH_ERROR_ADDR_INVAL		= 22,
> +
> +	GH_ERROR_DENIED			= 30,
> +	GH_ERROR_BUSY			= 31,
> +	GH_ERROR_IDLE			= 32,
> +
> +	GH_ERROR_IRQ_BOUND		= 40,
> +	GH_ERROR_IRQ_UNBOUND		= 41,
> +
> +	GH_ERROR_CSPACE_CAP_NULL	= 50,
> +	GH_ERROR_CSPACE_CAP_REVOKED	= 51,
> +	GH_ERROR_CSPACE_WRONG_OBJ_TYPE	= 52,
> +	GH_ERROR_CSPACE_INSUF_RIGHTS	= 53,
> +	GH_ERROR_CSPACE_FULL		= 54,
> +
> +	GH_ERROR_MSGQUEUE_EMPTY		= 60,
> +	GH_ERROR_MSGQUEUE_FULL		= 61,
> +};
> +
> +/**
> + * gh_error_remap() - Remap Gunyah hypervisor errors into a Linux error code
> + * @gh_error: Gunyah hypercall return value
> + */
> +static inline int gh_error_remap(enum gh_error gh_error)
> +{
> +	switch (gh_error) {
> +	case GH_ERROR_OK:
> +		return 0;
> +	case GH_ERROR_NOMEM:
> +		return -ENOMEM;
> +	case GH_ERROR_DENIED:
> +	case GH_ERROR_CSPACE_CAP_NULL:
> +	case GH_ERROR_CSPACE_CAP_REVOKED:
> +	case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
> +	case GH_ERROR_CSPACE_INSUF_RIGHTS:
> +	case GH_ERROR_CSPACE_FULL:
> +		return -EACCES;
> +	case GH_ERROR_BUSY:
> +	case GH_ERROR_IDLE:
> +		return -EBUSY;
> +	case GH_ERROR_IRQ_BOUND:
> +	case GH_ERROR_IRQ_UNBOUND:
> +	case GH_ERROR_MSGQUEUE_FULL:
> +	case GH_ERROR_MSGQUEUE_EMPTY:
> +		return -EIO;
> +	case GH_ERROR_UNIMPLEMENTED:
> +	case GH_ERROR_RETRY:
> +		return -EOPNOTSUPP;
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +#endif
Alex Elder June 5, 2023, 7:47 p.m. UTC | #5
On 5/9/23 3:47 PM, Elliot Berman wrote:
> Gunyah VM manager is a kernel moduel which exposes an interface to
> Gunyah userspace to load, run, and interact with other Gunyah virtual
> machines. The interface is a character device at /dev/gunyah.
> 
> Add a basic VM manager driver. Upcoming patches will add more ioctls
> into this driver.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>

I have a couple of comments, but regardless of how you respond
to them:

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>   .../userspace-api/ioctl/ioctl-number.rst      |  1 +
>   drivers/virt/gunyah/Makefile                  |  2 +-
>   drivers/virt/gunyah/rsc_mgr.c                 | 50 +++++++++-
>   drivers/virt/gunyah/vm_mgr.c                  | 93 +++++++++++++++++++
>   drivers/virt/gunyah/vm_mgr.h                  | 20 ++++
>   include/uapi/linux/gunyah.h                   | 23 +++++
>   6 files changed, 187 insertions(+), 2 deletions(-)
>   create mode 100644 drivers/virt/gunyah/vm_mgr.c
>   create mode 100644 drivers/virt/gunyah/vm_mgr.h
>   create mode 100644 include/uapi/linux/gunyah.h
> 
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 176e8fc3f31b..396212e88f7d 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -137,6 +137,7 @@ Code  Seq#    Include File                                           Comments
>   'F'   DD     video/sstfb.h                                           conflict!
>   'G'   00-3F  drivers/misc/sgi-gru/grulib.h                           conflict!
>   'G'   00-0F  xen/gntalloc.h, xen/gntdev.h                            conflict!
> +'G'   00-0f  linux/gunyah.h                                          conflict!

The existing pattern throughout this file is to use capital A-F,
so I would follow that here.

Sort of related:  I prefer lower-case a-f in hexadecimal
numbers in code, and you use capitals (at least some of the
time).

>   'H'   00-7F  linux/hiddev.h                                          conflict!
>   'H'   00-0F  linux/hidraw.h                                          conflict!
>   'H'   01     linux/mei.h                                             conflict!
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 241bab357b86..e47e25895299 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,4 +1,4 @@
>   # SPDX-License-Identifier: GPL-2.0
>   
> -gunyah-y += rsc_mgr.o rsc_mgr_rpc.o
> +gunyah-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
>   obj-$(CONFIG_GUNYAH) += gunyah.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> index 88b5beb1ea51..4f6f96bdcf3d 100644
> --- a/drivers/virt/gunyah/rsc_mgr.c
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -15,8 +15,10 @@
>   #include <linux/completion.h>
>   #include <linux/gunyah_rsc_mgr.h>
>   #include <linux/platform_device.h>
> +#include <linux/miscdevice.h>
>   
>   #include "rsc_mgr.h"
> +#include "vm_mgr.h"
>   
>   #define RM_RPC_API_VERSION_MASK		GENMASK(3, 0)
>   #define RM_RPC_HEADER_WORDS_MASK	GENMASK(7, 4)
> @@ -130,6 +132,7 @@ struct gh_rm_connection {
>    * @cache: cache for allocating Tx messages
>    * @send_lock: synchronization to allow only one request to be sent at a time
>    * @nh: notifier chain for clients interested in RM notification messages
> + * @miscdev: /dev/gunyah
>    */
>   struct gh_rm {
>   	struct device *dev;
> @@ -146,6 +149,8 @@ struct gh_rm {
>   	struct kmem_cache *cache;
>   	struct mutex send_lock;
>   	struct blocking_notifier_head nh;
> +
> +	struct miscdevice miscdev;
>   };
>   
>   /**
> @@ -581,6 +586,33 @@ int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb)
>   }
>   EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
>   
> +struct device *gh_rm_get(struct gh_rm *rm)
> +{
> +	return get_device(rm->miscdev.this_device);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_get);
> +
> +void gh_rm_put(struct gh_rm *rm)
> +{
> +	put_device(rm->miscdev.this_device);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_put);
> +
> +static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> +	struct miscdevice *miscdev = filp->private_data;
> +	struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
> +
> +	return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
> +}
> +
> +static const struct file_operations gh_dev_fops = {
> +	.owner		= THIS_MODULE,
> +	.unlocked_ioctl	= gh_dev_ioctl,
> +	.compat_ioctl	= compat_ptr_ioctl,
> +	.llseek		= noop_llseek,
> +};
> +
>   static int gh_msgq_platform_probe_direction(struct platform_device *pdev, bool tx,
>   					    struct gh_resource *ghrsc)
>   {
> @@ -665,7 +697,22 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
>   	rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
>   	rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>   
> -	return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> +	ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> +	if (ret)
> +		goto err_cache;
> +
> +	rm->miscdev.name = "gunyah";
> +	rm->miscdev.minor = MISC_DYNAMIC_MINOR;
> +	rm->miscdev.fops = &gh_dev_fops;
> +
> +	ret = misc_register(&rm->miscdev);
> +	if (ret)
> +		goto err_msgq;
> +
> +	return 0;
> +err_msgq:
> +	mbox_free_channel(gh_msgq_chan(&rm->msgq));

I'm sure I've said this before.  I find it strange that you need
to call mbox_free_channel() here, when it's not obvious where the
client got bound to any mbox channel.  It seems like freeing the
channel should happen inside gh_msgq_remove().  But... perhaps
you previously explained to me why it's done this way.

> +	gh_msgq_remove(&rm->msgq);
>   err_cache:
>   	kmem_cache_destroy(rm->cache);
>   	return ret;
> @@ -675,6 +722,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
>   {
>   	struct gh_rm *rm = platform_get_drvdata(pdev);
>   
> +	misc_deregister(&rm->miscdev);
>   	mbox_free_channel(gh_msgq_chan(&rm->msgq));
>   	gh_msgq_remove(&rm->msgq);
>   	kmem_cache_destroy(rm->cache);

. . .
Alex Elder June 5, 2023, 7:48 p.m. UTC | #6
On 5/9/23 3:47 PM, Elliot Berman wrote:
> When launching a virtual machine, Gunyah userspace allocates memory for
> the guest and informs Gunyah about these memory regions through
> SET_USER_MEMORY_REGION ioctl.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>

Two minor comments below.  In any case:

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>   drivers/virt/gunyah/Makefile    |   2 +-
>   drivers/virt/gunyah/vm_mgr.c    |  59 +++++++-
>   drivers/virt/gunyah/vm_mgr.h    |  26 ++++
>   drivers/virt/gunyah/vm_mgr_mm.c | 236 ++++++++++++++++++++++++++++++++
>   include/uapi/linux/gunyah.h     |  37 +++++
>   5 files changed, 356 insertions(+), 4 deletions(-)
>   create mode 100644 drivers/virt/gunyah/vm_mgr_mm.c
> 
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index e47e25895299..bacf78b8fa33 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,4 +1,4 @@
>   # SPDX-License-Identifier: GPL-2.0
>   
> -gunyah-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
> +gunyah-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
>   obj-$(CONFIG_GUNYAH) += gunyah.o
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> index a43401cb34f7..297427952b8c 100644
> --- a/drivers/virt/gunyah/vm_mgr.c
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -15,6 +15,8 @@
>   
>   #include "vm_mgr.h"
>   
> +static void gh_vm_free(struct work_struct *work);
> +

You could just define gh_vm_free() here rather than declaring
and defining it later.

>   static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
>   {
>   	struct gh_vm *ghvm;

. . .

> diff --git a/drivers/virt/gunyah/vm_mgr_mm.c b/drivers/virt/gunyah/vm_mgr_mm.c
> new file mode 100644
> index 000000000000..91109bbf36b3
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr_mm.c
> @@ -0,0 +1,236 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/mm.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static bool pages_are_mergeable(struct page *a, struct page *b)
> +{
> +	if (page_to_pfn(a) + 1 != page_to_pfn(b))
> +		return false;
> +	if (!zone_device_pages_have_same_pgmap(a, b))
> +		return false;
> +	return true;

Maybe replace the second test and the final return with just:

	return zone_device_pages_have_same_pgmap(a, b);

> +}
> +
> +static bool gh_vm_mem_overlap(struct gh_vm_mem *a, u64 addr, u64 size)
> +{
> +	u64 a_end = a->guest_phys_addr + (a->npages << PAGE_SHIFT);
> +	u64 end = addr + size;
> +
> +	return a->guest_phys_addr < end && addr < a_end;
> +}
> +

. . .
Alex Elder June 5, 2023, 7:48 p.m. UTC | #7
On 5/9/23 3:47 PM, Elliot Berman wrote:
> Add a sample Gunyah VMM capable of launching a non-proxy scheduled VM.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>

I haven't tested this, but I trust it works.

I have some trivial comments, but otherwise:

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>   samples/Kconfig              |  10 ++
>   samples/Makefile             |   1 +
>   samples/gunyah/.gitignore    |   2 +
>   samples/gunyah/Makefile      |   6 +
>   samples/gunyah/gunyah_vmm.c  | 270 +++++++++++++++++++++++++++++++++++
>   samples/gunyah/sample_vm.dts |  68 +++++++++
>   6 files changed, 357 insertions(+)
>   create mode 100644 samples/gunyah/.gitignore
>   create mode 100644 samples/gunyah/Makefile
>   create mode 100644 samples/gunyah/gunyah_vmm.c
>   create mode 100644 samples/gunyah/sample_vm.dts
> 
> diff --git a/samples/Kconfig b/samples/Kconfig
> index b2db430bd3ff..567c7a706c01 100644
> --- a/samples/Kconfig
> +++ b/samples/Kconfig
> @@ -280,6 +280,16 @@ config SAMPLE_KMEMLEAK
>             Build a sample program which have explicitly leaks memory to test
>             kmemleak
>   
> +config SAMPLE_GUNYAH
> +	bool "Build example Gunyah Virtual Machine Manager"
> +	depends on CC_CAN_LINK && HEADERS_INSTALL
> +	depends on GUNYAH
> +	help
> +	  Build an example Gunyah VMM userspace program capable of launching
> +	  a basic virtual machine under the Gunyah hypervisor.
> +	  This demonstrates how to create a virtual machine under the Gunyah
> +	  hypervisor.

I think you can drop the second sentence above.  Perhaps adjust the
first a bit if you think the second adds anything important.

> +
>   source "samples/rust/Kconfig"
>   
>   endif # SAMPLES
> diff --git a/samples/Makefile b/samples/Makefile
> index 7727f1a0d6d1..e1b92dec169f 100644
> --- a/samples/Makefile
> +++ b/samples/Makefile
> @@ -37,3 +37,4 @@ obj-$(CONFIG_SAMPLE_KMEMLEAK)		+= kmemleak/
>   obj-$(CONFIG_SAMPLE_CORESIGHT_SYSCFG)	+= coresight/
>   obj-$(CONFIG_SAMPLE_FPROBE)		+= fprobe/
>   obj-$(CONFIG_SAMPLES_RUST)		+= rust/
> +obj-$(CONFIG_SAMPLE_GUNYAH)		+= gunyah/
> diff --git a/samples/gunyah/.gitignore b/samples/gunyah/.gitignore
> new file mode 100644
> index 000000000000..adc7d1589fde
> --- /dev/null
> +++ b/samples/gunyah/.gitignore
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0
> +/gunyah_vmm
> diff --git a/samples/gunyah/Makefile b/samples/gunyah/Makefile
> new file mode 100644
> index 000000000000..faf14f9bb337
> --- /dev/null
> +++ b/samples/gunyah/Makefile
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +userprogs-always-y += gunyah_vmm
> +dtb-y += sample_vm.dtb
> +
> +userccflags += -I usr/include
> diff --git a/samples/gunyah/gunyah_vmm.c b/samples/gunyah/gunyah_vmm.c
> new file mode 100644
> index 000000000000..d0eb49e86372
> --- /dev/null
> +++ b/samples/gunyah/gunyah_vmm.c
> @@ -0,0 +1,270 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.

Update the copyright year (other new files in this series say 2022-2023).

> + */
> +
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <sys/ioctl.h>
> +#include <getopt.h>
> +#include <limits.h>
> +#include <stdint.h>
> +#include <fcntl.h>
> +#include <string.h>
> +#include <sys/sysmacros.h>
> +#define __USE_GNU
> +#include <sys/mman.h>
> +
> +#include <linux/gunyah.h>
> +
> +struct vm_config {
> +	int image_fd;
> +	int dtb_fd;
> +	int ramdisk_fd;
> +
> +	uint64_t guest_base;
> +	uint64_t guest_size;
> +
> +	uint64_t image_offset;
> +	off_t image_size;
> +	uint64_t dtb_offset;
> +	off_t dtb_size;
> +	uint64_t ramdisk_offset;
> +	off_t ramdisk_size;
> +};
> +
> +static struct option options[] = {
> +	{ "help", no_argument, NULL, 'h' },
> +	{ "image", required_argument, NULL, 'i' },
> +	{ "dtb", required_argument, NULL, 'd' },
> +	{ "ramdisk", optional_argument, NULL, 'r' },
> +	{ "base", optional_argument, NULL, 'B' },
> +	{ "size", optional_argument, NULL, 'S' },
> +	{ "image_offset", optional_argument, NULL, 'I' },
> +	{ "dtb_offset", optional_argument, NULL, 'D' },
> +	{ "ramdisk_offset", optional_argument, NULL, 'R' },
> +	{ }
> +};
> +
> +static void print_help(char *cmd)
> +{
> +	printf("gunyah_vmm, a sample tool to launch Gunyah VMs\n"
> +	       "Usage: %s <options>\n"
> +	       "       --help,    -h  this menu\n"
> +	       "       --image,   -i <image> VM image file to load (e.g. a kernel Image) [Required]\n"
> +	       "       --dtb,     -d <dtb>   Devicetree file to load [Required]\n"
> +	       "       --ramdisk, -r <ramdisk>  Ramdisk file to load\n"
> +	       "       --base,    -B <address>  Set the base address of guest's memory [Default: 0x80000000]\n"
> +	       "       --size,    -S <number>   The number of bytes large to make the guest's memory [Default: 0x6400000 (100 MB)]\n"
> +	       "       --image_offset, -I <number>  Offset into guest memory to load the VM image file [Default: 0x10000]\n"
> +	       "        --dtb_offset,  -D <number>  Offset into guest memory to load the DTB [Default: 0]\n"
> +	       "        --ramdisk_offset, -R <number>  Offset into guest memory to load a ramdisk [Default: 0x4600000]\n"
> +	       , cmd);

You could define the default values above with symbolic constants,
and print them with 0x%08x in the messages above (or something
similar).
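
A minimal sketch of that idea follows. The macro names and the format_defaults() helper are hypothetical; the values are copied from the initializer in main(). Note that the help text in the patch currently disagrees with that initializer (it advertises 0x10000 for the image offset and 0 for the DTB offset, while the code uses 0 and 0x45f0000), which is the kind of drift a single set of constants would avoid.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names; values copied from the initializer in main(). */
#define DEFAULT_GUEST_BASE	UINT64_C(0x80000000)
#define DEFAULT_GUEST_SIZE	UINT64_C(0x6400000)	/* 100 MB */
#define DEFAULT_IMAGE_OFFSET	UINT64_C(0)
#define DEFAULT_DTB_OFFSET	UINT64_C(0x45f0000)
#define DEFAULT_RAMDISK_OFFSET	UINT64_C(0x4600000)

/* Format the defaults into buf; returns what snprintf() returns. */
int format_defaults(char *buf, size_t len)
{
	return snprintf(buf, len,
			"  --base            [Default: 0x%08" PRIx64 "]\n"
			"  --size            [Default: 0x%08" PRIx64 "]\n"
			"  --image_offset    [Default: 0x%08" PRIx64 "]\n"
			"  --dtb_offset      [Default: 0x%08" PRIx64 "]\n"
			"  --ramdisk_offset  [Default: 0x%08" PRIx64 "]\n",
			DEFAULT_GUEST_BASE, DEFAULT_GUEST_SIZE,
			DEFAULT_IMAGE_OFFSET, DEFAULT_DTB_OFFSET,
			DEFAULT_RAMDISK_OFFSET);
}
```

The same macros would then also be used in the struct vm_config initializer, so the printed defaults cannot go stale.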

> +}
> +
> +int main(int argc, char **argv)
> +{
> +	int gunyah_fd, vm_fd, guest_fd;
> +	struct gh_userspace_memory_region guest_mem_desc = { 0 };
> +	struct gh_vm_dtb_config dtb_config = { 0 };
> +	char *guest_mem;
> +	struct vm_config config = {
> +		/* Defaults good enough to boot static kernel and a basic ramdisk */
> +		.ramdisk_fd = -1,
> +		.guest_base = 0x80000000,
> +		.guest_size = 0x6400000, /* 100 MB */
> +		.image_offset = 0,
> +		.dtb_offset = 0x45f0000,
> +		.ramdisk_offset = 0x4600000, /* put at +70MB (30MB for ramdisk) */
> +	};
> +	struct stat st;
> +	int opt, optidx, ret = 0;
> +	long l;
> +
> +	while ((opt = getopt_long(argc, argv, "hi:d:r:B:S:I:D:R:c:", options, &optidx)) != -1) {
> +		switch (opt) {
> +		case 'i':
> +			config.image_fd = open(optarg, O_RDONLY | O_CLOEXEC);
> +			if (config.image_fd < 0) {
> +				perror("Failed to open image");
> +				return -1;
> +			}
> +			if (stat(optarg, &st) < 0) {
> +				perror("Failed to stat image");
> +				return -1;
> +			}
> +			config.image_size = st.st_size;
> +			break;
> +		case 'd':
> +			config.dtb_fd = open(optarg, O_RDONLY | O_CLOEXEC);
> +			if (config.dtb_fd < 0) {
> +				perror("Failed to open dtb");
> +				return -1;
> +			}
> +			if (stat(optarg, &st) < 0) {
> +				perror("Failed to stat dtb");
> +				return -1;
> +			}
> +			config.dtb_size = st.st_size;
> +			break;
> +		case 'r':
> +			config.ramdisk_fd = open(optarg, O_RDONLY | O_CLOEXEC);
> +			if (config.ramdisk_fd < 0) {
> +				perror("Failed to open ramdisk");
> +				return -1;
> +			}
> +			if (stat(optarg, &st) < 0) {
> +				perror("Failed to stat ramdisk");
> +				return -1;
> +			}
> +			config.ramdisk_size = st.st_size;
> +			break;
> +		case 'B':
> +			l = strtol(optarg, NULL, 0);
> +			if (l == LONG_MIN) {
> +				perror("Failed to parse base address");
> +				return -1;
> +			}
> +			config.guest_base = l;
> +			break;
> +		case 'S':
> +			l = strtol(optarg, NULL, 0);
> +			if (l == LONG_MIN) {
> +				perror("Failed to parse memory size");
> +				return -1;
> +			}
> +			config.guest_size = l;
> +			break;
> +		case 'I':
> +			l = strtol(optarg, NULL, 0);
> +			if (l == LONG_MIN) {
> +				perror("Failed to parse image offset");
> +				return -1;
> +			}
> +			config.image_offset = l;
> +			break;
> +		case 'D':
> +			l = strtol(optarg, NULL, 0);
> +			if (l == LONG_MIN) {
> +				perror("Failed to parse dtb offset");
> +				return -1;
> +			}
> +			config.dtb_offset = l;
> +			break;
> +		case 'R':
> +			l = strtol(optarg, NULL, 0);
> +			if (l == LONG_MIN) {
> +				perror("Failed to parse ramdisk offset");
> +				return -1;
> +			}
> +			config.ramdisk_offset = l;
> +			break;
> +		case 'h':
> +			print_help(argv[0]);
> +			return 0;
> +		default:
> +			print_help(argv[0]);
> +			return -1;
> +		}
> +	}
> +
> +	if (!config.image_fd || !config.dtb_fd) {

I *think* it's possible to have 0 be assigned as config.image_fd
if STDIN is closed when this is run.  I might be wrong though, it's
been quite a while...  In any case, to guarantee this works correctly
these should be set to -1 (as you do for ramdisk_fd).
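
For what it's worth, POSIX requires open() to return the lowest-numbered descriptor that is not currently open, so 0 is a valid result once stdin is closed. The sketch below illustrates that, together with a check that stays correct when the descriptors start out as -1; both function names are hypothetical and not part of the patch.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Close stdin, then open path. Because open() hands out the lowest unused
 * descriptor, a successful call here returns 0.
 */
int open_with_stdin_closed(const char *path)
{
	close(STDIN_FILENO);
	return open(path, O_RDONLY);
}

/* With both fds initialized to -1, "option not given" is unambiguous. */
int required_fds_missing(int image_fd, int dtb_fd)
{
	return image_fd < 0 || dtb_fd < 0;
}
```

With this shape, a legitimately opened descriptor 0 is accepted, and only a missing --image or --dtb option triggers the help text.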

> +		print_help(argv[0]);
> +		return -1;
> +	}
> +
> +	if (config.image_offset + config.image_size > config.guest_size) {
> +		fprintf(stderr, "Image offset and size puts it outside guest memory. Make image smaller or increase guest memory size.\n");
> +		return -1;
> +	}
> +
> +	if (config.dtb_offset + config.dtb_size > config.guest_size) {
> +		fprintf(stderr, "DTB offset and size puts it outside guest memory. Make dtb smaller or increase guest memory size.\n");
> +		return -1;
> +	}
> +
> +	if (config.ramdisk_fd == -1 &&
> +		config.ramdisk_offset + config.ramdisk_size > config.guest_size) {
> +		fprintf(stderr, "Ramdisk offset and size puts it outside guest memory. Make ramdisk smaller or increase guest memory size.\n");
> +		return -1;
> +	}
> +
> +	gunyah_fd = open("/dev/gunyah", O_RDWR | O_CLOEXEC);
> +	if (gunyah_fd < 0) {
> +		perror("Failed to open /dev/gunyah");
> +		return -1;
> +	}
> +
> +	vm_fd = ioctl(gunyah_fd, GH_CREATE_VM, 0);
> +	if (vm_fd < 0) {
> +		perror("Failed to create vm");
> +		return -1;
> +	}
> +
> +	guest_fd = memfd_create("guest_memory", MFD_CLOEXEC);
> +	if (guest_fd < 0) {
> +		perror("Failed to create guest memfd");
> +		return -1;
> +	}
> +
> +	if (ftruncate(guest_fd, config.guest_size) < 0) {
> +		perror("Failed to grow guest memory");
> +		return -1;
> +	}
> +
> +	guest_mem = mmap(NULL, config.guest_size, PROT_READ | PROT_WRITE, MAP_SHARED, guest_fd, 0);
> +	if (guest_mem == MAP_FAILED) {
> +		perror("Not enough memory");
> +		return -1;
> +	}
> +
> +	if (read(config.image_fd, guest_mem + config.image_offset, config.image_size) < 0) {
> +		perror("Failed to read image into guest memory");
> +		return -1;
> +	}
> +
> +	if (read(config.dtb_fd, guest_mem + config.dtb_offset, config.dtb_size) < 0) {
> +		perror("Failed to read dtb into guest memory");
> +		return -1;
> +	}
> +
> +	if (config.ramdisk_fd > 0 &&
> +		read(config.ramdisk_fd, guest_mem + config.ramdisk_offset,
> +			config.ramdisk_size) < 0) {
> +		perror("Failed to read ramdisk into guest memory");
> +		return -1;
> +	}
> +
> +	guest_mem_desc.label = 0;
> +	guest_mem_desc.flags = GH_MEM_ALLOW_READ | GH_MEM_ALLOW_WRITE | GH_MEM_ALLOW_EXEC;
> +	guest_mem_desc.guest_phys_addr = config.guest_base;
> +	guest_mem_desc.memory_size = config.guest_size;
> +	guest_mem_desc.userspace_addr = (__u64)guest_mem;
> +
> +	if (ioctl(vm_fd, GH_VM_SET_USER_MEM_REGION, &guest_mem_desc) < 0) {
> +		perror("Failed to register guest memory with VM");
> +		return -1;
> +	}
> +
> +	dtb_config.guest_phys_addr = config.guest_base + config.dtb_offset;
> +	dtb_config.size = config.dtb_size;
> +	if (ioctl(vm_fd, GH_VM_SET_DTB_CONFIG, &dtb_config) < 0) {
> +		perror("Failed to set DTB configuration for VM");
> +		return -1;
> +	}
> +
> +	ret = ioctl(vm_fd, GH_VM_START);
> +	if (ret) {
> +		perror("GH_VM_START failed");
> +		return -1;
> +	}
> +
> +	while (1)
> +		sleep(10);

Maybe call pause() instead of sleep?

> +
> +	return 0;
> +}
> diff --git a/samples/gunyah/sample_vm.dts b/samples/gunyah/sample_vm.dts
> new file mode 100644
> index 000000000000..293bbc0469c8
> --- /dev/null
> +++ b/samples/gunyah/sample_vm.dts
> @@ -0,0 +1,68 @@
> +// SPDX-License-Identifier: BSD-3-Clause
> +/*
> + * Copyright (c) 2022 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +/dts-v1/;
> +
> +/ {
> +	#address-cells = <2>;
> +	#size-cells = <2>;
> +	interrupt-parent = <&intc>;
> +
> +	chosen {
> +		bootargs = "nokaslr";
> +	};
> +
> +	cpus {
> +		#address-cells = <0x2>;
> +		#size-cells = <0>;
> +
> +		cpu@0 {
> +			device_type = "cpu";
> +			compatible = "arm,armv8";
> +			reg = <0 0>;
> +		};
> +	};
> +
> +	intc: interrupt-controller@3FFF0000 {
> +		compatible = "arm,gic-v3";
> +		#interrupt-cells = <3>;
> +		#address-cells = <2>;
> +		#size-cells = <2>;
> +		interrupt-controller;
> +		reg = <0 0x3FFF0000 0 0x10000>,
> +		      <0 0x3FFD0000 0 0x20000>;
> +	};
> +
> +	timer {
> +		compatible = "arm,armv8-timer";
> +		always-on;
> +		interrupts = <1 13 0x108>,
> +			     <1 14 0x108>,
> +			     <1 11 0x108>,
> +			     <1 10 0x108>;
> +		clock-frequency = <19200000>;
> +	};
> +
> +	gunyah-vm-config {
> +		image-name = "linux_vm_0";
> +
> +		memory {
> +			#address-cells = <2>;
> +			#size-cells = <2>;
> +
> +			base-address = <0 0x80000000>;
> +		};
> +
> +		interrupts {
> +			config = <&intc>;
> +		};
> +
> +		vcpus {
> +			affinity-map = < 0 >;
> +			sched-priority = < (-1) >;
> +			sched-timeslice = < 2000 >;
> +		};
> +	};
> +};
Alex Elder June 5, 2023, 7:48 p.m. UTC | #8
On 5/9/23 3:47 PM, Elliot Berman wrote:
> Qualcomm platforms have a firmware entity which performs access control
> to physical pages. Dynamically started Gunyah virtual machines use the
> QCOM_SCM_RM_MANAGED_VMID for access. Linux thus needs to assign access
> to the memory used by guest VMs. Gunyah doesn't do this operation for us
> since it is the current VM (typically VMID_HLOS) delegating the access
> and not Gunyah itself. Use the Gunyah platform ops to achieve this so
> that only Qualcomm platforms attempt to make the needed SCM calls.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>

Minor suggestions below.  Please consider them, but either way:

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>   drivers/virt/gunyah/Kconfig       |  13 +++
>   drivers/virt/gunyah/Makefile      |   1 +
>   drivers/virt/gunyah/gunyah_qcom.c | 147 ++++++++++++++++++++++++++++++
>   3 files changed, 161 insertions(+)
>   create mode 100644 drivers/virt/gunyah/gunyah_qcom.c
> 
> diff --git a/drivers/virt/gunyah/Kconfig b/drivers/virt/gunyah/Kconfig
> index de815189dab6..0421b751aad4 100644
> --- a/drivers/virt/gunyah/Kconfig
> +++ b/drivers/virt/gunyah/Kconfig
> @@ -5,6 +5,7 @@ config GUNYAH
>   	depends on ARM64
>   	depends on MAILBOX
>   	select GUNYAH_PLATFORM_HOOKS
> +	imply GUNYAH_QCOM_PLATFORM if ARCH_QCOM
>   	help
>   	  The Gunyah drivers are the helper interfaces that run in a guest VM
>   	  such as basic inter-VM IPC and signaling mechanisms, and higher level
> @@ -15,3 +16,15 @@ config GUNYAH
>   
>   config GUNYAH_PLATFORM_HOOKS
>   	tristate
> +
> +config GUNYAH_QCOM_PLATFORM
> +	tristate "Support for Gunyah on Qualcomm platforms"
> +	depends on GUNYAH
> +	select GUNYAH_PLATFORM_HOOKS
> +	select QCOM_SCM
> +	help
> +	  Enable support for interacting with Gunyah on Qualcomm
> +	  platforms. Interaction with Qualcomm firmware requires
> +	  extra platform-specific support.
> +
> +	  Say Y/M here to use Gunyah on Qualcomm platforms.
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 4fbeee521d60..2aa9ff038ed0 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,6 +1,7 @@
>   # SPDX-License-Identifier: GPL-2.0
>   
>   obj-$(CONFIG_GUNYAH_PLATFORM_HOOKS) += gunyah_platform_hooks.o
> +obj-$(CONFIG_GUNYAH_QCOM_PLATFORM) += gunyah_qcom.o
>   
>   gunyah-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o vm_mgr_mm.o
>   obj-$(CONFIG_GUNYAH) += gunyah.o
> diff --git a/drivers/virt/gunyah/gunyah_qcom.c b/drivers/virt/gunyah/gunyah_qcom.c
> new file mode 100644
> index 000000000000..18acbda8fcbd
> --- /dev/null
> +++ b/drivers/virt/gunyah/gunyah_qcom.c
> @@ -0,0 +1,147 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/arm-smccc.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/module.h>
> +#include <linux/firmware/qcom/qcom_scm.h>
> +#include <linux/types.h>
> +#include <linux/uuid.h>
> +
> +#define QCOM_SCM_RM_MANAGED_VMID	0x3A
> +#define QCOM_SCM_MAX_MANAGED_VMID	0x3F

Is this limited to 63 because there are at most 64 VMIDs
that can be represented in a 64-bit unsigned?

> +
> +static int qcom_scm_gh_rm_pre_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> +	struct qcom_scm_vmperm *new_perms;
> +	u64 src, src_cpy;
> +	int ret = 0, i, n;
> +	u16 vmid;
> +
> +	new_perms = kcalloc(mem_parcel->n_acl_entries, sizeof(*new_perms), GFP_KERNEL);
> +	if (!new_perms)
> +		return -ENOMEM;
> +
> +	for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> +		vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> +		if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> +			new_perms[n].vmid = vmid;
> +		else
> +			new_perms[n].vmid = QCOM_SCM_RM_MANAGED_VMID;

So any out-of-range VM ID will cause the hunk of memory to
be assigned to the resource manager.  Is it expected that
this can occur (and not be an error)?

> +		if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_X)
> +			new_perms[n].perm |= QCOM_SCM_PERM_EXEC;
> +		if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_W)
> +			new_perms[n].perm |= QCOM_SCM_PERM_WRITE;
> +		if (mem_parcel->acl_entries[n].perms & GH_RM_ACL_R)
> +			new_perms[n].perm |= QCOM_SCM_PERM_READ;
> +	}
> +
> +	src = (1ull << QCOM_SCM_VMID_HLOS);

	src = BIT_ULL(QCOM_SCM_VMID_HLOS);

> +
> +	for (i = 0; i < mem_parcel->n_mem_entries; i++) {
> +		src_cpy = src;
> +		ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].phys_addr),
> +						le64_to_cpu(mem_parcel->mem_entries[i].size),
> +						&src_cpy, new_perms, mem_parcel->n_acl_entries);

Loops like this can look simpler if you jump to error handling
at the end that does this unwind activity, rather than incorporating
it inside the loop itself.  Or even just breaking if ret != 0, e.g.:

		if (ret)
			break;
	}

	if (!ret)
		return 0;

	/* And do the following block here, "outdented" twice */
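
For illustration only, the shape suggested above looks roughly like
this in a standalone form. assign(), unassign(), fail_at and
assigned[] are hypothetical stand-ins for qcom_scm_assign_mem() and
its side effects; this is a sketch of the control flow, not a
replacement for the function.

```c
/* Hypothetical stand-ins: assign() fails on entry fail_at, unassign() undoes it. */
static int fail_at = -1;
static int assigned[8];

static int assign(int i)
{
	if (i == fail_at)
		return -1;
	assigned[i] = 1;
	return 0;
}

static void unassign(int i)
{
	assigned[i] = 0;
}

/* Share n entries; on failure, undo the ones that already succeeded. */
static int share_all(int n)
{
	int ret = 0, i;

	for (i = 0; i < n; i++) {
		ret = assign(i);
		if (ret)
			break;
	}

	if (!ret)
		return 0;

	/* Unwind in reverse order, skipping the entry that failed. */
	for (i--; i >= 0; i--)
		unassign(i);

	return ret;
}
```

In the real function both paths would still need to reach the
kfree(new_perms) at the end.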

> +		if (ret) {
> +			src = 0;
> +			for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> +				vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> +				if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> +					src |= (1ull << vmid);

	src |= BIT_ULL(vmid);

> +				else
> +					src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);

	src |= BIT_ULL(QCOM_SCM_RM_MANAGED_VMID);

> +			}
> +
> +			new_perms[0].vmid = QCOM_SCM_VMID_HLOS;
> +
> +			for (i--; i >= 0; i--) {
> +				src_cpy = src;
> +				WARN_ON_ONCE(qcom_scm_assign_mem(
> +						le64_to_cpu(mem_parcel->mem_entries[i].phys_addr),
> +						le64_to_cpu(mem_parcel->mem_entries[i].size),
> +						&src_cpy, new_perms, 1));
> +			}
> +			break;
> +		}
> +	}
> +
> +	kfree(new_perms);
> +	return ret;
> +}
> +
> +static int qcom_scm_gh_rm_post_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *mem_parcel)
> +{
> +	struct qcom_scm_vmperm new_perms;
> +	u64 src = 0, src_cpy;
> +	int ret = 0, i, n;
> +	u16 vmid;
> +
> +	new_perms.vmid = QCOM_SCM_VMID_HLOS;
> +	new_perms.perm = QCOM_SCM_PERM_EXEC | QCOM_SCM_PERM_WRITE | QCOM_SCM_PERM_READ;
> +
> +	for (n = 0; n < mem_parcel->n_acl_entries; n++) {
> +		vmid = le16_to_cpu(mem_parcel->acl_entries[n].vmid);
> +		if (vmid <= QCOM_SCM_MAX_MANAGED_VMID)
> +			src |= (1ull << vmid);
> +		else
> +			src |= (1ull << QCOM_SCM_RM_MANAGED_VMID);
> +	}
> +
> +	for (i = 0; i < mem_parcel->n_mem_entries; i++) {
> +		src_cpy = src;
> +		ret = qcom_scm_assign_mem(le64_to_cpu(mem_parcel->mem_entries[i].phys_addr),
> +						le64_to_cpu(mem_parcel->mem_entries[i].size),
> +						&src_cpy, &new_perms, 1);
> +		WARN_ON_ONCE(ret);
> +	}
> +
> +	return ret;
> +}
> +
> +static struct gh_rm_platform_ops qcom_scm_gh_rm_platform_ops = {
> +	.pre_mem_share = qcom_scm_gh_rm_pre_mem_share,
> +	.post_mem_reclaim = qcom_scm_gh_rm_post_mem_reclaim,
> +};
> +
> +/* {19bd54bd-0b37-571b-946f-609b54539de6} */
> +static const uuid_t QCOM_EXT_UUID =
> +	UUID_INIT(0x19bd54bd, 0x0b37, 0x571b, 0x94, 0x6f, 0x60, 0x9b, 0x54, 0x53, 0x9d, 0xe6);
> +
> +#define GH_QCOM_EXT_CALL_UUID_ID	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_32, \
> +							   ARM_SMCCC_OWNER_VENDOR_HYP, 0x3f01)
> +
> +static bool gh_has_qcom_extensions(void)
> +{
> +	struct arm_smccc_res res;
> +	uuid_t uuid;
> +
> +	arm_smccc_1_1_smc(GH_QCOM_EXT_CALL_UUID_ID, &res);
> +
> +	((u32 *)&uuid.b[0])[0] = lower_32_bits(res.a0);
> +	((u32 *)&uuid.b[0])[1] = lower_32_bits(res.a1);
> +	((u32 *)&uuid.b[0])[2] = lower_32_bits(res.a2);
> +	((u32 *)&uuid.b[0])[3] = lower_32_bits(res.a3);

I said this elsewhere.  I'd rather see:

	u32 *u = (u32 *)&uuid;		/* Or &uuid.b? */

	*u++ = lower_32_bits(res.a0);
		. . .
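
Another option, sketched here outside the kernel, is to avoid the
pointer cast entirely and copy each word with memcpy(). This is a
different technique from the one suggested above, not what the patch
does. struct uuid16 is a hypothetical stand-in for uuid_t, and the
cast to uint32_t plays the role of lower_32_bits(). The bytes land in
native byte order, the same as with the cast.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for uuid_t: 16 raw bytes. */
struct uuid16 {
	uint8_t b[16];
};

/* Copy the low 32 bits of four result registers into the UUID, in order. */
static void uuid_from_regs(struct uuid16 *uuid, const uint64_t regs[4])
{
	for (size_t i = 0; i < 4; i++) {
		uint32_t word = (uint32_t)regs[i];

		/* memcpy sidesteps any alignment or aliasing concern with the cast. */
		memcpy(&uuid->b[4 * i], &word, sizeof(word));
	}
}
```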

> +
> +	return uuid_equal(&uuid, &QCOM_EXT_UUID);
> +}
> +
> +static int __init qcom_gh_platform_hooks_register(void)
> +{
> +	if (!gh_has_qcom_extensions())
> +		return -ENODEV;
> +
> +	return gh_rm_register_platform_ops(&qcom_scm_gh_rm_platform_ops);
> +}
> +
> +static void __exit qcom_gh_platform_hooks_unregister(void)
> +{
> +	gh_rm_unregister_platform_ops(&qcom_scm_gh_rm_platform_ops);
> +}
> +
> +module_init(qcom_gh_platform_hooks_register);
> +module_exit(qcom_gh_platform_hooks_unregister);
> +MODULE_DESCRIPTION("Qualcomm Technologies, Inc. Platform Hooks for Gunyah");
> +MODULE_LICENSE("GPL");
Alex Elder June 5, 2023, 7:50 p.m. UTC | #9
On 5/9/23 3:47 PM, Elliot Berman wrote:
> Enable support for creating irqfds which can raise an interrupt on a
> Gunyah virtual machine. irqfds are exposed to userspace as a Gunyah VM
> function with the name "irqfd". If the VM devicetree is not configured
> to create a doorbell with the corresponding label, userspace will still
> be able to assert the eventfd but no interrupt will be raised on the
> guest.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>

I have a minor suggestion.  I think I'd like to look at this
again, so:

Acked-by: Alex Elder <elder@linaro.org>

> ---
>   Documentation/virt/gunyah/vm-manager.rst |   2 +-
>   drivers/virt/gunyah/Kconfig              |   9 ++
>   drivers/virt/gunyah/Makefile             |   1 +
>   drivers/virt/gunyah/gunyah_irqfd.c       | 180 +++++++++++++++++++++++
>   include/uapi/linux/gunyah.h              |  35 +++++
>   5 files changed, 226 insertions(+), 1 deletion(-)
>   create mode 100644 drivers/virt/gunyah/gunyah_irqfd.c
> 

. . .

> @@ -99,6 +102,38 @@ struct gh_fn_vcpu_arg {
>   	__u32 id;
>   };
>   
> +/**
> + * enum gh_irqfd_flags - flags for use in gh_fn_irqfd_arg
> + * @GH_IRQFD_FLAGS_LEVEL: make the interrupt operate like a level triggered
> + *                        interrupt on guest side. Triggering IRQFD before
> + *                        guest handles the interrupt causes interrupt to
> + *                        stay asserted.
> + */
> +enum gh_irqfd_flags {
> +	GH_IRQFD_FLAGS_LEVEL		= 1UL << 0,

	BIT(0),			/* ? */

> +};
> +
> +/**
> + * struct gh_fn_irqfd_arg - Arguments to create an irqfd function.
> + *
> + * Create this function with &GH_VM_ADD_FUNCTION using type &GH_FN_IRQFD.
> + *
> + * Allows setting an eventfd to directly trigger a guest interrupt.
> + * irqfd.fd specifies the file descriptor to use as the eventfd.
> + * irqfd.label corresponds to the doorbell label used in the guest VM's devicetree.
> + *
> + * @fd: an eventfd which when written to will raise a doorbell
> + * @label: Label of the doorbell created on the guest VM
> + * @flags: see &enum gh_irqfd_flags
> + * @padding: padding bytes
> + */
> +struct gh_fn_irqfd_arg {
> +	__u32 fd;
> +	__u32 label;
> +	__u32 flags;
> +	__u32 padding;
> +};
> +
>   /**
>    * struct gh_fn_desc - Arguments to create a VM function
>    * @type: Type of the function. See &enum gh_fn_type.
Alex Elder June 5, 2023, 7:50 p.m. UTC | #10
On 5/9/23 3:48 PM, Elliot Berman wrote:
> Add myself and Prakruthi as maintainers of Gunyah hypervisor drivers.
> 
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>

Looks good.

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>   MAINTAINERS | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c754befb94e7..323391320cf1 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -8970,6 +8970,19 @@ L:	linux-efi@vger.kernel.org
>   S:	Maintained
>   F:	block/partitions/efi.*
>   
> +GUNYAH HYPERVISOR DRIVER
> +M:	Elliot Berman <quic_eberman@quicinc.com>
> +M:	Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> +L:	linux-arm-msm@vger.kernel.org
> +S:	Supported
> +F:	Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
> +F:	Documentation/virt/gunyah/
> +F:	arch/arm64/gunyah/
> +F:	drivers/mailbox/gunyah-msgq.c
> +F:	drivers/virt/gunyah/
> +F:	include/linux/gunyah*.h
> +F:	samples/gunyah/
> +
>   HABANALABS PCI DRIVER
>   M:	Oded Gabbay <ogabbay@kernel.org>
>   L:	dri-devel@lists.freedesktop.org
Srinivas Kandagatla June 5, 2023, 9:31 p.m. UTC | #11
On 09/05/2023 21:47, Elliot Berman wrote:
> Add architecture-independent standard error codes, types, and macros for
> Gunyah hypercalls.
> 
> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---

lgtm,

Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

--srini
>   include/linux/gunyah.h | 83 ++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 83 insertions(+)
>   create mode 100644 include/linux/gunyah.h
> 
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> new file mode 100644
> index 000000000000..a4e8ec91961d
> --- /dev/null
> +++ b/include/linux/gunyah.h
> @@ -0,0 +1,83 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _LINUX_GUNYAH_H
> +#define _LINUX_GUNYAH_H
> +
> +#include <linux/errno.h>
> +#include <linux/limits.h>
> +
> +/******************************************************************************/
> +/* Common arch-independent definitions for Gunyah hypercalls                  */
> +#define GH_CAPID_INVAL	U64_MAX
> +#define GH_VMID_ROOT_VM	0xff
> +
> +enum gh_error {
> +	GH_ERROR_OK			= 0,
> +	GH_ERROR_UNIMPLEMENTED		= -1,
> +	GH_ERROR_RETRY			= -2,
> +
> +	GH_ERROR_ARG_INVAL		= 1,
> +	GH_ERROR_ARG_SIZE		= 2,
> +	GH_ERROR_ARG_ALIGN		= 3,
> +
> +	GH_ERROR_NOMEM			= 10,
> +
> +	GH_ERROR_ADDR_OVFL		= 20,
> +	GH_ERROR_ADDR_UNFL		= 21,
> +	GH_ERROR_ADDR_INVAL		= 22,
> +
> +	GH_ERROR_DENIED			= 30,
> +	GH_ERROR_BUSY			= 31,
> +	GH_ERROR_IDLE			= 32,
> +
> +	GH_ERROR_IRQ_BOUND		= 40,
> +	GH_ERROR_IRQ_UNBOUND		= 41,
> +
> +	GH_ERROR_CSPACE_CAP_NULL	= 50,
> +	GH_ERROR_CSPACE_CAP_REVOKED	= 51,
> +	GH_ERROR_CSPACE_WRONG_OBJ_TYPE	= 52,
> +	GH_ERROR_CSPACE_INSUF_RIGHTS	= 53,
> +	GH_ERROR_CSPACE_FULL		= 54,
> +
> +	GH_ERROR_MSGQUEUE_EMPTY		= 60,
> +	GH_ERROR_MSGQUEUE_FULL		= 61,
> +};
> +
> +/**
> + * gh_error_remap() - Remap Gunyah hypervisor errors into a Linux error code
> + * @gh_error: Gunyah hypercall return value
> + */
> +static inline int gh_error_remap(enum gh_error gh_error)
> +{
> +	switch (gh_error) {
> +	case GH_ERROR_OK:
> +		return 0;
> +	case GH_ERROR_NOMEM:
> +		return -ENOMEM;
> +	case GH_ERROR_DENIED:
> +	case GH_ERROR_CSPACE_CAP_NULL:
> +	case GH_ERROR_CSPACE_CAP_REVOKED:
> +	case GH_ERROR_CSPACE_WRONG_OBJ_TYPE:
> +	case GH_ERROR_CSPACE_INSUF_RIGHTS:
> +	case GH_ERROR_CSPACE_FULL:
> +		return -EACCES;
> +	case GH_ERROR_BUSY:
> +	case GH_ERROR_IDLE:
> +		return -EBUSY;
> +	case GH_ERROR_IRQ_BOUND:
> +	case GH_ERROR_IRQ_UNBOUND:
> +	case GH_ERROR_MSGQUEUE_FULL:
> +	case GH_ERROR_MSGQUEUE_EMPTY:
> +		return -EIO;
> +	case GH_ERROR_UNIMPLEMENTED:
> +	case GH_ERROR_RETRY:
> +		return -EOPNOTSUPP;
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +#endif
Srinivas Kandagatla June 5, 2023, 9:32 p.m. UTC | #12
On 09/05/2023 21:47, Elliot Berman wrote:
> Gunyah message queues are a unidirectional inter-VM pipe for messages up
> to 1024 bytes. This driver supports pairing a receiver message queue and
> a transmitter message queue to expose a single mailbox channel.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---


Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

--srini

>   Documentation/virt/gunyah/message-queue.rst |   8 +
>   drivers/mailbox/Makefile                    |   2 +
>   drivers/mailbox/gunyah-msgq.c               | 212 ++++++++++++++++++++
>   include/linux/gunyah.h                      |  57 ++++++
>   4 files changed, 279 insertions(+)
>   create mode 100644 drivers/mailbox/gunyah-msgq.c
> 
> diff --git a/Documentation/virt/gunyah/message-queue.rst b/Documentation/virt/gunyah/message-queue.rst
> index b352918ae54b..70d82a4ef32d 100644
> --- a/Documentation/virt/gunyah/message-queue.rst
> +++ b/Documentation/virt/gunyah/message-queue.rst
> @@ -61,3 +61,11 @@ vIRQ: two TX message queues will have two vIRQs (and two capability IDs).
>         |               |         |                 |         |               |
>         |               |         |                 |         |               |
>         +---------------+         +-----------------+         +---------------+
> +
> +Gunyah message queues are exposed as mailboxes. To create the mailbox, create
> +a mbox_client and call `gh_msgq_init()`. On receipt of the RX_READY interrupt,
> +all messages in the RX message queue are read and pushed via the `rx_callback`
> +of the registered mbox_client.
> +
> +.. kernel-doc:: drivers/mailbox/gunyah-msgq.c
> +   :identifiers: gh_msgq_init
> diff --git a/drivers/mailbox/Makefile b/drivers/mailbox/Makefile
> index fc9376117111..5f929bb55e9a 100644
> --- a/drivers/mailbox/Makefile
> +++ b/drivers/mailbox/Makefile
> @@ -55,6 +55,8 @@ obj-$(CONFIG_MTK_CMDQ_MBOX)	+= mtk-cmdq-mailbox.o
>   
>   obj-$(CONFIG_ZYNQMP_IPI_MBOX)	+= zynqmp-ipi-mailbox.o
>   
> +obj-$(CONFIG_GUNYAH)		+= gunyah-msgq.o
> +
>   obj-$(CONFIG_SUN6I_MSGBOX)	+= sun6i-msgbox.o
>   
>   obj-$(CONFIG_SPRD_MBOX)		+= sprd-mailbox.o
> diff --git a/drivers/mailbox/gunyah-msgq.c b/drivers/mailbox/gunyah-msgq.c
> new file mode 100644
> index 000000000000..b7a54f233680
> --- /dev/null
> +++ b/drivers/mailbox/gunyah-msgq.c
> @@ -0,0 +1,212 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/mailbox_controller.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/gunyah.h>
> +#include <linux/printk.h>
> +#include <linux/init.h>
> +#include <linux/slab.h>
> +#include <linux/wait.h>
> +
> +#define mbox_chan_to_msgq(chan) (container_of(chan->mbox, struct gh_msgq, mbox))
> +
> +static irqreturn_t gh_msgq_rx_irq_handler(int irq, void *data)
> +{
> +	struct gh_msgq *msgq = data;
> +	struct gh_msgq_rx_data rx_data;
> +	enum gh_error gh_error;
> +	bool ready = true;
> +
> +	while (ready) {
> +		gh_error = gh_hypercall_msgq_recv(msgq->rx_ghrsc->capid,
> +				&rx_data.data, sizeof(rx_data.data),
> +				&rx_data.length, &ready);
> +		if (gh_error != GH_ERROR_OK) {
> +			if (gh_error != GH_ERROR_MSGQUEUE_EMPTY)
> +				dev_warn(msgq->mbox.dev, "Failed to receive data: %d\n", gh_error);
> +			break;
> +		}
> +		if (likely(gh_msgq_chan(msgq)->cl))
> +			mbox_chan_received_data(gh_msgq_chan(msgq), &rx_data);
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +/* Fired when message queue transitions from "full" to "space available" to send messages */
> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
> +{
> +	struct gh_msgq *msgq = data;
> +
> +	mbox_chan_txdone(gh_msgq_chan(msgq), 0);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +/* Fired after sending message and hypercall told us there was more space available. */
> +static void gh_msgq_txdone_tasklet(struct tasklet_struct *tasklet)
> +{
> +	struct gh_msgq *msgq = container_of(tasklet, struct gh_msgq, txdone_tasklet);
> +
> +	mbox_chan_txdone(gh_msgq_chan(msgq), msgq->last_ret);
> +}
> +
> +static int gh_msgq_send_data(struct mbox_chan *chan, void *data)
> +{
> +	struct gh_msgq *msgq = mbox_chan_to_msgq(chan);
> +	struct gh_msgq_tx_data *msgq_data = data;
> +	u64 tx_flags = 0;
> +	enum gh_error gh_error;
> +	bool ready;
> +
> +	if (!msgq->tx_ghrsc)
> +		return -EOPNOTSUPP;
> +
> +	if (msgq_data->push)
> +		tx_flags |= GH_HYPERCALL_MSGQ_TX_FLAGS_PUSH;
> +
> +	gh_error = gh_hypercall_msgq_send(msgq->tx_ghrsc->capid, msgq_data->length, msgq_data->data,
> +						tx_flags, &ready);
> +
> +	/**
> +	 * unlikely because Linux tracks state of msgq and should not try to
> +	 * send message when msgq is full.
> +	 */
> +	if (unlikely(gh_error == GH_ERROR_MSGQUEUE_FULL))
> +		return -EAGAIN;
> +
> +	/**
> +	 * Propagate all other errors to client. If we return error to mailbox
> +	 * framework, then no other messages can be sent and nobody will know
> +	 * to retry this message.
> +	 */
> +	msgq->last_ret = gh_error_remap(gh_error);
> +
> +	/**
> +	 * This message was successfully sent, but message queue isn't ready to
> +	 * accept more messages because it's now full. Mailbox framework
> +	 * requires that we only report that message was transmitted when
> +	 * we're ready to transmit another message. We'll get that in the form
> +	 * of tx IRQ once the other side starts to drain the msgq.
> +	 */
> +	if (gh_error == GH_ERROR_OK) {
> +		if (!ready)
> +			return 0;
> +	} else
> +		dev_err(msgq->mbox.dev, "Failed to send data: %d (%d)\n", gh_error, msgq->last_ret);
> +
> +	/**
> +	 * We can send more messages. Mailbox framework requires that tx done
> +	 * happens asynchronously to sending the message. Gunyah message queues
> +	 * tell us right away on the hypercall return whether we can send more
> +	 * messages. To work around this, defer the txdone to a tasklet.
> +	 */
> +	tasklet_schedule(&msgq->txdone_tasklet);
> +
> +	return 0;
> +}
> +
> +static struct mbox_chan_ops gh_msgq_ops = {
> +	.send_data = gh_msgq_send_data,
> +};
> +
> +/**
> + * gh_msgq_init() - Initialize a Gunyah message queue with an mbox_client
> + * @parent: device parent used for the mailbox controller
> + * @msgq: Pointer to the gh_msgq to initialize
> + * @cl: A mailbox client to bind to the mailbox channel that the message queue creates
> + * @tx_ghrsc: optional, the transmission side of the message queue
> + * @rx_ghrsc: optional, the receiving side of the message queue
> + *
> + * At least one of tx_ghrsc and rx_ghrsc must be non-NULL. Most message queue use cases come with
> + * a pair of message queues to facilitate bidirectional communication. When tx_ghrsc is set,
> + * the client can send messages with mbox_send_message(gh_msgq_chan(msgq), msg). When rx_ghrsc
> + * is set, the mbox_client must register an .rx_callback() and the message queue driver will
> + * deliver all available messages upon receiving the RX ready interrupt. The messages should be
> + * consumed or copied by the client right away as the gh_msgq_rx_data will be replaced/destroyed
> + * after the callback.
> + *
> + * Returns - 0 on success, negative otherwise
> + */
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> +		 struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc)
> +{
> +	int ret;
> +
> +	/* Need at least one of tx_ghrsc/rx_ghrsc, and each one given must be the right resource type */
> +	if ((!tx_ghrsc && !rx_ghrsc) ||
> +	    (tx_ghrsc && tx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_TX) ||
> +	    (rx_ghrsc && rx_ghrsc->type != GH_RESOURCE_TYPE_MSGQ_RX))
> +		return -EINVAL;
> +
> +	msgq->mbox.dev = parent;
> +	msgq->mbox.ops = &gh_msgq_ops;
> +	msgq->mbox.num_chans = 1;
> +	msgq->mbox.txdone_irq = true;
> +	msgq->mbox.chans = &msgq->mbox_chan;
> +
> +	ret = mbox_controller_register(&msgq->mbox);
> +	if (ret)
> +		return ret;
> +
> +	ret = mbox_bind_client(gh_msgq_chan(msgq), cl);
> +	if (ret)
> +		goto err_mbox;
> +
> +	if (tx_ghrsc) {
> +		msgq->tx_ghrsc = tx_ghrsc;
> +
> +		ret = request_irq(msgq->tx_ghrsc->irq, gh_msgq_tx_irq_handler, 0, "gh_msgq_tx",
> +				msgq);
> +		if (ret)
> +			goto err_tx_ghrsc;
> +
> +		tasklet_setup(&msgq->txdone_tasklet, gh_msgq_txdone_tasklet);
> +	}
> +
> +	if (rx_ghrsc) {
> +		msgq->rx_ghrsc = rx_ghrsc;
> +
> +		ret = request_threaded_irq(msgq->rx_ghrsc->irq, NULL, gh_msgq_rx_irq_handler,
> +						IRQF_ONESHOT, "gh_msgq_rx", msgq);
> +		if (ret)
> +			goto err_tx_irq;
> +	}
> +
> +	return 0;
> +err_tx_irq:
> +	if (msgq->tx_ghrsc)
> +		free_irq(msgq->tx_ghrsc->irq, msgq);
> +
> +	msgq->rx_ghrsc = NULL;
> +err_tx_ghrsc:
> +	msgq->tx_ghrsc = NULL;
> +err_mbox:
> +	mbox_controller_unregister(&msgq->mbox);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_init);
> +
> +void gh_msgq_remove(struct gh_msgq *msgq)
> +{
> +	if (msgq->rx_ghrsc)
> +		free_irq(msgq->rx_ghrsc->irq, msgq);
> +
> +	if (msgq->tx_ghrsc) {
> +		tasklet_kill(&msgq->txdone_tasklet);
> +		free_irq(msgq->tx_ghrsc->irq, msgq);
> +	}
> +
> +	mbox_controller_unregister(&msgq->mbox);
> +
> +	msgq->rx_ghrsc = NULL;
> +	msgq->tx_ghrsc = NULL;
> +}
> +EXPORT_SYMBOL_GPL(gh_msgq_remove);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("Gunyah Message Queue Driver");
> diff --git a/include/linux/gunyah.h b/include/linux/gunyah.h
> index 01a6f202d037..982e27d10d57 100644
> --- a/include/linux/gunyah.h
> +++ b/include/linux/gunyah.h
> @@ -8,11 +8,68 @@
>   
>   #include <linux/bitfield.h>
>   #include <linux/errno.h>
> +#include <linux/interrupt.h>
>   #include <linux/limits.h>
> +#include <linux/mailbox_controller.h>
> +#include <linux/mailbox_client.h>
>   #include <linux/types.h>
>   
> +/* Matches resource manager's resource types for VM_GET_HYP_RESOURCES RPC */
> +enum gh_resource_type {
> +	GH_RESOURCE_TYPE_BELL_TX	= 0,
> +	GH_RESOURCE_TYPE_BELL_RX	= 1,
> +	GH_RESOURCE_TYPE_MSGQ_TX	= 2,
> +	GH_RESOURCE_TYPE_MSGQ_RX	= 3,
> +	GH_RESOURCE_TYPE_VCPU		= 4,
> +};
> +
> +struct gh_resource {
> +	enum gh_resource_type type;
> +	u64 capid;
> +	unsigned int irq;
> +};
> +
> +/**
> + * Gunyah Message Queues
> + */
> +
> +#define GH_MSGQ_MAX_MSG_SIZE		240
> +
> +struct gh_msgq_tx_data {
> +	size_t length;
> +	bool push;
> +	char data[];
> +};
> +
> +struct gh_msgq_rx_data {
> +	size_t length;
> +	char data[GH_MSGQ_MAX_MSG_SIZE];
> +};
> +
> +struct gh_msgq {
> +	struct gh_resource *tx_ghrsc;
> +	struct gh_resource *rx_ghrsc;
> +
> +	/* msgq private */
> +	int last_ret; /* Linux error, not GH_ERROR_* */
> +	struct mbox_chan mbox_chan;
> +	struct mbox_controller mbox;
> +	struct tasklet_struct txdone_tasklet;
> +};
> +
> +
> +int gh_msgq_init(struct device *parent, struct gh_msgq *msgq, struct mbox_client *cl,
> +		     struct gh_resource *tx_ghrsc, struct gh_resource *rx_ghrsc);
> +void gh_msgq_remove(struct gh_msgq *msgq);
> +
> +static inline struct mbox_chan *gh_msgq_chan(struct gh_msgq *msgq)
> +{
> +	return &msgq->mbox.chans[0];
> +}
> +
>   /******************************************************************************/
>   /* Common arch-independent definitions for Gunyah hypercalls                  */
> +
>   #define GH_CAPID_INVAL	U64_MAX
>   #define GH_VMID_ROOT_VM	0xff
>
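
As a reading aid, the tx-done handling in gh_msgq_send_data() quoted
above can be summarized as a small decision function. This is only a
sketch of the three outcomes as I read them from the comments in the
patch; enum tx_action and tx_next_action() are hypothetical names and
do not exist in the driver.

```c
#include <stdbool.h>

/* What the driver does after the send hypercall returns (hypothetical names). */
enum tx_action {
	TX_RETRY_LATER,		/* queue was full: send_data returns -EAGAIN */
	TX_WAIT_FOR_IRQ,	/* sent, queue now full: the tx IRQ reports txdone */
	TX_SCHEDULE_TASKLET,	/* sent with room left, or failed: txdone deferred to a tasklet */
};

/*
 * queue_full: hypercall returned GH_ERROR_MSGQUEUE_FULL
 * ok:         hypercall returned GH_ERROR_OK
 * ready:      hypercall reported that more messages can be sent
 */
static enum tx_action tx_next_action(bool queue_full, bool ok, bool ready)
{
	if (queue_full)
		return TX_RETRY_LATER;
	if (ok && !ready)
		return TX_WAIT_FOR_IRQ;
	return TX_SCHEDULE_TASKLET;
}
```

Note that any error other than "queue full" also ends up in the
tasklet path, so the client learns about it through the txdone result
rather than through the send_data return value.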
Srinivas Kandagatla June 6, 2023, 12:49 p.m. UTC | #13
On 09/05/2023 21:47, Elliot Berman wrote:
> Add Gunyah Resource Manager RPC to launch an unauthenticated VM.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---



Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

--srini


>   drivers/virt/gunyah/Makefile      |   2 +-
>   drivers/virt/gunyah/rsc_mgr_rpc.c | 259 ++++++++++++++++++++++++++++++
>   include/linux/gunyah_rsc_mgr.h    |  73 +++++++++
>   3 files changed, 333 insertions(+), 1 deletion(-)
>   create mode 100644 drivers/virt/gunyah/rsc_mgr_rpc.c
> 
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 0f5aec834698..241bab357b86 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,4 +1,4 @@
>   # SPDX-License-Identifier: GPL-2.0
>   
> -gunyah-y += rsc_mgr.o
> +gunyah-y += rsc_mgr.o rsc_mgr_rpc.o
>   obj-$(CONFIG_GUNYAH) += gunyah.o
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> new file mode 100644
> index 000000000000..a4a9f0ba4e1f
> --- /dev/null
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -0,0 +1,259 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +#include "rsc_mgr.h"
> +
> +/* Message IDs: VM Management */
> +#define GH_RM_RPC_VM_ALLOC_VMID			0x56000001
> +#define GH_RM_RPC_VM_DEALLOC_VMID		0x56000002
> +#define GH_RM_RPC_VM_START			0x56000004
> +#define GH_RM_RPC_VM_STOP			0x56000005
> +#define GH_RM_RPC_VM_RESET			0x56000006
> +#define GH_RM_RPC_VM_CONFIG_IMAGE		0x56000009
> +#define GH_RM_RPC_VM_INIT			0x5600000B
> +#define GH_RM_RPC_VM_GET_HYP_RESOURCES		0x56000020
> +#define GH_RM_RPC_VM_GET_VMID			0x56000024
> +
> +struct gh_rm_vm_common_vmid_req {
> +	__le16 vmid;
> +	__le16 _padding;
> +} __packed;
> +
> +/* Call: VM_ALLOC */
> +struct gh_rm_vm_alloc_vmid_resp {
> +	__le16 vmid;
> +	__le16 _padding;
> +} __packed;
> +
> +/* Call: VM_STOP */
> +#define GH_RM_VM_STOP_FLAG_FORCE_STOP		BIT(0)
> +
> +#define GH_RM_VM_STOP_REASON_FORCE_STOP		3
> +
> +struct gh_rm_vm_stop_req {
> +	__le16 vmid;
> +	u8 flags;
> +	u8 _padding;
> +	__le32 stop_reason;
> +} __packed;
> +
> +/* Call: VM_CONFIG_IMAGE */
> +struct gh_rm_vm_config_image_req {
> +	__le16 vmid;
> +	__le16 auth_mech;
> +	__le32 mem_handle;
> +	__le64 image_offset;
> +	__le64 image_size;
> +	__le64 dtb_offset;
> +	__le64 dtb_size;
> +} __packed;
> +
> +/*
> + * Several RM calls take only a VMID as a parameter and give only standard
> + * response back. Deduplicate boilerplate code by using this common call.
> + */
> +static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
> +{
> +	struct gh_rm_vm_common_vmid_req req_payload = {
> +		.vmid = cpu_to_le16(vmid),
> +	};
> +
> +	return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), NULL, NULL);
> +}
> +
> +/**
> + * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: Use 0 to dynamically allocate a VM. A reserved VMID can be supplied
> + *        to request allocation of a platform-defined VM.
> + *
> + * Returns - the allocated VMID or negative value on error
> + */
> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid)
> +{
> +	struct gh_rm_vm_common_vmid_req req_payload = {
> +		.vmid = cpu_to_le16(vmid),
> +	};
> +	struct gh_rm_vm_alloc_vmid_resp *resp_payload;
> +	size_t resp_size;
> +	void *resp;
> +	int ret;
> +
> +	ret = gh_rm_call(rm, GH_RM_RPC_VM_ALLOC_VMID, &req_payload, sizeof(req_payload), &resp,
> +			&resp_size);
> +	if (ret)
> +		return ret;
> +
> +	if (!vmid) {
> +		resp_payload = resp;
> +		ret = le16_to_cpu(resp_payload->vmid);
> +		kfree(resp);
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * gh_rm_dealloc_vmid() - Dispose of a VMID
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + */
> +int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid)
> +{
> +	return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_DEALLOC_VMID, vmid);
> +}
> +
> +/**
> + * gh_rm_vm_reset() - Reset a VM's resources
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + *
> + * As part of tearing down the VM, request RM to clean up all the VM resources
> + * associated with the VM. Only after this, Linux can clean up all the
> + * references it maintains to resources.
> + */
> +int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid)
> +{
> +	return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_RESET, vmid);
> +}
> +
> +/**
> + * gh_rm_vm_start() - Move a VM into "ready to run" state
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + *
> + * On VMs which use proxy scheduling, vcpu_run is needed to actually run the VM.
> + * On VMs which use Gunyah's scheduling, the vCPUs start executing in accordance with Gunyah
> + * scheduling policies.
> + */
> +int gh_rm_vm_start(struct gh_rm *rm, u16 vmid)
> +{
> +	return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_START, vmid);
> +}
> +
> +/**
> + * gh_rm_vm_stop() - Send a request to Resource Manager VM to forcibly stop a VM.
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + */
> +int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid)
> +{
> +	struct gh_rm_vm_stop_req req_payload = {
> +		.vmid = cpu_to_le16(vmid),
> +		.flags = GH_RM_VM_STOP_FLAG_FORCE_STOP,
> +		.stop_reason = cpu_to_le32(GH_RM_VM_STOP_REASON_FORCE_STOP),
> +	};
> +
> +	return gh_rm_call(rm, GH_RM_RPC_VM_STOP, &req_payload, sizeof(req_payload), NULL, NULL);
> +}
> +
> +/**
> + * gh_rm_vm_configure() - Prepare a VM to start and provide the common
> + *			  configuration needed by RM to configure a VM
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier allocated with gh_rm_alloc_vmid
> + * @auth_mechanism: Authentication mechanism used by resource manager to verify
> + *                  the virtual machine
> + * @mem_handle: Handle to a previously shared memparcel that contains all parts
> + *              of the VM image subject to authentication.
> + * @image_offset: Start address of VM image, relative to the start of memparcel
> + * @image_size: Size of the VM image
> + * @dtb_offset: Start address of the devicetree binary with VM configuration,
> + *              relative to start of memparcel.
> + * @dtb_size: Maximum size of devicetree binary.
> + */
> +int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
> +		u32 mem_handle, u64 image_offset, u64 image_size, u64 dtb_offset, u64 dtb_size)
> +{
> +	struct gh_rm_vm_config_image_req req_payload = {
> +		.vmid = cpu_to_le16(vmid),
> +		.auth_mech = cpu_to_le16(auth_mechanism),
> +		.mem_handle = cpu_to_le32(mem_handle),
> +		.image_offset = cpu_to_le64(image_offset),
> +		.image_size = cpu_to_le64(image_size),
> +		.dtb_offset = cpu_to_le64(dtb_offset),
> +		.dtb_size = cpu_to_le64(dtb_size),
> +	};
> +
> +	return gh_rm_call(rm, GH_RM_RPC_VM_CONFIG_IMAGE, &req_payload, sizeof(req_payload),
> +			  NULL, NULL);
> +}
> +
> +/**
> + * gh_rm_vm_init() - Move the VM to initialized state.
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VM identifier
> + *
> + * RM will allocate needed resources for the VM.
> + */
> +int gh_rm_vm_init(struct gh_rm *rm, u16 vmid)
> +{
> +	return gh_rm_common_vmid_call(rm, GH_RM_RPC_VM_INIT, vmid);
> +}
> +
> +/**
> + * gh_rm_get_hyp_resources() - Retrieve hypervisor resources (capabilities) associated with a VM
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: VMID of the other VM to get the resources of
> + * @resources: Set by gh_rm_get_hyp_resources and contains the returned hypervisor resources.
> + *             Caller must free the resources pointer if successful.
> + */
> +int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> +				struct gh_rm_hyp_resources **resources)
> +{
> +	struct gh_rm_vm_common_vmid_req req_payload = {
> +		.vmid = cpu_to_le16(vmid),
> +	};
> +	struct gh_rm_hyp_resources *resp;
> +	size_t resp_size;
> +	int ret;
> +
> +	ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_HYP_RESOURCES,
> +			 &req_payload, sizeof(req_payload),
> +			 (void **)&resp, &resp_size);
> +	if (ret)
> +		return ret;
> +
> +	if (!resp_size)
> +		return -EBADMSG;
> +
> +	if (resp_size < struct_size(resp, entries, 0) ||
> +		resp_size != struct_size(resp, entries, le32_to_cpu(resp->n_entries))) {
> +		kfree(resp);
> +		return -EBADMSG;
> +	}
> +
> +	*resources = resp;
> +	return 0;
> +}
> +
> +/**
> + * gh_rm_get_vmid() - Retrieve VMID of this virtual machine
> + * @rm: Handle to a Gunyah resource manager
> + * @vmid: Filled with the VMID of this VM
> + */
> +int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid)
> +{
> +	static u16 cached_vmid = GH_VMID_INVAL;
> +	size_t resp_size;
> +	__le32 *resp;
> +	int ret;
> +
> +	if (cached_vmid != GH_VMID_INVAL) {
> +		*vmid = cached_vmid;
> +		return 0;
> +	}
> +
> +	ret = gh_rm_call(rm, GH_RM_RPC_VM_GET_VMID, NULL, 0, (void **)&resp, &resp_size);
> +	if (ret)
> +		return ret;
> +
> +	*vmid = cached_vmid = lower_16_bits(le32_to_cpu(*resp));
> +	kfree(resp);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_get_vmid);
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index f2a312e80af5..1ac66d9004d2 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -18,4 +18,77 @@ int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb);
>   struct device *gh_rm_get(struct gh_rm *rm);
>   void gh_rm_put(struct gh_rm *rm);
>   
> +struct gh_rm_vm_exited_payload {
> +	__le16 vmid;
> +	__le16 exit_type;
> +	__le32 exit_reason_size;
> +	u8 exit_reason[];
> +} __packed;
> +
> +#define GH_RM_NOTIFICATION_VM_EXITED		 0x56100001
> +
> +enum gh_rm_vm_status {
> +	GH_RM_VM_STATUS_NO_STATE	= 0,
> +	GH_RM_VM_STATUS_INIT		= 1,
> +	GH_RM_VM_STATUS_READY		= 2,
> +	GH_RM_VM_STATUS_RUNNING		= 3,
> +	GH_RM_VM_STATUS_PAUSED		= 4,
> +	GH_RM_VM_STATUS_LOAD		= 5,
> +	GH_RM_VM_STATUS_AUTH		= 6,
> +	GH_RM_VM_STATUS_INIT_FAILED	= 8,
> +	GH_RM_VM_STATUS_EXITED		= 9,
> +	GH_RM_VM_STATUS_RESETTING	= 10,
> +	GH_RM_VM_STATUS_RESET		= 11,
> +};
> +
> +struct gh_rm_vm_status_payload {
> +	__le16 vmid;
> +	u16 reserved;
> +	u8 vm_status;
> +	u8 os_status;
> +	__le16 app_status;
> +} __packed;
> +
> +#define GH_RM_NOTIFICATION_VM_STATUS		 0x56100008
> +
> +/* RPC Calls */
> +int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
> +int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
> +int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
> +int gh_rm_vm_start(struct gh_rm *rm, u16 vmid);
> +int gh_rm_vm_stop(struct gh_rm *rm, u16 vmid);
> +
> +enum gh_rm_vm_auth_mechanism {
> +	GH_RM_VM_AUTH_NONE		= 0,
> +	GH_RM_VM_AUTH_QCOM_PIL_ELF	= 1,
> +	GH_RM_VM_AUTH_QCOM_ANDROID_PVM	= 2,
> +};
> +
> +int gh_rm_vm_configure(struct gh_rm *rm, u16 vmid, enum gh_rm_vm_auth_mechanism auth_mechanism,
> +			u32 mem_handle, u64 image_offset, u64 image_size,
> +			u64 dtb_offset, u64 dtb_size);
> +int gh_rm_vm_init(struct gh_rm *rm, u16 vmid);
> +
> +struct gh_rm_hyp_resource {
> +	u8 type;
> +	u8 reserved;
> +	__le16 partner_vmid;
> +	__le32 resource_handle;
> +	__le32 resource_label;
> +	__le64 cap_id;
> +	__le32 virq_handle;
> +	__le32 virq;
> +	__le64 base;
> +	__le64 size;
> +} __packed;
> +
> +struct gh_rm_hyp_resources {
> +	__le32 n_entries;
> +	struct gh_rm_hyp_resource entries[];
> +} __packed;
> +
> +int gh_rm_get_hyp_resources(struct gh_rm *rm, u16 vmid,
> +				struct gh_rm_hyp_resources **resources);
> +int gh_rm_get_vmid(struct gh_rm *rm, u16 *vmid);
> +
>   #endif
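The length check at the end of gh_rm_get_hyp_resources() above can be
modelled in userspace. The sketch below is an illustration only, not the
driver code: validate_hyp_resources() and struct hyp_resource are
hypothetical names, the struct mirrors the packed layout of
struct gh_rm_hyp_resource from the header hunk, and the explicit overflow
guard stands in for the saturating behaviour of the kernel's struct_size().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct gh_rm_hyp_resource (packed, 44 bytes). */
struct hyp_resource {
	uint8_t type;
	uint8_t reserved;
	uint16_t partner_vmid;
	uint32_t resource_handle;
	uint32_t resource_label;
	uint64_t cap_id;
	uint32_t virq_handle;
	uint32_t virq;
	uint64_t base;
	uint64_t size;
} __attribute__((packed));

static_assert(sizeof(struct hyp_resource) == 44, "unexpected entry size");

/*
 * Hypothetical helper: returns 0 if resp_size is exactly the header plus
 * n_entries entries, -EBADMSG otherwise.
 */
static int validate_hyp_resources(const uint8_t *resp, size_t resp_size)
{
	const size_t hdr = sizeof(uint32_t);	/* __le32 n_entries */
	const size_t entry = sizeof(struct hyp_resource);
	uint32_t n;

	/* The header must be present before n_entries can be trusted. */
	if (resp_size < hdr)
		return -EBADMSG;

	/* Decode the little-endian entry count byte by byte. */
	n = (uint32_t)resp[0] | (uint32_t)resp[1] << 8 |
	    (uint32_t)resp[2] << 16 | (uint32_t)resp[3] << 24;

	/* Guard the multiplication on targets where size_t is 32 bits. */
	if (n > (SIZE_MAX - hdr) / entry)
		return -EBADMSG;

	if (resp_size != hdr + (size_t)n * entry)
		return -EBADMSG;

	return 0;
}
```

Checking the header size first matters because n_entries is read from the
response itself; a truncated reply must be rejected before that field is
dereferenced.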
Srinivas Kandagatla June 6, 2023, 12:51 p.m. UTC | #14
On 09/05/2023 21:47, Elliot Berman wrote:
> Gunyah VM manager is a kernel module which exposes an interface to
> Gunyah userspace to load, run, and interact with other Gunyah virtual
> machines. The interface is a character device at /dev/gunyah.
> 
> Add a basic VM manager driver. Upcoming patches will add more ioctls
> into this driver.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---



Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

--srini

>   .../userspace-api/ioctl/ioctl-number.rst      |  1 +
>   drivers/virt/gunyah/Makefile                  |  2 +-
>   drivers/virt/gunyah/rsc_mgr.c                 | 50 +++++++++-
>   drivers/virt/gunyah/vm_mgr.c                  | 93 +++++++++++++++++++
>   drivers/virt/gunyah/vm_mgr.h                  | 20 ++++
>   include/uapi/linux/gunyah.h                   | 23 +++++
>   6 files changed, 187 insertions(+), 2 deletions(-)
>   create mode 100644 drivers/virt/gunyah/vm_mgr.c
>   create mode 100644 drivers/virt/gunyah/vm_mgr.h
>   create mode 100644 include/uapi/linux/gunyah.h
> 
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index 176e8fc3f31b..396212e88f7d 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -137,6 +137,7 @@ Code  Seq#    Include File                                           Comments
>   'F'   DD     video/sstfb.h                                           conflict!
>   'G'   00-3F  drivers/misc/sgi-gru/grulib.h                           conflict!
>   'G'   00-0F  xen/gntalloc.h, xen/gntdev.h                            conflict!
> +'G'   00-0f  linux/gunyah.h                                          conflict!
>   'H'   00-7F  linux/hiddev.h                                          conflict!
>   'H'   00-0F  linux/hidraw.h                                          conflict!
>   'H'   01     linux/mei.h                                             conflict!
> diff --git a/drivers/virt/gunyah/Makefile b/drivers/virt/gunyah/Makefile
> index 241bab357b86..e47e25895299 100644
> --- a/drivers/virt/gunyah/Makefile
> +++ b/drivers/virt/gunyah/Makefile
> @@ -1,4 +1,4 @@
>   # SPDX-License-Identifier: GPL-2.0
>   
> -gunyah-y += rsc_mgr.o rsc_mgr_rpc.o
> +gunyah-y += rsc_mgr.o rsc_mgr_rpc.o vm_mgr.o
>   obj-$(CONFIG_GUNYAH) += gunyah.o
> diff --git a/drivers/virt/gunyah/rsc_mgr.c b/drivers/virt/gunyah/rsc_mgr.c
> index 88b5beb1ea51..4f6f96bdcf3d 100644
> --- a/drivers/virt/gunyah/rsc_mgr.c
> +++ b/drivers/virt/gunyah/rsc_mgr.c
> @@ -15,8 +15,10 @@
>   #include <linux/completion.h>
>   #include <linux/gunyah_rsc_mgr.h>
>   #include <linux/platform_device.h>
> +#include <linux/miscdevice.h>
>   
>   #include "rsc_mgr.h"
> +#include "vm_mgr.h"
>   
>   #define RM_RPC_API_VERSION_MASK		GENMASK(3, 0)
>   #define RM_RPC_HEADER_WORDS_MASK	GENMASK(7, 4)
> @@ -130,6 +132,7 @@ struct gh_rm_connection {
>    * @cache: cache for allocating Tx messages
>    * @send_lock: synchronization to allow only one request to be sent at a time
>    * @nh: notifier chain for clients interested in RM notification messages
> + * @miscdev: /dev/gunyah
>    */
>   struct gh_rm {
>   	struct device *dev;
> @@ -146,6 +149,8 @@ struct gh_rm {
>   	struct kmem_cache *cache;
>   	struct mutex send_lock;
>   	struct blocking_notifier_head nh;
> +
> +	struct miscdevice miscdev;
>   };
>   
>   /**
> @@ -581,6 +586,33 @@ int gh_rm_notifier_unregister(struct gh_rm *rm, struct notifier_block *nb)
>   }
>   EXPORT_SYMBOL_GPL(gh_rm_notifier_unregister);
>   
> +struct device *gh_rm_get(struct gh_rm *rm)
> +{
> +	return get_device(rm->miscdev.this_device);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_get);
> +
> +void gh_rm_put(struct gh_rm *rm)
> +{
> +	put_device(rm->miscdev.this_device);
> +}
> +EXPORT_SYMBOL_GPL(gh_rm_put);
> +
> +static long gh_dev_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
> +{
> +	struct miscdevice *miscdev = filp->private_data;
> +	struct gh_rm *rm = container_of(miscdev, struct gh_rm, miscdev);
> +
> +	return gh_dev_vm_mgr_ioctl(rm, cmd, arg);
> +}
> +
> +static const struct file_operations gh_dev_fops = {
> +	.owner		= THIS_MODULE,
> +	.unlocked_ioctl	= gh_dev_ioctl,
> +	.compat_ioctl	= compat_ptr_ioctl,
> +	.llseek		= noop_llseek,
> +};
> +
>   static int gh_msgq_platform_probe_direction(struct platform_device *pdev, bool tx,
>   					    struct gh_resource *ghrsc)
>   {
> @@ -665,7 +697,22 @@ static int gh_rm_drv_probe(struct platform_device *pdev)
>   	rm->msgq_client.rx_callback = gh_rm_msgq_rx_data;
>   	rm->msgq_client.tx_done = gh_rm_msgq_tx_done;
>   
> -	return gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> +	ret = gh_msgq_init(&pdev->dev, &rm->msgq, &rm->msgq_client, &rm->tx_ghrsc, &rm->rx_ghrsc);
> +	if (ret)
> +		goto err_cache;
> +
> +	rm->miscdev.name = "gunyah";
> +	rm->miscdev.minor = MISC_DYNAMIC_MINOR;
> +	rm->miscdev.fops = &gh_dev_fops;
> +
> +	ret = misc_register(&rm->miscdev);
> +	if (ret)
> +		goto err_msgq;
> +
> +	return 0;
> +err_msgq:
> +	mbox_free_channel(gh_msgq_chan(&rm->msgq));
> +	gh_msgq_remove(&rm->msgq);
>   err_cache:
>   	kmem_cache_destroy(rm->cache);
>   	return ret;
> @@ -675,6 +722,7 @@ static int gh_rm_drv_remove(struct platform_device *pdev)
>   {
>   	struct gh_rm *rm = platform_get_drvdata(pdev);
>   
> +	misc_deregister(&rm->miscdev);
>   	mbox_free_channel(gh_msgq_chan(&rm->msgq));
>   	gh_msgq_remove(&rm->msgq);
>   	kmem_cache_destroy(rm->cache);
> diff --git a/drivers/virt/gunyah/vm_mgr.c b/drivers/virt/gunyah/vm_mgr.c
> new file mode 100644
> index 000000000000..a43401cb34f7
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.c
> @@ -0,0 +1,93 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "gh_vm_mgr: " fmt
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/file.h>
> +#include <linux/gunyah_rsc_mgr.h>
> +#include <linux/miscdevice.h>
> +#include <linux/module.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +#include "vm_mgr.h"
> +
> +static __must_check struct gh_vm *gh_vm_alloc(struct gh_rm *rm)
> +{
> +	struct gh_vm *ghvm;
> +
> +	ghvm = kzalloc(sizeof(*ghvm), GFP_KERNEL);
> +	if (!ghvm)
> +		return ERR_PTR(-ENOMEM);
> +
> +	ghvm->parent = gh_rm_get(rm);
> +	ghvm->rm = rm;
> +
> +	return ghvm;
> +}
> +
> +static int gh_vm_release(struct inode *inode, struct file *filp)
> +{
> +	struct gh_vm *ghvm = filp->private_data;
> +
> +	gh_rm_put(ghvm->rm);
> +	kfree(ghvm);
> +	return 0;
> +}
> +
> +static const struct file_operations gh_vm_fops = {
> +	.owner = THIS_MODULE,
> +	.release = gh_vm_release,
> +	.llseek = noop_llseek,
> +};
> +
> +static long gh_dev_ioctl_create_vm(struct gh_rm *rm, unsigned long arg)
> +{
> +	struct gh_vm *ghvm;
> +	struct file *file;
> +	int fd, err;
> +
> +	/* arg reserved for future use. */
> +	if (arg)
> +		return -EINVAL;
> +
> +	ghvm = gh_vm_alloc(rm);
> +	if (IS_ERR(ghvm))
> +		return PTR_ERR(ghvm);
> +
> +	fd = get_unused_fd_flags(O_CLOEXEC);
> +	if (fd < 0) {
> +		err = fd;
> +		goto err_destroy_vm;
> +	}
> +
> +	file = anon_inode_getfile("gunyah-vm", &gh_vm_fops, ghvm, O_RDWR);
> +	if (IS_ERR(file)) {
> +		err = PTR_ERR(file);
> +		goto err_put_fd;
> +	}
> +
> +	fd_install(fd, file);
> +
> +	return fd;
> +
> +err_put_fd:
> +	put_unused_fd(fd);
> +err_destroy_vm:
> +	gh_rm_put(ghvm->rm);
> +	kfree(ghvm);
> +	return err;
> +}
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg)
> +{
> +	switch (cmd) {
> +	case GH_CREATE_VM:
> +		return gh_dev_ioctl_create_vm(rm, arg);
> +	default:
> +		return -ENOTTY;
> +	}
> +}
> diff --git a/drivers/virt/gunyah/vm_mgr.h b/drivers/virt/gunyah/vm_mgr.h
> new file mode 100644
> index 000000000000..1e94b58d7d34
> --- /dev/null
> +++ b/drivers/virt/gunyah/vm_mgr.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _GH_VM_MGR_H
> +#define _GH_VM_MGR_H
> +
> +#include <linux/gunyah_rsc_mgr.h>
> +
> +#include <uapi/linux/gunyah.h>
> +
> +long gh_dev_vm_mgr_ioctl(struct gh_rm *rm, unsigned int cmd, unsigned long arg);
> +
> +struct gh_vm {
> +	struct gh_rm *rm;
> +	struct device *parent;
> +};
> +
> +#endif
> diff --git a/include/uapi/linux/gunyah.h b/include/uapi/linux/gunyah.h
> new file mode 100644
> index 000000000000..86b9cb60118d
> --- /dev/null
> +++ b/include/uapi/linux/gunyah.h
> @@ -0,0 +1,23 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2022-2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#ifndef _UAPI_LINUX_GUNYAH_H
> +#define _UAPI_LINUX_GUNYAH_H
> +
> +/*
> + * Userspace interface for /dev/gunyah - gunyah based virtual machine
> + */
> +
> +#include <linux/types.h>
> +#include <linux/ioctl.h>
> +
> +#define GH_IOCTL_TYPE			'G'
> +
> +/*
> + * ioctls for /dev/gunyah fds:
> + */
> +#define GH_CREATE_VM			_IO(GH_IOCTL_TYPE, 0x0) /* Returns a Gunyah VM fd */
> +
> +#endif
Srinivas Kandagatla June 6, 2023, 1:35 p.m. UTC | #15
On 09/05/2023 21:47, Elliot Berman wrote:
> Gunyah resource manager provides an API to manipulate stage 2 page tables.
> Manipulations are represented as a memory parcel. Memory parcels
> describe a list of memory regions (intermediate physical address and
> size), a list of new permissions for VMs, and the memory type (DDR or
> MMIO). Memory parcels are uniquely identified by a handle allocated by
> Gunyah. There are a few types of memory parcel sharing which Gunyah
> supports:
> 
>   - Sharing: the guest and host VM both have access
>   - Lending: only the guest has access; host VM loses access
>   - Donating: Permanently lent (not reclaimed even if guest shuts down)
> 
> Memory parcels that have been shared or lent can be reclaimed by the
> host via an additional call. The reclaim operation restores the original
> access the host VM had to the memory parcel and removes the other VM's
> access.
> 
> Note that memory parcels don't describe where in the guest VM the
> memory parcel should reside. The guest VM must accept the memory
> parcel either explicitly via a "gh_rm_mem_accept" call (not introduced
> here) or be configured to accept it automatically at boot. When the
> guest VM accepts the memory parcel, it also specifies the IPA at which
> it wants to place the memory parcel.
> 
> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@quicinc.com>
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---




Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>

--srini
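To make the share/lend distinction in the commit message concrete, here is
a small userspace sketch of how a caller might populate the ACL for a
parcel. It is an assumption about intended usage, not code from the
series: build_acl() and struct acl_entry are hypothetical names, the
permission bits mirror GH_RM_ACL_{X,W,R} from the header hunk below, and
the little-endian conversion that the real __le16 vmid field needs is
left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit positions as GH_RM_ACL_X/W/R in the patch. */
#define ACL_X	(1u << 0)
#define ACL_W	(1u << 1)
#define ACL_R	(1u << 2)

/* Host-endian stand-in for struct gh_rm_mem_acl_entry. */
struct acl_entry {
	uint16_t vmid;
	uint8_t perms;
	uint8_t reserved;
};

/*
 * Hypothetical helper: fill @acl and return the number of entries used.
 * Lending lists only the guest; sharing lists the guest and the host,
 * so both keep access while the parcel is shared.
 */
static size_t build_acl(struct acl_entry acl[2], uint16_t host_vmid,
			uint16_t guest_vmid, uint8_t guest_perms, bool lend)
{
	size_t n = 0;

	acl[n++] = (struct acl_entry){ .vmid = guest_vmid, .perms = guest_perms };

	if (!lend)
		acl[n++] = (struct acl_entry){
			.vmid = host_vmid,
			.perms = ACL_R | ACL_W | ACL_X,
		};

	return n;
}
```

The result would feed the n_acl_entries and acl_entries fields of
struct gh_rm_mem_parcel; gh_rm_mem_lend_common() rejects an empty ACL or
one with more than U8_MAX entries.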

>   drivers/virt/gunyah/rsc_mgr_rpc.c | 227 ++++++++++++++++++++++++++++++
>   include/linux/gunyah_rsc_mgr.h    |  48 +++++++
>   2 files changed, 275 insertions(+)
> 
> diff --git a/drivers/virt/gunyah/rsc_mgr_rpc.c b/drivers/virt/gunyah/rsc_mgr_rpc.c
> index a4a9f0ba4e1f..4f25f07400b3 100644
> --- a/drivers/virt/gunyah/rsc_mgr_rpc.c
> +++ b/drivers/virt/gunyah/rsc_mgr_rpc.c
> @@ -6,6 +6,12 @@
>   #include <linux/gunyah_rsc_mgr.h>
>   #include "rsc_mgr.h"
>   
> +/* Message IDs: Memory Management */
> +#define GH_RM_RPC_MEM_LEND			0x51000012
> +#define GH_RM_RPC_MEM_SHARE			0x51000013
> +#define GH_RM_RPC_MEM_RECLAIM			0x51000015
> +#define GH_RM_RPC_MEM_APPEND			0x51000018
> +
>   /* Message IDs: VM Management */
>   #define GH_RM_RPC_VM_ALLOC_VMID			0x56000001
>   #define GH_RM_RPC_VM_DEALLOC_VMID		0x56000002
> @@ -22,6 +28,46 @@ struct gh_rm_vm_common_vmid_req {
>   	__le16 _padding;
>   } __packed;
>   
> +/* Call: MEM_LEND, MEM_SHARE */
> +#define GH_MEM_SHARE_REQ_FLAGS_APPEND		BIT(1)
> +
> +struct gh_rm_mem_share_req_header {
> +	u8 mem_type;
> +	u8 _padding0;
> +	u8 flags;
> +	u8 _padding1;
> +	__le32 label;
> +} __packed;
> +
> +struct gh_rm_mem_share_req_acl_section {
> +	__le32 n_entries;
> +	struct gh_rm_mem_acl_entry entries[];
> +};
> +
> +struct gh_rm_mem_share_req_mem_section {
> +	__le16 n_entries;
> +	__le16 _padding;
> +	struct gh_rm_mem_entry entries[];
> +};
> +
> +/* Call: MEM_RELEASE */
> +struct gh_rm_mem_release_req {
> +	__le32 mem_handle;
> +	u8 flags; /* currently not used */
> +	u8 _padding0;
> +	__le16 _padding1;
> +} __packed;
> +
> +/* Call: MEM_APPEND */
> +#define GH_MEM_APPEND_REQ_FLAGS_END		BIT(0)
> +
> +struct gh_rm_mem_append_req_header {
> +	__le32 mem_handle;
> +	u8 flags;
> +	u8 _padding0;
> +	__le16 _padding1;
> +} __packed;
> +
>   /* Call: VM_ALLOC */
>   struct gh_rm_vm_alloc_vmid_resp {
>   	__le16 vmid;
> @@ -51,6 +97,8 @@ struct gh_rm_vm_config_image_req {
>   	__le64 dtb_size;
>   } __packed;
>   
> +#define GH_RM_MAX_MEM_ENTRIES	512
> +
>   /*
>    * Several RM calls take only a VMID as a parameter and give only standard
>    * response back. Deduplicate boilerplate code by using this common call.
> @@ -64,6 +112,185 @@ static int gh_rm_common_vmid_call(struct gh_rm *rm, u32 message_id, u16 vmid)
>   	return gh_rm_call(rm, message_id, &req_payload, sizeof(req_payload), NULL, NULL);
>   }
>   
> +static int _gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle, bool end_append,
> +			struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
> +{
> +	struct gh_rm_mem_share_req_mem_section *mem_section;
> +	struct gh_rm_mem_append_req_header *req_header;
> +	size_t msg_size = 0;
> +	void *msg;
> +	int ret;
> +
> +	msg_size += sizeof(struct gh_rm_mem_append_req_header);
> +	msg_size += struct_size(mem_section, entries, n_mem_entries);
> +
> +	msg = kzalloc(msg_size, GFP_KERNEL);
> +	if (!msg)
> +		return -ENOMEM;
> +
> +	req_header = msg;
> +	mem_section = (void *)req_header + sizeof(struct gh_rm_mem_append_req_header);
> +
> +	req_header->mem_handle = cpu_to_le32(mem_handle);
> +	if (end_append)
> +		req_header->flags |= GH_MEM_APPEND_REQ_FLAGS_END;
> +
> +	mem_section->n_entries = cpu_to_le16(n_mem_entries);
> +	memcpy(mem_section->entries, mem_entries, sizeof(*mem_entries) * n_mem_entries);
> +
> +	ret = gh_rm_call(rm, GH_RM_RPC_MEM_APPEND, msg, msg_size, NULL, NULL);
> +	kfree(msg);
> +
> +	return ret;
> +}
> +
> +static int gh_rm_mem_append(struct gh_rm *rm, u32 mem_handle,
> +			struct gh_rm_mem_entry *mem_entries, size_t n_mem_entries)
> +{
> +	bool end_append;
> +	int ret = 0;
> +	size_t n;
> +
> +	while (n_mem_entries) {
> +		if (n_mem_entries > GH_RM_MAX_MEM_ENTRIES) {
> +			end_append = false;
> +			n = GH_RM_MAX_MEM_ENTRIES;
> +		} else {
> +			end_append = true;
> +			n = n_mem_entries;
> +		}
> +
> +		ret = _gh_rm_mem_append(rm, mem_handle, end_append, mem_entries, n);
> +		if (ret)
> +			break;
> +
> +		mem_entries += n;
> +		n_mem_entries -= n;
> +	}
> +
> +	return ret;
> +}
> +
> +static int gh_rm_mem_lend_common(struct gh_rm *rm, u32 message_id, struct gh_rm_mem_parcel *p)
> +{
> +	size_t msg_size = 0, initial_mem_entries = p->n_mem_entries, resp_size;
> +	size_t acl_section_size, mem_section_size;
> +	struct gh_rm_mem_share_req_acl_section *acl_section;
> +	struct gh_rm_mem_share_req_mem_section *mem_section;
> +	struct gh_rm_mem_share_req_header *req_header;
> +	u32 *attr_section;
> +	__le32 *resp;
> +	void *msg;
> +	int ret;
> +
> +	if (!p->acl_entries || !p->n_acl_entries || !p->mem_entries || !p->n_mem_entries ||
> +	    p->n_acl_entries > U8_MAX || p->mem_handle != GH_MEM_HANDLE_INVAL)
> +		return -EINVAL;
> +
> +	if (initial_mem_entries > GH_RM_MAX_MEM_ENTRIES)
> +		initial_mem_entries = GH_RM_MAX_MEM_ENTRIES;
> +
> +	acl_section_size = struct_size(acl_section, entries, p->n_acl_entries);
> +	mem_section_size = struct_size(mem_section, entries, initial_mem_entries);
> +	/* The format of the message goes:
> +	 * request header
> +	 * ACL entries (which VMs get what kind of access to this memory parcel)
> +	 * Memory entries (list of memory regions to share)
> +	 * Memory attributes (currently unused, we'll hard-code the size to 0)
> +	 */
> +	msg_size += sizeof(struct gh_rm_mem_share_req_header);
> +	msg_size += acl_section_size;
> +	msg_size += mem_section_size;
> +	msg_size += sizeof(u32); /* for memory attributes, currently unused */
> +
> +	msg = kzalloc(msg_size, GFP_KERNEL);
> +	if (!msg)
> +		return -ENOMEM;
> +
> +	req_header = msg;
> +	acl_section = (void *)req_header + sizeof(*req_header);
> +	mem_section = (void *)acl_section + acl_section_size;
> +	attr_section = (void *)mem_section + mem_section_size;
> +
> +	req_header->mem_type = p->mem_type;
> +	if (initial_mem_entries != p->n_mem_entries)
> +		req_header->flags |= GH_MEM_SHARE_REQ_FLAGS_APPEND;
> +	req_header->label = cpu_to_le32(p->label);
> +
> +	acl_section->n_entries = cpu_to_le32(p->n_acl_entries);
> +	memcpy(acl_section->entries, p->acl_entries,
> +		flex_array_size(acl_section, entries, p->n_acl_entries));
> +
> +	mem_section->n_entries = cpu_to_le16(initial_mem_entries);
> +	memcpy(mem_section->entries, p->mem_entries,
> +		flex_array_size(mem_section, entries, initial_mem_entries));
> +
> +	/* Set n_entries for memory attribute section to 0 */
> +	*attr_section = 0;
> +
> +	ret = gh_rm_call(rm, message_id, msg, msg_size, (void **)&resp, &resp_size);
> +	kfree(msg);
> +
> +	if (ret)
> +		return ret;
> +
> +	p->mem_handle = le32_to_cpu(*resp);
> +	kfree(resp);
> +
> +	if (initial_mem_entries != p->n_mem_entries) {
> +		ret = gh_rm_mem_append(rm, p->mem_handle,
> +					&p->mem_entries[initial_mem_entries],
> +					p->n_mem_entries - initial_mem_entries);
> +		if (ret) {
> +			gh_rm_mem_reclaim(rm, p);
> +			p->mem_handle = GH_MEM_HANDLE_INVAL;
> +		}
> +	}
> +
> +	return ret;
> +}
> +
> +/**
> + * gh_rm_mem_lend() - Lend memory to other virtual machines.
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Information about the memory to be lent.
> + *
> + * Lending removes Linux's access to the memory while the memory parcel is lent.
> + */
> +int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> +	return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_LEND, parcel);
> +}
> +
> +
> +/**
> + * gh_rm_mem_share() - Share memory with other virtual machines.
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Information about the memory to be shared.
> + *
> + * Sharing keeps Linux's access to the memory while the memory parcel is shared.
> + */
> +int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> +	return gh_rm_mem_lend_common(rm, GH_RM_RPC_MEM_SHARE, parcel);
> +}
> +
> +/**
> + * gh_rm_mem_reclaim() - Reclaim a memory parcel
> + * @rm: Handle to a Gunyah resource manager
> + * @parcel: Information about the memory to be reclaimed.
> + *
> + * RM maps the associated memory back into the stage-2 page tables of the owner VM.
> + */
> +int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel)
> +{
> +	struct gh_rm_mem_release_req req = {
> +		.mem_handle = cpu_to_le32(parcel->mem_handle),
> +	};
> +
> +	return gh_rm_call(rm, GH_RM_RPC_MEM_RECLAIM, &req, sizeof(req), NULL, NULL);
> +}
> +
>   /**
>    * gh_rm_alloc_vmid() - Allocate a new VM in Gunyah. Returns the VM identifier.
>    * @rm: Handle to a Gunyah resource manager
> diff --git a/include/linux/gunyah_rsc_mgr.h b/include/linux/gunyah_rsc_mgr.h
> index 1ac66d9004d2..dfac088420bd 100644
> --- a/include/linux/gunyah_rsc_mgr.h
> +++ b/include/linux/gunyah_rsc_mgr.h
> @@ -11,6 +11,7 @@
>   #include <linux/gunyah.h>
>   
>   #define GH_VMID_INVAL		U16_MAX
> +#define GH_MEM_HANDLE_INVAL	U32_MAX
>   
>   struct gh_rm;
>   int gh_rm_notifier_register(struct gh_rm *rm, struct notifier_block *nb);
> @@ -51,7 +52,54 @@ struct gh_rm_vm_status_payload {
>   
>   #define GH_RM_NOTIFICATION_VM_STATUS		 0x56100008
>   
> +#define GH_RM_ACL_X		BIT(0)
> +#define GH_RM_ACL_W		BIT(1)
> +#define GH_RM_ACL_R		BIT(2)
> +
> +struct gh_rm_mem_acl_entry {
> +	__le16 vmid;
> +	u8 perms;
> +	u8 reserved;
> +} __packed;
> +
> +struct gh_rm_mem_entry {
> +	__le64 phys_addr;
> +	__le64 size;
> +} __packed;
> +
> +enum gh_rm_mem_type {
> +	GH_RM_MEM_TYPE_NORMAL	= 0,
> +	GH_RM_MEM_TYPE_IO	= 1,
> +};
> +
> +/*
> + * struct gh_rm_mem_parcel - Info about memory to be lent/shared/donated/reclaimed
> + * @mem_type: The type of memory: normal (DDR) or IO
> + * @label: A client-specified identifier which can be used by the other VMs to identify the purpose
> + *         of the memory parcel.
> + * @n_acl_entries: Count of the number of entries in the @acl_entries array.
> + * @acl_entries: An array of access control entries. Each entry specifies a VM and what access
> + *               is allowed for the memory parcel.
> + * @n_mem_entries: Count of the number of entries in the @mem_entries array.
> + * @mem_entries: An array of regions to be associated with the memory parcel. Addresses should be
> + *               (intermediate) physical addresses from Linux's perspective.
> + * @mem_handle: On success, filled with memory handle that RM allocates for this memory parcel
> + */
> +struct gh_rm_mem_parcel {
> +	enum gh_rm_mem_type mem_type;
> +	u32 label;
> +	size_t n_acl_entries;
> +	struct gh_rm_mem_acl_entry *acl_entries;
> +	size_t n_mem_entries;
> +	struct gh_rm_mem_entry *mem_entries;
> +	u32 mem_handle;
> +};
> +
>   /* RPC Calls */
> +int gh_rm_mem_lend(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +int gh_rm_mem_share(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +int gh_rm_mem_reclaim(struct gh_rm *rm, struct gh_rm_mem_parcel *parcel);
> +
>   int gh_rm_alloc_vmid(struct gh_rm *rm, u16 vmid);
>   int gh_rm_dealloc_vmid(struct gh_rm *rm, u16 vmid);
>   int gh_rm_vm_reset(struct gh_rm *rm, u16 vmid);
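The batching in gh_rm_mem_lend_common() and gh_rm_mem_append() above is
easier to follow as a plain calculation. The sketch below is a userspace
model with hypothetical names (plan_chunks(), struct chunk_plan); it only
counts entries per message and does not build or send any RPC.

```c
#include <stddef.h>

/* Mirrors GH_RM_MAX_MEM_ENTRIES from the patch. */
#define MAX_MEM_ENTRIES	512

struct chunk_plan {
	size_t initial;		/* entries carried by the LEND/SHARE request */
	size_t appends;		/* number of MEM_APPEND calls that follow */
	size_t last_append;	/* entries in the final MEM_APPEND, 0 if none */
};

/*
 * Hypothetical model of the split: the first request carries up to
 * MAX_MEM_ENTRIES entries, and the remainder goes out in MEM_APPEND
 * calls of at most MAX_MEM_ENTRIES entries each. In the driver, only the
 * last append sets GH_MEM_APPEND_REQ_FLAGS_END.
 */
static struct chunk_plan plan_chunks(size_t n_mem_entries)
{
	struct chunk_plan plan = { 0 };
	size_t remaining;

	plan.initial = n_mem_entries > MAX_MEM_ENTRIES ?
			MAX_MEM_ENTRIES : n_mem_entries;
	remaining = n_mem_entries - plan.initial;

	while (remaining) {
		size_t n = remaining > MAX_MEM_ENTRIES ?
				MAX_MEM_ENTRIES : remaining;

		plan.appends++;
		plan.last_append = n;
		remaining -= n;
	}

	return plan;
}
```

For example, 1500 entries become one request of 512 entries followed by
appends of 512 and 476 entries. The driver itself rejects a parcel with
zero memory entries before any of this runs.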
Elliot Berman June 13, 2023, 11:02 p.m. UTC | #16
On 5/24/2023 10:13 AM, Alex Bennée wrote:
> 
> Elliot Berman <quic_eberman@quicinc.com> writes:
> 


snip

>    Applying: mailbox: pcc: Use mbox_bind_client
> 
> 
> <snip>
>>
>> Elliot Berman (24):
> <snip>
> 
>>    mailbox: Add Gunyah message queue mailbox
> 
> This patch touches a file that isn't in mainline which makes me wonder
> if I've missed another pre-requisite patch?
> 

The v13 series was missing this patch: 
https://lore.kernel.org/all/20230613172054.3959700-2-quic_eberman@quicinc.com/ 
(which was present in every recent series). Apologies for that!

The v14 series applies cleanly on v6.4-rc6 (and should apply on other 
recent tags, too).

b4 am 
https://lore.kernel.org/all/20230613172054.3959700-1-quic_eberman@quicinc.com/


> <snip>
>>   Documentation/virt/gunyah/message-queue.rst   |   8 +
> <snip>
>
Elliot Berman June 22, 2023, 11:56 p.m. UTC | #17
On 6/7/2023 8:54 AM, Elliot Berman wrote:
> 
> 
> On 6/5/2023 7:18 AM, Will Deacon wrote:
>> Hi Elliot,
>>
>> [+Quentin since he's looked at the MMU notifiers]
>>
>> Sorry for the slow response, I got buried in email during a week away.
>>
>> On Fri, May 19, 2023 at 10:02:29AM -0700, Elliot Berman wrote:
>>> On 5/19/2023 4:59 AM, Will Deacon wrote:
>>>> On Tue, May 09, 2023 at 01:47:47PM -0700, Elliot Berman wrote:
>>>>> +    ret = account_locked_vm(ghvm->mm, mapping->npages, true);
>>>>> +    if (ret)
>>>>> +        goto free_mapping;
>>>>> +
>>>>> +    mapping->pages = kcalloc(mapping->npages, 
>>>>> sizeof(*mapping->pages), GFP_KERNEL_ACCOUNT);
>>>>> +    if (!mapping->pages) {
>>>>> +        ret = -ENOMEM;
>>>>> +        mapping->npages = 0; /* update npages for reclaim */
>>>>> +        goto unlock_pages;
>>>>> +    }
>>>>> +
>>>>> +    gup_flags = FOLL_LONGTERM;
>>>>> +    if (region->flags & GH_MEM_ALLOW_WRITE)
>>>>> +        gup_flags |= FOLL_WRITE;
>>>>> +
>>>>> +    pinned = pin_user_pages_fast(region->userspace_addr, 
>>>>> mapping->npages,
>>>>> +                    gup_flags, mapping->pages);
>>>>> +    if (pinned < 0) {
>>>>> +        ret = pinned;
>>>>> +        goto free_pages;
>>>>> +    } else if (pinned != mapping->npages) {
>>>>> +        ret = -EFAULT;
>>>>> +        mapping->npages = pinned; /* update npages for reclaim */
>>>>> +        goto unpin_pages;
>>>>> +    }
>>>>
>>>> Sorry if I missed it, but I still don't see where you reject file 
>>>> mappings
>>>> here.
>>>>
>>>
>>> Sure, I can reject file mappings. I didn't catch that was the ask 
>>> previously
>>> and thought it was only a comment about behavior of file mappings.
>>
>> I thought the mention of filesystem corruption was clear enough! It's
>> definitely something we shouldn't allow.
>>
>>>> This is also the wrong interface for upstream. Please get involved with
>>>> the fd-based guest memory discussions [1] and port your series to that.
>>>>
>>>
>>> The user interface design for *shared* memory aligns with
>>> KVM_SET_USER_MEMORY_REGION.
>>
>> I don't think it does. For example, file mappings don't work (as above),
>> you're placing additional rlimit requirements on the caller, read-only
>> memslots are not functional, the memory cannot be swapped or migrated,
>> dirty logging doesn't work etc. pKVM is in the same boat, but that's why
>> we're not upstreaming this part in its current form.
>>
> 
> I thought pKVM was only holding off on upstreaming changes related to 
> guest-private memory?
> 
>>> I understood we want to use restricted memfd for giving guest-private 
>>> memory
>>> (Gunyah calls this "lending memory"). When I went through the changes, I
>>> gathered KVM is using restricted memfd only for guest-private memory 
>>> and not
>>> for shared memory. Thus, I dropped support for lending memory to the 
>>> guest
>>> VM and only retained the shared memory support in this series. I'd 
>>> like to
>>> merge what we can today and introduce the guest-private memory 
>>> support in
>>> tandem with the restricted memfd; I don't see much reason to delay the
>>> series.
>>
>> Right, protected guests will use the new restricted memfd ("guest mem"
>> now, I think?), but non-protected guests should implement the existing
>> interface *without* the need for the GUP pin on guest memory pages. Yes,
>> that means full support for MMU notifiers so that these pages can be
>> managed properly by the host kernel. We're working on that for pKVM, but
>> it requires a more flexible form of memory sharing over what we currently
>> have so that e.g. the zero page can be shared between multiple entities.
> 
> Gunyah doesn't support swapping pages out while the guest is running and 
> the design of Gunyah isn't made to give host kernel full control over 
> the S2 page table for its guests. As best I can tell from reading the 
> respective drivers, ACRN and Nitro Enclaves both GUP pin guest memory 
> pages prior to giving them to the guest, so I don't think this 
> requirement from Gunyah is particularly unusual.
> 

I dug into MMU notifiers more and I don't think they match Gunyah's 
features today. We don't allow the host to freely manage a VM's pages 
because that would require the guest VM to place a level of trust in the 
host. Once a page is given to the guest, it stays with the guest for the 
lifetime of the VM. Allowing the host to replace pages in the guest 
memory map isn't part of the security model of any VM that we run in 
Gunyah. Given that requirement, longterm pinning looks like the correct 
approach to me.

Thanks,
Elliot
Elliot Berman July 13, 2023, 8:28 p.m. UTC | #18
Hi Will,

On 6/22/2023 4:56 PM, Elliot Berman wrote:
> 
> 
> On 6/7/2023 8:54 AM, Elliot Berman wrote:
>>
>>
>> On 6/5/2023 7:18 AM, Will Deacon wrote:
>>> Hi Elliot,
>>>
>>> [+Quentin since he's looked at the MMU notifiers]
>>>
>>> Sorry for the slow response, I got buried in email during a week away.
>>>
>>> On Fri, May 19, 2023 at 10:02:29AM -0700, Elliot Berman wrote:
>>>> On 5/19/2023 4:59 AM, Will Deacon wrote:
>>>>> On Tue, May 09, 2023 at 01:47:47PM -0700, Elliot Berman wrote:
>>>>>> +    ret = account_locked_vm(ghvm->mm, mapping->npages, true);
>>>>>> +    if (ret)
>>>>>> +        goto free_mapping;
>>>>>> +
>>>>>> +    mapping->pages = kcalloc(mapping->npages, 
>>>>>> sizeof(*mapping->pages), GFP_KERNEL_ACCOUNT);
>>>>>> +    if (!mapping->pages) {
>>>>>> +        ret = -ENOMEM;
>>>>>> +        mapping->npages = 0; /* update npages for reclaim */
>>>>>> +        goto unlock_pages;
>>>>>> +    }
>>>>>> +
>>>>>> +    gup_flags = FOLL_LONGTERM;
>>>>>> +    if (region->flags & GH_MEM_ALLOW_WRITE)
>>>>>> +        gup_flags |= FOLL_WRITE;
>>>>>> +
>>>>>> +    pinned = pin_user_pages_fast(region->userspace_addr, 
>>>>>> mapping->npages,
>>>>>> +                    gup_flags, mapping->pages);
>>>>>> +    if (pinned < 0) {
>>>>>> +        ret = pinned;
>>>>>> +        goto free_pages;
>>>>>> +    } else if (pinned != mapping->npages) {
>>>>>> +        ret = -EFAULT;
>>>>>> +        mapping->npages = pinned; /* update npages for reclaim */
>>>>>> +        goto unpin_pages;
>>>>>> +    }
>>>>>
>>>>> Sorry if I missed it, but I still don't see where you reject file 
>>>>> mappings
>>>>> here.
>>>>>
>>>>
>>>> Sure, I can reject file mappings. I didn't catch that was the ask 
>>>> previously
>>>> and thought it was only a comment about behavior of file mappings.
>>>
>>> I thought the mention of filesystem corruption was clear enough! It's
>>> definitely something we shouldn't allow.
>>>
>>>>> This is also the wrong interface for upstream. Please get involved 
>>>>> with
>>>>> the fd-based guest memory discussions [1] and port your series to 
>>>>> that.
>>>>>
>>>>
>>>> The user interface design for *shared* memory aligns with
>>>> KVM_SET_USER_MEMORY_REGION.
>>>
>>> I don't think it does. For example, file mappings don't work (as above),
>>> you're placing additional rlimit requirements on the caller, read-only
>>> memslots are not functional, the memory cannot be swapped or migrated,
>>> dirty logging doesn't work etc. pKVM is in the same boat, but that's why
>>> we're not upstreaming this part in its current form.
>>>
>>
>> I thought pKVM was only holding off on upstreaming changes related to 
>> guest-private memory?
>>
>>>> I understood we want to use restricted memfd for giving 
>>>> guest-private memory
>>>> (Gunyah calls this "lending memory"). When I went through the 
>>>> changes, I
>>>> gathered KVM is using restricted memfd only for guest-private memory 
>>>> and not
>>>> for shared memory. Thus, I dropped support for lending memory to the 
>>>> guest
>>>> VM and only retained the shared memory support in this series. I'd 
>>>> like to
>>>> merge what we can today and introduce the guest-private memory 
>>>> support in
>>>> tandem with the restricted memfd; I don't see much reason to delay the
>>>> series.
>>>
>>> Right, protected guests will use the new restricted memfd ("guest mem"
>>> now, I think?), but non-protected guests should implement the existing
>>> interface *without* the need for the GUP pin on guest memory pages. Yes,
>>> that means full support for MMU notifiers so that these pages can be
>>> managed properly by the host kernel. We're working on that for pKVM, but
>>> it requires a more flexible form of memory sharing over what we
>>> currently have so that e.g. the zero page can be shared between
>>> multiple entities.
>>
>> Gunyah doesn't support swapping pages out while the guest is running 
>> and the design of Gunyah isn't made to give host kernel full control 
>> over the S2 page table for its guests. As best I can tell from reading 
>> the respective drivers, ACRN and Nitro Enclaves both GUP pin guest 
>> memory pages prior to giving them to the guest, so I don't think this 
>> requirement from Gunyah is particularly unusual.
>>
> 
> I read/dug into mmu notifiers more and I don't think it matches with 
> Gunyah's features today. We don't allow the host to freely manage VM's 
> pages because it requires the guest VM to have a level of trust on the 
> host. Once a page is given to the guest, it's done for the lifetime of 
> the VM. Allowing the host to replace pages in the guest memory map isn't 
> part of any VM's security model that we run in Gunyah. With that 
> requirement, longterm pinning looks like the correct approach to me.

Is my approach of longterm pinning correct given that Gunyah doesn't 
allow host to freely swap pages?
Will Deacon July 20, 2023, 10:39 a.m. UTC | #19
On Tue, Jul 18, 2023 at 07:28:49PM -0700, Elliot Berman wrote:
> On 7/14/2023 5:13 AM, Will Deacon wrote:
> > On Thu, Jul 13, 2023 at 01:28:34PM -0700, Elliot Berman wrote:
> > > On 6/22/2023 4:56 PM, Elliot Berman wrote:
> > > > On 6/7/2023 8:54 AM, Elliot Berman wrote:
> > > > > On 6/5/2023 7:18 AM, Will Deacon wrote:
> > > > > > Right, protected guests will use the new restricted memfd ("guest mem"
> > > > > > now, I think?), but non-protected guests should implement the existing
> > > > > > interface *without* the need for the GUP pin on guest memory pages. Yes,
> > > > > > that means full support for MMU notifiers so that these pages can be
> > > > > > managed properly by the host kernel. We're working on that for pKVM, but
> > > > > > > it requires a more flexible form of memory sharing over what we
> > > > > > > currently have so that e.g. the zero page can be shared between
> > > > > > > multiple entities.
> > > > > 
> > > > > Gunyah doesn't support swapping pages out while the guest is running
> > > > > and the design of Gunyah isn't made to give host kernel full control
> > > > > over the S2 page table for its guests. As best I can tell from
> > > > > reading the respective drivers, ACRN and Nitro Enclaves both GUP pin
> > > > > guest memory pages prior to giving them to the guest, so I don't
> > > > > think this requirement from Gunyah is particularly unusual.
> > > > > 
> > > > 
> > > > I read/dug into mmu notifiers more and I don't think it matches with
> > > > Gunyah's features today. We don't allow the host to freely manage VM's
> > > > pages because it requires the guest VM to have a level of trust on the
> > > > host. Once a page is given to the guest, it's done for the lifetime of
> > > > the VM. Allowing the host to replace pages in the guest memory map isn't
> > > > part of any VM's security model that we run in Gunyah. With that
> > > > requirement, longterm pinning looks like the correct approach to me.
> > > 
> > > Is my approach of longterm pinning correct given that Gunyah doesn't allow
> > > host to freely swap pages?
> > 
> > No, I really don't think a longterm GUP pin is the right approach for this.
> > GUP pins in general are horrible for the mm layer, but required for cases
> > such as DMA where I/O faults are unrecoverable. Gunyah is not a good
> > justification for such a hack, and I don't think you get to choose which
> > parts of the Linux mm you want and which bits you don't.
> > 
> > In other words, either carve out your memory and pin it that way, or
> > implement the proper hooks for the mm to do its job.
> 
> I talked to the team about whether we can extend the Gunyah support for
> this. We have plans to support sharing/lending individual pages when the
> guest faults on them. The support also allows (unprotected) pages to be
> removed from the VM. We'll need to temporarily pin the pages of the VM
> configuration device tree blob while the VM is being created and those pages
> can be unpinned once the VM starts. I'll work on this.

That's pleasantly unexpected, thanks for pursuing this!

Will