From patchwork Tue Sep 19 11:11:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 724459 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 23E1E358BE for ; Tue, 19 Sep 2023 11:12:51 +0000 (UTC) Received: from mailgw01.mediatek.com (unknown [60.244.123.138]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A6FBA9; Tue, 19 Sep 2023 04:12:48 -0700 (PDT) X-UUID: 731255ce56dd11eea33bb35ae8d461a2-20230919 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=u4uvAih1mmbjI9Ma5fs3AGlxb2UV23DoiIZOQONCd2g=; b=OGq7KsVZi3NuSuIk6bZtSGiHsahTf56+4c7stLhzkmoUJ0X2b+6NOncuJTxOVWokfuQg/Mi8eMhCLM8zy/eeHG/Qw+eWcwsnl/Y9NrWg8iK97uIlT3BpoC5HtwcDVo1r2jiufKnDxw1gJzmrnVKNKPo+MTFZvZA72yfUJn8iiAQ=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.32, REQID:e514c4f8-b595-4136-8eeb-8051d03f9613, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:0,FILE:0,BULK:0,RULE:Release_Ham,ACTIO N:release,TS:-25 X-CID-META: VersionHash:5f78ec9, CLOUDID:68581e14-4929-4845-9571-38c601e9c3c9, B ulkID:nil,BulkQuantity:0,Recheck:0,SF:102,TC:nil,Content:0,EDM:-3,IP:nil,U RL:11|1,File:nil,Bulk:nil,QS:nil,BEC:nil,COL:0,OSI:0,OSA:0,AV:0,LES:1,SPR: NO,DKR:0,DKP:0,BRR:0,BRE:0 X-CID-BVR: 0 X-CID-BAS: 0,_,0,_ X-CID-FACTOR: TF_CID_SPAM_ULN,TF_CID_SPAM_SNR X-UUID: 731255ce56dd11eea33bb35ae8d461a2-20230919 Received: from mtkmbs14n2.mediatek.inc [(172.21.101.76)] by mailgw01.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 2125183773; Tue, 19 Sep 2023 19:12:41 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by mtkmbs10n2.mediatek.inc (172.21.101.183) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Tue, 19 Sep 2023 19:12:39 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Tue, 19 Sep 2023 19:12:39 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Matthias Brugger , AngeloGioacchino Del Regno CC: Arnd Bergmann , , , , , , David Bradil , Trilok Soni , Jade Shih , Ivan Tseng , "My Chuang" , Kevenny Hsieh , Willix Yeh , Liju Chen Subject: [PATCH v6 03/15] virt: geniezone: Add GenieZone hypervisor driver Date: Tue, 19 Sep 2023 19:11:58 +0800 Message-ID: <20230919111210.19615-4-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230919111210.19615-1-yi-de.wu@mediatek.com> References: <20230919111210.19615-1-yi-de.wu@mediatek.com> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-Product-Ver: SMEX-14.0.0.3152-9.1.1006-23728.005 X-TM-AS-Result: No-10--15.609800-8.000000 X-TMASE-MatchedRID: +ewtXJwydIDPwZWTvltloQI0yP/uoH+DUAjrAJWsTe9mU223IIioZfb9 MQK6DQClvfKrrb4bmIqy7ec+ITUwM2eIEG00SdU9drnuu4cCcfF/aDoolm3GXWJkJOQVCIpwMKw CZ7huGiG36SL29gBZ5pCCPgbCpGAQEx7gYK5Baw8mZusHWPhfCgXBq8VnFhCkGoH7Aor25l4faH aH7SYxz5w9wMcKngv65JZWpbmrOY42fA1oT3w9vBes/RxhysDbO69hrW/YgWHRziMbBeTI+aFzV gEW/VGJb770XLgf6w8OVu3603NOrQJ1B5nuoWIZcfsdX+Y7hRO0NJ9wxH7tk5aD0y7qnNOVLYBc 3ZAt7MvQCKqxFMDCL3OL9OTtf6MNNb4r89y+oGvRfDQgu+j+5SlayzmQ9QV0Fp7kniXxovPH4d4 MsOPqitm7+6JBp8fEN3vA/dbJqUbvk9E0156d6HuzDvI75j0sWwKGivsEuI1aW2Ktn+I8/l0E/8 jA12YdJmL4F1ztDueAMuqetGVetnyef22ep6XYkGUtrowrXLg= X-TM-AS-User-Approved-Sender: No X-TM-AS-User-Blocked-Sender: No X-TMASE-Result: 10--15.609800-8.000000 X-TMASE-Version: SMEX-14.0.0.3152-9.1.1006-23728.005 X-TM-SNTS-SMTP: 98CD4CECFE57F50845CC648E5F61DC9CEDC527F538FEAE5DDC03F42100F852D82000:8 X-MTK: N X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_PASS,SPF_PASS,UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net From: "Yingshiuan Pan" GenieZone hypervisor(gzvm) is a type-1 hypervisor that supports various virtual machine types and provides security features such as TEE-like scenarios and secure boot. It can create guest VMs for security use cases and has virtualization capabilities for both platform and interrupt. Although the hypervisor can be booted independently, it requires the assistance of GenieZone hypervisor kernel driver(gzvm-ko) to leverage the ability of Linux kernel for vCPU scheduling, memory management, inter-VM communication and virtio backend support. Add the basic hypervisor driver. Subsequent patches will add more supported features to this driver. Signed-off-by: Yingshiuan Pan Signed-off-by: Jerry Wang Signed-off-by: Liju Chen Signed-off-by: Yi-De Wu --- MAINTAINERS | 3 + arch/arm64/Kbuild | 1 + arch/arm64/geniezone/Makefile | 9 +++ arch/arm64/geniezone/gzvm_arch_common.h | 41 ++++++++++++ arch/arm64/geniezone/vm.c | 22 ++++++ drivers/virt/Kconfig | 2 + drivers/virt/geniezone/Kconfig | 16 +++++ drivers/virt/geniezone/Makefile | 10 +++ drivers/virt/geniezone/gzvm_main.c | 89 +++++++++++++++++++++++++ include/linux/gzvm_drv.h | 25 +++++++ 10 files changed, 218 insertions(+) create mode 100644 arch/arm64/geniezone/Makefile create mode 100644 arch/arm64/geniezone/gzvm_arch_common.h create mode 100644 arch/arm64/geniezone/vm.c create mode 100644 drivers/virt/geniezone/Kconfig create mode 100644 drivers/virt/geniezone/Makefile create mode 100644 drivers/virt/geniezone/gzvm_main.c create mode 100644 include/linux/gzvm_drv.h diff --git a/MAINTAINERS b/MAINTAINERS index c4f7ed6228f2..1a9890520566 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8803,6 +8803,9 @@ M: Ze-Yu Wang M: Yi-De Wu F: Documentation/devicetree/bindings/hypervisor/mediatek,geniezone-hyp.yaml F: Documentation/virt/geniezone/ +F: arch/arm64/geniezone/ +F: drivers/virt/geniezone/ +F: include/linux/gzvm_drv.h GENWQE (IBM Generic Workqueue Card) M: Frank Haverkamp diff --git a/arch/arm64/Kbuild b/arch/arm64/Kbuild index 5bfbf7d79c99..0c3cca572919 100644 --- a/arch/arm64/Kbuild +++ b/arch/arm64/Kbuild @@ -4,6 +4,7 @@ obj-$(CONFIG_KVM) += kvm/ obj-$(CONFIG_XEN) += xen/ obj-$(subst m,y,$(CONFIG_HYPERV)) += hyperv/ obj-$(CONFIG_CRYPTO) += crypto/ +obj-$(CONFIG_MTK_GZVM) += geniezone/ # for cleaning subdir- += boot diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile new file mode 100644 index 000000000000..2957898cdd05 --- /dev/null +++ b/arch/arm64/geniezone/Makefile @@ -0,0 +1,9 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Main Makefile for gzvm, this one includes drivers/virt/geniezone/Makefile +# +include $(srctree)/drivers/virt/geniezone/Makefile + +gzvm-y += vm.o + +obj-$(CONFIG_MTK_GZVM) += gzvm.o diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h new file mode 100644 index 000000000000..402f5ef7dfb4 --- /dev/null +++ b/arch/arm64/geniezone/gzvm_arch_common.h @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#ifndef __GZVM_ARCH_COMMON_H__ +#define __GZVM_ARCH_COMMON_H__ + +#include + +enum { + GZVM_FUNC_PROBE = 12, + NR_GZVM_FUNC, +}; + +#define SMC_ENTITY_MTK 59 +#define GZVM_FUNCID_START (0x1000) +#define GZVM_HCALL_ID(func) \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, ARM_SMCCC_SMC_32, \ + SMC_ENTITY_MTK, (GZVM_FUNCID_START + (func))) + +#define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE) + +/** + * gzvm_hypcall_wrapper() - the wrapper for hvc calls + * @a0-a7: arguments passed in registers 0 to 7 + * @res: result values from registers 0 to 3 + * + * Return: The wrapper helps caller to convert geniezone errno to Linux errno. + */ +static inline int gzvm_hypcall_wrapper(unsigned long a0, unsigned long a1, + unsigned long a2, unsigned long a3, + unsigned long a4, unsigned long a5, + unsigned long a6, unsigned long a7, + struct arm_smccc_res *res) +{ + arm_smccc_hvc(a0, a1, a2, a3, a4, a5, a6, a7, res); + return gzvm_err_to_errno(res->a0); +} + +#endif /* __GZVM_ARCH_COMMON_H__ */ diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c new file mode 100644 index 000000000000..dc52150790ef --- /dev/null +++ b/arch/arm64/geniezone/vm.c @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#include +#include +#include + +#include +#include "gzvm_arch_common.h" + +int gzvm_arch_probe(void) +{ + struct arm_smccc_res res; + + arm_smccc_hvc(MT_HVC_GZVM_PROBE, 0, 0, 0, 0, 0, 0, 0, &res); + if (res.a0) + return -ENXIO; + + return 0; +} diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig index f79ab13a5c28..9bbf0bdf672c 100644 --- a/drivers/virt/Kconfig +++ b/drivers/virt/Kconfig @@ -54,4 +54,6 @@ source "drivers/virt/coco/sev-guest/Kconfig" source "drivers/virt/coco/tdx-guest/Kconfig" +source "drivers/virt/geniezone/Kconfig" + endif diff --git a/drivers/virt/geniezone/Kconfig b/drivers/virt/geniezone/Kconfig new file mode 100644 index 000000000000..2643fb8913cc --- /dev/null +++ b/drivers/virt/geniezone/Kconfig @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: GPL-2.0-only + +config MTK_GZVM + tristate "GenieZone Hypervisor driver for guest VM operation" + depends on ARM64 + help + This driver, gzvm, enables to run guest VMs on MTK GenieZone + hypervisor. It exports kvm-like interfaces for VMM (e.g., crosvm) in + order to operate guest VMs on GenieZone hypervisor. + + GenieZone hypervisor now only supports MediaTek SoC and arm64 + architecture. + + Select M if you want it be built as a module (gzvm.ko). + + If unsure, say N. diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile new file mode 100644 index 000000000000..8c1f0053e773 --- /dev/null +++ b/drivers/virt/geniezone/Makefile @@ -0,0 +1,10 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Makefile for GenieZone driver, this file should be include in arch's +# to avoid two ko being generated. +# + +GZVM_DIR ?= ../../../drivers/virt/geniezone + +gzvm-y := $(GZVM_DIR)/gzvm_main.o + diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c new file mode 100644 index 000000000000..f7d4f0646d97 --- /dev/null +++ b/drivers/virt/geniezone/gzvm_main.c @@ -0,0 +1,89 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#include +#include +#include +#include +#include +#include +#include + +/** + * gzvm_err_to_errno() - Convert geniezone return value to standard errno + * + * @err: Return value from geniezone function return + * + * Return: Standard errno + */ +int gzvm_err_to_errno(unsigned long err) +{ + int gz_err = (int)err; + + switch (gz_err) { + case 0: + return 0; + case ERR_NO_MEMORY: + return -ENOMEM; + case ERR_NOT_SUPPORTED: + return -EOPNOTSUPP; + case ERR_NOT_IMPLEMENTED: + return -EOPNOTSUPP; + case ERR_FAULT: + return -EFAULT; + default: + break; + } + + return -EINVAL; +} + +static const struct file_operations gzvm_chardev_ops = { + .llseek = noop_llseek, +}; + +static struct miscdevice gzvm_dev = { + .minor = MISC_DYNAMIC_MINOR, + .name = KBUILD_MODNAME, + .fops = &gzvm_chardev_ops, +}; + +static int gzvm_drv_probe(struct platform_device *pdev) +{ + if (gzvm_arch_probe() != 0) { + dev_err(&pdev->dev, "Not found available conduit\n"); + return -ENODEV; + } + + return misc_register(&gzvm_dev); +} + +static int gzvm_drv_remove(struct platform_device *pdev) +{ + misc_deregister(&gzvm_dev); + return 0; +} + +static const struct of_device_id gzvm_of_match[] = { + { .compatible = "mediatek,geniezone-hyp" }, + {/* sentinel */}, +}; + +static struct platform_driver gzvm_driver = { + .probe = gzvm_drv_probe, + .remove = gzvm_drv_remove, + .driver = { + .name = KBUILD_MODNAME, + .owner = THIS_MODULE, + .of_match_table = gzvm_of_match, + }, +}; + +module_platform_driver(gzvm_driver); + +MODULE_DEVICE_TABLE(of, gzvm_of_match); +MODULE_AUTHOR("MediaTek"); +MODULE_DESCRIPTION("GenieZone interface for VMM"); +MODULE_LICENSE("GPL"); diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h new file mode 100644 index 000000000000..907f2f984de9 --- /dev/null +++ b/include/linux/gzvm_drv.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#ifndef __GZVM_DRV_H__ +#define __GZVM_DRV_H__ + +/* + * These are the definitions of APIs between GenieZone hypervisor and driver, + * there's no need to be visible to uapi. Furthermore, we need GenieZone + * specific error code in order to map to Linux errno + */ +#define NO_ERROR (0) +#define ERR_NO_MEMORY (-5) +#define ERR_NOT_SUPPORTED (-24) +#define ERR_NOT_IMPLEMENTED (-27) +#define ERR_FAULT (-40) + +int gzvm_err_to_errno(unsigned long err); + +/* arch-dependant functions */ +int gzvm_arch_probe(void); + +#endif /* __GZVM_DRV_H__ */ From patchwork Tue Sep 19 11:12:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 724456 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4744338DCB for ; Tue, 19 Sep 2023 11:12:53 +0000 (UTC) Received: from mailgw02.mediatek.com (unknown [210.61.82.184]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24AD9F5; Tue, 19 Sep 2023 04:12:49 -0700 (PDT) X-UUID: 7398267c56dd11ee8051498923ad61e6-20230919 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=C7xeZNR5YJY8fQMYiigXEcsaLE5xKnNjlZcr2y2PmjE=; b=Sh1S+xxbdDpkk3heVBKDsS3s+VB+BbdjPjlb6HKYnL3H1pudxSyaftaT9GkuOAXYipYJI3YXwNSeEG9u/j1d3MRs06q2PWuQVFSi4anigMavm/w/HLpb0kQK45oOGrHHCjH3STsVTWKbpoHERXEwCALGYvooNvvdCiDl49j/+CA=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.32, REQID:aba7eb92-a607-4d14-bf3e-4d0c42e436e7, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:0,FILE:0,BULK:0,RULE:Release_Ham,ACTIO N:release,TS:-25 X-CID-META: VersionHash:5f78ec9, CLOUDID:9aea34c3-1e57-4345-9d31-31ad9818b39f, B ulkID:nil,BulkQuantity:0,Recheck:0,SF:102,TC:nil,Content:0,EDM:-3,IP:nil,U RL:0,File:nil,Bulk:nil,QS:nil,BEC:nil,COL:0,OSI:0,OSA:0,AV:0,LES:1,SPR:NO, DKR:0,DKP:0,BRR:0,BRE:0 X-CID-BVR: 0 X-CID-BAS: 0,_,0,_ X-CID-FACTOR: TF_CID_SPAM_SNR X-UUID: 7398267c56dd11ee8051498923ad61e6-20230919 Received: from mtkmbs10n1.mediatek.inc [(172.21.101.34)] by mailgw02.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 1734494411; Tue, 19 Sep 2023 19:12:42 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by mtkmbs11n2.mediatek.inc (172.21.101.187) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Tue, 19 Sep 2023 19:12:40 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Tue, 19 Sep 2023 19:12:40 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Matthias Brugger , AngeloGioacchino Del Regno CC: Arnd Bergmann , , , , , , David Bradil , Trilok Soni , Jade Shih , Ivan Tseng , My Chuang , Kevenny Hsieh , Willix Yeh , Liju Chen Subject: [PATCH v6 08/15] virt: geniezone: Add irqchip support for virtual interrupt injection Date: Tue, 19 Sep 2023 19:12:03 +0800 Message-ID: <20230919111210.19615-9-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230919111210.19615-1-yi-de.wu@mediatek.com> References: <20230919111210.19615-1-yi-de.wu@mediatek.com> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MTK: N X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,RDNS_NONE, SPF_HELO_PASS,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net From: "Yingshiuan Pan" Enable GenieZone to handle virtual interrupt injection request. Signed-off-by: Yingshiuan Pan Signed-off-by: kevenny hsieh Signed-off-by: Liju Chen Signed-off-by: Yi-De Wu --- arch/arm64/geniezone/Makefile | 2 +- arch/arm64/geniezone/gzvm_arch_common.h | 4 ++ arch/arm64/geniezone/vgic.c | 50 +++++++++++++++ drivers/virt/geniezone/gzvm_common.h | 12 ++++ drivers/virt/geniezone/gzvm_vm.c | 81 +++++++++++++++++++++++++ include/linux/gzvm_drv.h | 4 ++ include/uapi/linux/gzvm.h | 66 ++++++++++++++++++++ 7 files changed, 218 insertions(+), 1 deletion(-) create mode 100644 arch/arm64/geniezone/vgic.c create mode 100644 drivers/virt/geniezone/gzvm_common.h diff --git a/arch/arm64/geniezone/Makefile b/arch/arm64/geniezone/Makefile index 69b0a4abeab0..0e4f1087f9de 100644 --- a/arch/arm64/geniezone/Makefile +++ b/arch/arm64/geniezone/Makefile @@ -4,6 +4,6 @@ # include $(srctree)/drivers/virt/geniezone/Makefile -gzvm-y += vm.o vcpu.o +gzvm-y += vm.o vcpu.o vgic.o obj-$(CONFIG_MTK_GZVM) += gzvm.o diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h index 4a5b9e3f669a..9be9cf77faa3 100644 --- a/arch/arm64/geniezone/gzvm_arch_common.h +++ b/arch/arm64/geniezone/gzvm_arch_common.h @@ -17,6 +17,8 @@ enum { GZVM_FUNC_RUN = 5, GZVM_FUNC_GET_ONE_REG = 8, GZVM_FUNC_SET_ONE_REG = 9, + GZVM_FUNC_IRQ_LINE = 10, + GZVM_FUNC_CREATE_DEVICE = 11, GZVM_FUNC_PROBE = 12, GZVM_FUNC_ENABLE_CAP = 13, GZVM_FUNC_INFORM_EXIT = 14, @@ -37,6 +39,8 @@ enum { #define MT_HVC_GZVM_RUN GZVM_HCALL_ID(GZVM_FUNC_RUN) #define MT_HVC_GZVM_GET_ONE_REG GZVM_HCALL_ID(GZVM_FUNC_GET_ONE_REG) #define MT_HVC_GZVM_SET_ONE_REG GZVM_HCALL_ID(GZVM_FUNC_SET_ONE_REG) +#define MT_HVC_GZVM_IRQ_LINE GZVM_HCALL_ID(GZVM_FUNC_IRQ_LINE) +#define MT_HVC_GZVM_CREATE_DEVICE GZVM_HCALL_ID(GZVM_FUNC_CREATE_DEVICE) #define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE) #define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP) #define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT) diff --git a/arch/arm64/geniezone/vgic.c b/arch/arm64/geniezone/vgic.c new file mode 100644 index 000000000000..122125f7f8d4 --- /dev/null +++ b/arch/arm64/geniezone/vgic.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#include +#include +#include +#include "gzvm_arch_common.h" + +int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev) +{ + struct arm_smccc_res res; + + return gzvm_hypcall_wrapper(MT_HVC_GZVM_CREATE_DEVICE, vm_id, + virt_to_phys(gzvm_dev), 0, 0, 0, 0, 0, + &res); +} + +/** + * gzvm_arch_inject_irq() - Inject virtual interrupt to a VM + * @gzvm: Pointer to struct gzvm + * @vcpu_idx: vcpu index, only valid if PPI + * @irq: *SPI* irq number (excluding offset value `32`) + * @level: 1 if true else 0 + * + * Return: + * * 0 - Success. + * * Negative - Failure. + */ +int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, + u32 irq, bool level) +{ + unsigned long a1 = assemble_vm_vcpu_tuple(gzvm->vm_id, vcpu_idx); + struct arm_smccc_res res; + + /* + * VMM's virtual device irq number starts from 0, but ARM's shared peripheral + * interrupt number starts from 32. hypervisor adds offset 32 + */ + gzvm_hypcall_wrapper(MT_HVC_GZVM_IRQ_LINE, a1, irq, level, + 0, 0, 0, 0, &res); + if (res.a0) { + pr_err("Failed to set IRQ level (%d) to irq#%u on vcpu %d with ret=%d\n", + level, irq, vcpu_idx, (int)res.a0); + return -EFAULT; + } + + return 0; +} diff --git a/drivers/virt/geniezone/gzvm_common.h b/drivers/virt/geniezone/gzvm_common.h new file mode 100644 index 000000000000..c8d90fee3a18 --- /dev/null +++ b/drivers/virt/geniezone/gzvm_common.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#ifndef __GZ_COMMON_H__ +#define __GZ_COMMON_H__ + +int gzvm_irqchip_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, + u32 irq, bool level); + +#endif /* __GZVM_COMMON_H__ */ diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index a33dc3d68c5c..72606a6cc995 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -11,6 +11,7 @@ #include #include #include +#include "gzvm_common.h" static DEFINE_MUTEX(gzvm_list_lock); static LIST_HEAD(gzvm_list); @@ -170,6 +171,72 @@ gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm, return register_memslot_addr_range(gzvm, memslot); } +int gzvm_irqchip_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, + u32 irq, bool level) +{ + return gzvm_arch_inject_irq(gzvm, vcpu_idx, irq, level); +} + +static int gzvm_vm_ioctl_irq_line(struct gzvm *gzvm, + struct gzvm_irq_level *irq_level) +{ + u32 irq = irq_level->irq; + u32 vcpu_idx, vcpu2_idx, irq_num; + bool level = irq_level->level; + + vcpu_idx = FIELD_GET(GZVM_IRQ_LINE_VCPU, irq); + vcpu2_idx = FIELD_GET(GZVM_IRQ_LINE_VCPU2, irq) * (GZVM_IRQ_VCPU_MASK + 1); + irq_num = FIELD_GET(GZVM_IRQ_LINE_NUM, irq); + + return gzvm_irqchip_inject_irq(gzvm, vcpu_idx + vcpu2_idx, irq_num, + level); +} + +static int gzvm_vm_ioctl_create_device(struct gzvm *gzvm, void __user *argp) +{ + struct gzvm_create_device *gzvm_dev; + void *dev_data = NULL; + int ret; + + gzvm_dev = (struct gzvm_create_device *)alloc_pages_exact(PAGE_SIZE, + GFP_KERNEL); + if (!gzvm_dev) + return -ENOMEM; + if (copy_from_user(gzvm_dev, argp, sizeof(*gzvm_dev))) { + ret = -EFAULT; + goto err_free_dev; + } + + if (gzvm_dev->attr_addr != 0 && gzvm_dev->attr_size != 0) { + size_t attr_size = gzvm_dev->attr_size; + void __user *attr_addr = (void __user *)gzvm_dev->attr_addr; + + /* Size of device specific data should not be over a page. */ + if (attr_size > PAGE_SIZE) + return -EINVAL; + + dev_data = alloc_pages_exact(attr_size, GFP_KERNEL); + if (!dev_data) { + ret = -ENOMEM; + goto err_free_dev; + } + + if (copy_from_user(dev_data, attr_addr, attr_size)) { + ret = -EFAULT; + goto err_free_dev_data; + } + gzvm_dev->attr_addr = virt_to_phys(dev_data); + } + + ret = gzvm_arch_create_device(gzvm->vm_id, gzvm_dev); +err_free_dev_data: + if (dev_data) + free_pages_exact(dev_data, 0); +err_free_dev: + free_pages_exact(gzvm_dev, 0); + return ret; +} + static int gzvm_vm_ioctl_enable_cap(struct gzvm *gzvm, struct gzvm_enable_cap *cap, void __user *argp) @@ -204,6 +271,20 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl, ret = gzvm_vm_ioctl_set_memory_region(gzvm, &userspace_mem); break; } + case GZVM_IRQ_LINE: { + struct gzvm_irq_level irq_event; + + if (copy_from_user(&irq_event, argp, sizeof(irq_event))) { + ret = -EFAULT; + goto out; + } + ret = gzvm_vm_ioctl_irq_line(gzvm, &irq_event); + break; + } + case GZVM_CREATE_DEVICE: { + ret = gzvm_vm_ioctl_create_device(gzvm, argp); + break; + } case GZVM_ENABLE_CAP: { struct gzvm_enable_cap cap; diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index a4203801740c..805a866acecc 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -120,4 +120,8 @@ int gzvm_arch_vcpu_run(struct gzvm_vcpu *vcpu, __u64 *exit_reason); int gzvm_arch_destroy_vcpu(u16 vm_id, int vcpuid); int gzvm_arch_inform_exit(u16 vm_id); +int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev); +int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, + u32 irq, bool level); + #endif /* __GZVM_DRV_H__ */ diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h index bdf277fa248a..a1f6acdc4055 100644 --- a/include/uapi/linux/gzvm.h +++ b/include/uapi/linux/gzvm.h @@ -99,6 +99,72 @@ struct gzvm_userspace_memory_region { #define GZVM_SET_USER_MEMORY_REGION _IOW(GZVM_IOC_MAGIC, 0x46, \ struct gzvm_userspace_memory_region) +/* for GZVM_IRQ_LINE, irq field index values */ +#define GZVM_IRQ_VCPU_MASK 0xff +#define GZVM_IRQ_LINE_TYPE GENMASK(27, 24) +#define GZVM_IRQ_LINE_VCPU GENMASK(23, 16) +#define GZVM_IRQ_LINE_VCPU2 GENMASK(31, 28) +#define GZVM_IRQ_LINE_NUM GENMASK(15, 0) + +/* irq_type field */ +#define GZVM_IRQ_TYPE_CPU 0 +#define GZVM_IRQ_TYPE_SPI 1 +#define GZVM_IRQ_TYPE_PPI 2 + +/* out-of-kernel GIC cpu interrupt injection irq_number field */ +#define GZVM_IRQ_CPU_IRQ 0 +#define GZVM_IRQ_CPU_FIQ 1 + +struct gzvm_irq_level { + union { + __u32 irq; + __s32 status; + }; + __u32 level; +}; + +#define GZVM_IRQ_LINE _IOW(GZVM_IOC_MAGIC, 0x61, \ + struct gzvm_irq_level) + +enum gzvm_device_type { + GZVM_DEV_TYPE_ARM_VGIC_V3_DIST = 0, + GZVM_DEV_TYPE_ARM_VGIC_V3_REDIST = 1, + GZVM_DEV_TYPE_MAX, +}; + +/** + * struct gzvm_create_device: For GZVM_CREATE_DEVICE. + * @dev_type: Device type. + * @id: Device id. + * @flags: Bypass to hypervisor to handle them and these are flags of virtual + * devices. + * @dev_addr: Device ipa address in VM's view. + * @dev_reg_size: Device register range size. + * @attr_addr: If user -> kernel, this is user virtual address of device + * specific attributes (if needed). If kernel->hypervisor, + * this is ipa. + * @attr_size: This attr_size is the buffer size in bytes of each attribute + * needed from various devices. The attribute here refers to the + * additional data passed from VMM(e.g. Crosvm) to GenieZone + * hypervisor when virtual devices were to be created. Thus, + * we need attr_addr and attr_size in the gzvm_create_device + * structure to keep track of the attribute mentioned. + * + * Store information needed to create device. + */ +struct gzvm_create_device { + __u32 dev_type; + __u32 id; + __u64 flags; + __u64 dev_addr; + __u64 dev_reg_size; + __u64 attr_addr; + __u64 attr_size; +}; + +#define GZVM_CREATE_DEVICE _IOWR(GZVM_IOC_MAGIC, 0xe0, \ + struct gzvm_create_device) + /* * ioctls for vcpu fds */ From patchwork Tue Sep 19 11:12:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 724455 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 73F8A3D393 for ; Tue, 19 Sep 2023 11:12:55 +0000 (UTC) Received: from mailgw01.mediatek.com (unknown [60.244.123.138]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DA08B8; Tue, 19 Sep 2023 04:12:51 -0700 (PDT) X-UUID: 73f9cec256dd11eea33bb35ae8d461a2-20230919 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=fj6dttztFrnDVpLt2nhcXUPn81plVj2kGoQw5LtfVBk=; b=TTFRWCiEBjNpxD0WG2FUBg73IpbR1HP+EV1NcWdM+R6KzNPH/RMRXVxyP21baECTd7hz34NqpBXUclNIRpkKypIsMi7IVzDijTm0RC4vcEB6dsnyDiVBL8rxihcJd4m5mcounN43bOPChYKyhl9ZQ2CY1SaXLVmdJjMk1Mce/SY=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.32, REQID:9cfe4d44-7574-48d5-9264-96cd33d94244, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:0,FILE:0,BULK:0,RULE:Release_Ham,ACTIO N:release,TS:-25 X-CID-META: VersionHash:5f78ec9, CLOUDID:6d581e14-4929-4845-9571-38c601e9c3c9, B ulkID:nil,BulkQuantity:0,Recheck:0,SF:102,TC:nil,Content:0,EDM:-3,IP:nil,U RL:0,File:nil,Bulk:nil,QS:nil,BEC:nil,COL:0,OSI:0,OSA:0,AV:0,LES:1,SPR:NO, DKR:0,DKP:0,BRR:0,BRE:0 X-CID-BVR: 0 X-CID-BAS: 0,_,0,_ X-CID-FACTOR: TF_CID_SPAM_SNR X-UUID: 73f9cec256dd11eea33bb35ae8d461a2-20230919 Received: from mtkmbs10n1.mediatek.inc [(172.21.101.34)] by mailgw01.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 530445892; Tue, 19 Sep 2023 19:12:42 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by MTKMBS14N1.mediatek.inc (172.21.101.75) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Tue, 19 Sep 2023 19:12:41 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Tue, 19 Sep 2023 19:12:40 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Matthias Brugger , AngeloGioacchino Del Regno CC: Arnd Bergmann , , , , , , David Bradil , Trilok Soni , Jade Shih , Ivan Tseng , "My Chuang" , Kevenny Hsieh , Willix Yeh , Liju Chen Subject: [PATCH v6 09/15] virt: geniezone: Add irqfd support Date: Tue, 19 Sep 2023 19:12:04 +0800 Message-ID: <20230919111210.19615-10-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230919111210.19615-1-yi-de.wu@mediatek.com> References: <20230919111210.19615-1-yi-de.wu@mediatek.com> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-Product-Ver: SMEX-14.0.0.3152-9.1.1006-23728.005 X-TM-AS-Result: No-10--13.231500-8.000000 X-TMASE-MatchedRID: AC84ZEk84qCT+V7OGy1eF7CvlllU7Dl15E5u1OdPWsTgr/zYTDOZCJeq sXSeEviPpSHzbpDtJi1Yo3G+rvxrNao77AcuQhw7A9lly13c/gGZ2scyRQcerz6DYdLKc78CxfJ mJLSy2rYPRm7vZvATR5urMMf0bt5M/YThwZVOWJKmG+fak9r3agrefVId6fzVZ5yuplze9pssVU 53f9voZBgdl9SWxnPJVoAmHMW7f3t/+iNkyaz33ws9VkfCh3uAEbs7d+z9c8vjsTquy0JRizzAK 7q1A+IiJVwLGBypRhaJ2bX8lNKTfG1yv+64My/eQuFiD+xrWCwxXH/dlhvLv/t4POt4bDKR4nmg 1woDU7al6qI2izqkidSArurYxE2pnZCu38NkgL9VXhlmZsTdjARryDXHx6oXDtPbEOCY2f6IbCF Fbr32N7FwN+qSs8wuYB7Rg0hkocmW+cf6nEkoSuQoIU4rAATMIZm2INWXDp4PIi+nG+iNN6nn4h ZeU6w6HEXpGNdFlNwRCKHXY1XXLGX3SpdG0+lWDB+ErBr0bAMrHkgIan9a0ddMDzSLr8ovZYNYI CIPuYi0xgJ2yHF4fogWROA0BBvYSwojFX8WCDC+dJWHbg4ITilayzmQ9QV0S4KPPiCB23AChdJp Cl+Gky37gDEBTOpEOME7yje9iCU9S3IiQd+eNeTuT3JcmKqqAZn/4A9db2SYFp2iw4hoIZPGlAh wdrelxHy4fm/VTOGAMuqetGVetnyef22ep6XYOwBXM346/+ygN6zMY1vB0dZ4iKTsxyGe8vvzMU l7c/o3Bf0+iT7T9MX/HOitp/8/ X-TM-AS-User-Approved-Sender: No X-TM-AS-User-Blocked-Sender: No X-TMASE-Result: 10--13.231500-8.000000 X-TMASE-Version: SMEX-14.0.0.3152-9.1.1006-23728.005 X-TM-SNTS-SMTP: 35A9F7B0E6E66B31604231B048F34353A18253EAFE18DD1D9AD922FD010014C02000:8 X-MTK: N X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_PASS,SPF_PASS,UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net From: "Yingshiuan Pan" irqfd enables other threads than vcpu threads to inject virtual interrupt through irqfd asynchronously rather through ioctl interface. This interface is necessary for VMM which creates separated thread for IO handling or uses vhost devices. Signed-off-by: Yingshiuan Pan Signed-off-by: kevenny hsieh Signed-off-by: Liju Chen Signed-off-by: Yi-De Wu --- arch/arm64/geniezone/gzvm_arch_common.h | 18 ++ drivers/virt/geniezone/Makefile | 3 +- drivers/virt/geniezone/gzvm_irqfd.c | 382 ++++++++++++++++++++++++ drivers/virt/geniezone/gzvm_main.c | 12 +- drivers/virt/geniezone/gzvm_vcpu.c | 1 + drivers/virt/geniezone/gzvm_vm.c | 18 ++ include/linux/gzvm_drv.h | 25 ++ include/uapi/linux/gzvm.h | 26 ++ 8 files changed, 483 insertions(+), 2 deletions(-) create mode 100644 drivers/virt/geniezone/gzvm_irqfd.c diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h index 9be9cf77faa3..051d8f49a1df 100644 --- a/arch/arm64/geniezone/gzvm_arch_common.h +++ b/arch/arm64/geniezone/gzvm_arch_common.h @@ -45,6 +45,8 @@ enum { #define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP) #define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT) +#define GIC_V3_NR_LRS 16 + /** * gzvm_hypcall_wrapper() - the wrapper for hvc calls * @a0-a7: arguments passed in registers 0 to 7 @@ -72,6 +74,22 @@ static inline u16 get_vcpuid_from_tuple(unsigned int tuple) return (u16)(tuple & 0xffff); } +/** + * struct gzvm_vcpu_hwstate: Sync architecture state back to host for handling + * @nr_lrs: The available LRs(list registers) in Soc. + * @__pad: add an explicit '__u32 __pad;' in the middle to make it clear + * what the actual layout is. + * @lr: The array of LRs(list registers). + * + * - Keep the same layout of hypervisor data struct. + * - Sync list registers back for acking virtual device interrupt status. + */ +struct gzvm_vcpu_hwstate { + __le32 nr_lrs; + __le32 __pad; + __le64 lr[GIC_V3_NR_LRS]; +}; + static inline unsigned int assemble_vm_vcpu_tuple(u16 vmid, u16 vcpuid) { diff --git a/drivers/virt/geniezone/Makefile b/drivers/virt/geniezone/Makefile index ad3a7a8b2fa4..05203166bf09 100644 --- a/drivers/virt/geniezone/Makefile +++ b/drivers/virt/geniezone/Makefile @@ -7,4 +7,5 @@ GZVM_DIR ?= ../../../drivers/virt/geniezone gzvm-y := $(GZVM_DIR)/gzvm_main.o $(GZVM_DIR)/gzvm_mmu.o \ - $(GZVM_DIR)/gzvm_vm.o $(GZVM_DIR)/gzvm_vcpu.o + $(GZVM_DIR)/gzvm_vm.o $(GZVM_DIR)/gzvm_vcpu.o \ + $(GZVM_DIR)/gzvm_irqfd.o diff --git a/drivers/virt/geniezone/gzvm_irqfd.c b/drivers/virt/geniezone/gzvm_irqfd.c new file mode 100644 index 000000000000..fe77d074cf20 --- /dev/null +++ b/drivers/virt/geniezone/gzvm_irqfd.c @@ -0,0 +1,382 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2023 MediaTek Inc. + */ + +#include +#include +#include +#include "gzvm_common.h" + +struct gzvm_irq_ack_notifier { + struct hlist_node link; + unsigned int gsi; + void (*irq_acked)(struct gzvm_irq_ack_notifier *ian); +}; + +/** + * struct gzvm_kernel_irqfd: gzvm kernel irqfd descriptor. + * @gzvm: Pointer to struct gzvm. + * @wait: Wait queue entry. + * @gsi: Used for level IRQ fast-path. + * @eventfd: Used for setup/shutdown. + * @list: struct list_head. + * @pt: struct poll_table_struct. + * @shutdown: struct work_struct. + */ +struct gzvm_kernel_irqfd { + struct gzvm *gzvm; + wait_queue_entry_t wait; + + int gsi; + + struct eventfd_ctx *eventfd; + struct list_head list; + poll_table pt; + struct work_struct shutdown; +}; + +static struct workqueue_struct *irqfd_cleanup_wq; + +/** + * irqfd_set_irq(): irqfd to inject virtual interrupt. + * @gzvm: Pointer to gzvm. + * @irq: This is spi interrupt number (starts from 0 instead of 32). + * @level: irq triggered level. + */ +static void irqfd_set_irq(struct gzvm *gzvm, u32 irq, int level) +{ + if (level) + gzvm_irqchip_inject_irq(gzvm, 0, irq, level); +} + +/** + * irqfd_shutdown() - Race-free decouple logic (ordering is critical). + * @work: Pointer to work_struct. + */ +static void irqfd_shutdown(struct work_struct *work) +{ + struct gzvm_kernel_irqfd *irqfd = + container_of(work, struct gzvm_kernel_irqfd, shutdown); + struct gzvm *gzvm = irqfd->gzvm; + u64 cnt; + + /* Make sure irqfd has been initialized in assign path. */ + synchronize_srcu(&gzvm->irq_srcu); + + /* + * Synchronize with the wait-queue and unhook ourselves to prevent + * further events. + */ + eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt); + + /* + * It is now safe to release the object's resources + */ + eventfd_ctx_put(irqfd->eventfd); + kfree(irqfd); +} + +/** + * irqfd_is_active() - Assumes gzvm->irqfds.lock is held. + * @irqfd: Pointer to gzvm_kernel_irqfd. + * + * Return: + * * true - irqfd is active. + */ +static bool irqfd_is_active(struct gzvm_kernel_irqfd *irqfd) +{ + return list_empty(&irqfd->list) ? false : true; +} + +/** + * irqfd_deactivate() - Mark the irqfd as inactive and schedule it for removal. + * assumes gzvm->irqfds.lock is held. + * @irqfd: Pointer to gzvm_kernel_irqfd. + */ +static void irqfd_deactivate(struct gzvm_kernel_irqfd *irqfd) +{ + if (!irqfd_is_active(irqfd)) + return; + + list_del_init(&irqfd->list); + + queue_work(irqfd_cleanup_wq, &irqfd->shutdown); +} + +/** + * irqfd_wakeup() - Callback of irqfd wait queue, would be woken by writing to + * irqfd to do virtual interrupt injection. + * @wait: Pointer to wait_queue_entry_t. + * @mode: Unused. + * @sync: Unused. + * @key: Get flags about Epoll events. + * + * Return: + * * 0 - Success + */ +static int irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode, int sync, + void *key) +{ + struct gzvm_kernel_irqfd *irqfd = + container_of(wait, struct gzvm_kernel_irqfd, wait); + __poll_t flags = key_to_poll(key); + struct gzvm *gzvm = irqfd->gzvm; + + if (flags & EPOLLIN) { + u64 cnt; + + eventfd_ctx_do_read(irqfd->eventfd, &cnt); + /* gzvm's irq injection is not blocked, don't need workq */ + irqfd_set_irq(gzvm, irqfd->gsi, 1); + } + + if (flags & EPOLLHUP) { + /* The eventfd is closing, detach from GZVM */ + unsigned long iflags; + + spin_lock_irqsave(&gzvm->irqfds.lock, iflags); + + /* + * Do more check if someone deactivated the irqfd before + * we could acquire the irqfds.lock. + */ + if (irqfd_is_active(irqfd)) + irqfd_deactivate(irqfd); + + spin_unlock_irqrestore(&gzvm->irqfds.lock, iflags); + } + + return 0; +} + +static void irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh, + poll_table *pt) +{ + struct gzvm_kernel_irqfd *irqfd = + container_of(pt, struct gzvm_kernel_irqfd, pt); + add_wait_queue_priority(wqh, &irqfd->wait); +} + +static int gzvm_irqfd_assign(struct gzvm *gzvm, struct gzvm_irqfd *args) +{ + struct gzvm_kernel_irqfd *irqfd, *tmp; + struct fd f; + struct eventfd_ctx *eventfd = NULL; + int ret; + int idx; + + irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL_ACCOUNT); + if (!irqfd) + return -ENOMEM; + + irqfd->gzvm = gzvm; + irqfd->gsi = args->gsi; + + INIT_LIST_HEAD(&irqfd->list); + INIT_WORK(&irqfd->shutdown, irqfd_shutdown); + + f = fdget(args->fd); + if (!f.file) { + ret = -EBADF; + goto out; + } + + eventfd = eventfd_ctx_fileget(f.file); + if (IS_ERR(eventfd)) { + ret = PTR_ERR(eventfd); + goto fail; + } + + irqfd->eventfd = eventfd; + + /* + * Install our own custom wake-up handling so we are notified via + * a callback whenever someone signals the underlying eventfd + */ + init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup); + init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc); + + spin_lock_irq(&gzvm->irqfds.lock); + + ret = 0; + list_for_each_entry(tmp, &gzvm->irqfds.items, list) { + if (irqfd->eventfd != tmp->eventfd) + continue; + /* This fd is used for another irq already. */ + pr_err("already used: gsi=%d fd=%d\n", args->gsi, args->fd); + ret = -EBUSY; + spin_unlock_irq(&gzvm->irqfds.lock); + goto fail; + } + + idx = srcu_read_lock(&gzvm->irq_srcu); + + list_add_tail(&irqfd->list, &gzvm->irqfds.items); + + spin_unlock_irq(&gzvm->irqfds.lock); + + vfs_poll(f.file, &irqfd->pt); + + srcu_read_unlock(&gzvm->irq_srcu, idx); + + /* + * do not drop the file until the irqfd is fully initialized, otherwise + * we might race against the EPOLLHUP + */ + fdput(f); + return 0; + +fail: + if (eventfd && !IS_ERR(eventfd)) + eventfd_ctx_put(eventfd); + + fdput(f); + +out: + kfree(irqfd); + return ret; +} + +static void gzvm_notify_acked_gsi(struct gzvm *gzvm, int gsi) +{ + struct gzvm_irq_ack_notifier *gian; + + hlist_for_each_entry_srcu(gian, &gzvm->irq_ack_notifier_list, + link, srcu_read_lock_held(&gzvm->irq_srcu)) + if (gian->gsi == gsi) + gian->irq_acked(gian); +} + +void gzvm_notify_acked_irq(struct gzvm *gzvm, unsigned int gsi) +{ + int idx; + + idx = srcu_read_lock(&gzvm->irq_srcu); + gzvm_notify_acked_gsi(gzvm, gsi); + srcu_read_unlock(&gzvm->irq_srcu, idx); +} + +/** + * gzvm_irqfd_deassign() - Shutdown any irqfd's that match fd+gsi. + * @gzvm: Pointer to gzvm. + * @args: Pointer to gzvm_irqfd. + * + * Return: + * * 0 - Success. + * * Negative value - Failure. + */ +static int gzvm_irqfd_deassign(struct gzvm *gzvm, struct gzvm_irqfd *args) +{ + struct gzvm_kernel_irqfd *irqfd, *tmp; + struct eventfd_ctx *eventfd; + + eventfd = eventfd_ctx_fdget(args->fd); + if (IS_ERR(eventfd)) + return PTR_ERR(eventfd); + + spin_lock_irq(&gzvm->irqfds.lock); + + list_for_each_entry_safe(irqfd, tmp, &gzvm->irqfds.items, list) { + if (irqfd->eventfd == eventfd && irqfd->gsi == args->gsi) + irqfd_deactivate(irqfd); + } + + spin_unlock_irq(&gzvm->irqfds.lock); + eventfd_ctx_put(eventfd); + + /* + * Block until we know all outstanding shutdown jobs have completed + * so that we guarantee there will not be any more interrupts on this + * gsi once this deassign function returns. + */ + flush_workqueue(irqfd_cleanup_wq); + + return 0; +} + +int gzvm_irqfd(struct gzvm *gzvm, struct gzvm_irqfd *args) +{ + for (int i = 0; i < ARRAY_SIZE(args->pad); i++) { + if (args->pad[i]) + return -EINVAL; + } + + if (args->flags & + ~(GZVM_IRQFD_FLAG_DEASSIGN | GZVM_IRQFD_FLAG_RESAMPLE)) + return -EINVAL; + + if (args->flags & GZVM_IRQFD_FLAG_DEASSIGN) + return gzvm_irqfd_deassign(gzvm, args); + + return gzvm_irqfd_assign(gzvm, args); +} + +/** + * gzvm_vm_irqfd_init() - Initialize irqfd data structure per VM + * + * @gzvm: Pointer to struct gzvm. + * + * Return: + * * 0 - Success. + * * Negative - Failure. + */ +int gzvm_vm_irqfd_init(struct gzvm *gzvm) +{ + mutex_init(&gzvm->irq_lock); + + spin_lock_init(&gzvm->irqfds.lock); + INIT_LIST_HEAD(&gzvm->irqfds.items); + if (init_srcu_struct(&gzvm->irq_srcu)) + return -EINVAL; + INIT_HLIST_HEAD(&gzvm->irq_ack_notifier_list); + + return 0; +} + +/** + * gzvm_vm_irqfd_release() - This function is called as the gzvm VM fd is being + * released. Shutdown all irqfds that still remain open. + * @gzvm: Pointer to gzvm. + */ +void gzvm_vm_irqfd_release(struct gzvm *gzvm) +{ + struct gzvm_kernel_irqfd *irqfd, *tmp; + + spin_lock_irq(&gzvm->irqfds.lock); + + list_for_each_entry_safe(irqfd, tmp, &gzvm->irqfds.items, list) + irqfd_deactivate(irqfd); + + spin_unlock_irq(&gzvm->irqfds.lock); + + /* + * Block until we know all outstanding shutdown jobs have completed. + */ + flush_workqueue(irqfd_cleanup_wq); +} + +/** + * gzvm_drv_irqfd_init() - Erase flushing work items when a VM exits. + * + * Return: + * * 0 - Success. + * * Negative - Failure. + * + * Create a host-wide workqueue for issuing deferred shutdown requests + * aggregated from all vm* instances. We need our own isolated + * queue to ease flushing work items when a VM exits. + */ +int gzvm_drv_irqfd_init(void) +{ + irqfd_cleanup_wq = alloc_workqueue("gzvm-irqfd-cleanup", 0, 0); + if (!irqfd_cleanup_wq) + return -ENOMEM; + + return 0; +} + +void gzvm_drv_irqfd_exit(void) +{ + destroy_workqueue(irqfd_cleanup_wq); +} diff --git a/drivers/virt/geniezone/gzvm_main.c b/drivers/virt/geniezone/gzvm_main.c index 30f6c3975026..4e5d1b83df4a 100644 --- a/drivers/virt/geniezone/gzvm_main.c +++ b/drivers/virt/geniezone/gzvm_main.c @@ -97,16 +97,26 @@ static struct miscdevice gzvm_dev = { static int gzvm_drv_probe(struct platform_device *pdev) { + int ret; + if (gzvm_arch_probe() != 0) { dev_err(&pdev->dev, "Not found available conduit\n"); return -ENODEV; } - return misc_register(&gzvm_dev); + ret = misc_register(&gzvm_dev); + if (ret) + return ret; + + ret = gzvm_drv_irqfd_init(); + if (ret) + return ret; + return 0; } static int gzvm_drv_remove(struct platform_device *pdev) { + gzvm_drv_irqfd_exit(); gzvm_destroy_all_vms(); misc_deregister(&gzvm_dev); return 0; diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c index b8070a46c087..85b59592cb5c 100644 --- a/drivers/virt/geniezone/gzvm_vcpu.c +++ b/drivers/virt/geniezone/gzvm_vcpu.c @@ -229,6 +229,7 @@ int gzvm_vm_ioctl_create_vcpu(struct gzvm *gzvm, u32 cpuid) ret = -ENOMEM; goto free_vcpu; } + vcpu->hwstate = (void *)vcpu->run + PAGE_SIZE; vcpu->vcpuid = cpuid; vcpu->gzvm = gzvm; mutex_init(&vcpu->lock); diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index 72606a6cc995..eb760f418552 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -285,6 +285,16 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl, ret = gzvm_vm_ioctl_create_device(gzvm, argp); break; } + case GZVM_IRQFD: { + struct gzvm_irqfd data; + + if (copy_from_user(&data, argp, sizeof(data))) { + ret = -EFAULT; + goto out; + } + ret = gzvm_irqfd(gzvm, &data); + break; + } case GZVM_ENABLE_CAP: { struct gzvm_enable_cap cap; @@ -308,6 +318,7 @@ static void gzvm_destroy_vm(struct gzvm *gzvm) mutex_lock(&gzvm->lock); + gzvm_vm_irqfd_release(gzvm); gzvm_destroy_vcpus(gzvm); gzvm_arch_destroy_vm(gzvm->vm_id); @@ -353,6 +364,13 @@ static struct gzvm *gzvm_create_vm(unsigned long vm_type) gzvm->mm = current->mm; mutex_init(&gzvm->lock); + ret = gzvm_vm_irqfd_init(gzvm); + if (ret) { + pr_err("Failed to initialize irqfd\n"); + kfree(gzvm); + return ERR_PTR(ret); + } + mutex_lock(&gzvm_list_lock); list_add(&gzvm->vm_list, &gzvm_list); mutex_unlock(&gzvm_list_lock); diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index 805a866acecc..676e6b5714e8 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -10,6 +10,7 @@ #include #include #include +#include /* * For the normal physical address, the highest 12 bits should be zero, so we @@ -30,6 +31,7 @@ #define ERR_NOT_SUPPORTED (-24) #define ERR_NOT_IMPLEMENTED (-27) #define ERR_FAULT (-40) +#define GZVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1 /* * The following data structures are for data transferring between driver and @@ -73,6 +75,7 @@ struct gzvm_vcpu { /* lock of vcpu*/ struct mutex lock; struct gzvm_vcpu_run *run; + struct gzvm_vcpu_hwstate *hwstate; }; struct gzvm { @@ -82,8 +85,23 @@ struct gzvm { struct gzvm_memslot memslot[GZVM_MAX_MEM_REGION]; /* lock for list_add*/ struct mutex lock; + + struct { + /* lock for irqfds list operation */ + spinlock_t lock; + struct list_head items; + struct list_head resampler_list; + /* lock for irqfds resampler */ + struct mutex resampler_lock; + } irqfds; + struct list_head vm_list; u16 vm_id; + + struct hlist_head irq_ack_notifier_list; + struct srcu_struct irq_srcu; + /* lock for irq injection */ + struct mutex irq_lock; }; long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args); @@ -124,4 +142,11 @@ int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev); int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, u32 irq, bool level); +void gzvm_notify_acked_irq(struct gzvm *gzvm, unsigned int gsi); +int gzvm_irqfd(struct gzvm *gzvm, struct gzvm_irqfd *args); +int gzvm_drv_irqfd_init(void); +void gzvm_drv_irqfd_exit(void); +int gzvm_vm_irqfd_init(struct gzvm *gzvm); +void gzvm_vm_irqfd_release(struct gzvm *gzvm); + #endif /* __GZVM_DRV_H__ */ diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h index a1f6acdc4055..cb02f278972f 100644 --- a/include/uapi/linux/gzvm.h +++ b/include/uapi/linux/gzvm.h @@ -309,4 +309,30 @@ struct gzvm_one_reg { #define GZVM_REG_GENERIC 0x0000000000000000ULL +#define GZVM_IRQFD_FLAG_DEASSIGN BIT(0) +/* + * GZVM_IRQFD_FLAG_RESAMPLE indicates resamplefd is valid and specifies + * the irqfd to operate in resampling mode for level triggered interrupt + * emulation. + */ +#define GZVM_IRQFD_FLAG_RESAMPLE BIT(1) + +/** + * struct gzvm_irqfd: gzvm irqfd descriptor + * @fd: File descriptor. + * @gsi: Used for level IRQ fast-path. + * @flags: FLAG_DEASSIGN or FLAG_RESAMPLE. + * @resamplefd: The file descriptor of the resampler. + * @pad: Reserved for future-proof. + */ +struct gzvm_irqfd { + __u32 fd; + __u32 gsi; + __u32 flags; + __u32 resamplefd; + __u8 pad[16]; +}; + +#define GZVM_IRQFD _IOW(GZVM_IOC_MAGIC, 0x76, struct gzvm_irqfd) + #endif /* __GZVM_H__ */ From patchwork Tue Sep 19 11:12:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 724457 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19DF838BCA for ; Tue, 19 Sep 2023 11:12:53 +0000 (UTC) Received: from mailgw01.mediatek.com (unknown [60.244.123.138]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0315FFC; Tue, 19 Sep 2023 04:12:50 -0700 (PDT) X-UUID: 7501304456dd11eea33bb35ae8d461a2-20230919 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=abrsKiV55spnn6ljLUviqqJXuNSbbJyTQEb3oKEPd3E=; b=kwWK3pWQm/CCuMy0DssWhk0QrFzUDAvEEltgs9Bbl6Tv1xxr31A90NAPTlRATno9zFwgkkdna9msWYbs2IlcqFuibF9ubN6HZJkTOK2DkwqpKK1DLwIqE/Ovzd7i/S/ECUeU5/jSM+gkZOIXiGvQGS7CiCFzE5uI5YgFdVJYCnc=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.32, REQID:10a07dea-0507-43ea-8f09-7ff9e82aada5, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:0,FILE:0,BULK:0,RULE:Release_Ham,ACTIO N:release,TS:-25 X-CID-META: VersionHash:5f78ec9, CLOUDID:83581e14-4929-4845-9571-38c601e9c3c9, B ulkID:nil,BulkQuantity:0,Recheck:0,SF:102,TC:nil,Content:0,EDM:-3,IP:nil,U RL:11|1,File:nil,Bulk:nil,QS:nil,BEC:nil,COL:0,OSI:0,OSA:0,AV:0,LES:1,SPR: NO,DKR:0,DKP:0,BRR:0,BRE:0 X-CID-BVR: 0,NGT X-CID-BAS: 0,NGT,0,_ X-CID-FACTOR: TF_CID_SPAM_SNR,TF_CID_SPAM_ULN X-UUID: 7501304456dd11eea33bb35ae8d461a2-20230919 Received: from mtkmbs10n2.mediatek.inc [(172.21.101.183)] by mailgw01.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 1172536937; Tue, 19 Sep 2023 19:12:44 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by mtkmbs10n2.mediatek.inc (172.21.101.183) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Tue, 19 Sep 2023 19:12:41 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Tue, 19 Sep 2023 19:12:41 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Matthias Brugger , AngeloGioacchino Del Regno CC: Arnd Bergmann , , , , , , David Bradil , Trilok Soni , Jade Shih , Ivan Tseng , "My Chuang" , Kevenny Hsieh , Willix Yeh , Liju-clr Chen Subject: [PATCH v6 11/15] virt: geniezone: Add memory region support Date: Tue, 19 Sep 2023 19:12:06 +0800 Message-ID: <20230919111210.19615-12-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230919111210.19615-1-yi-de.wu@mediatek.com> References: <20230919111210.19615-1-yi-de.wu@mediatek.com> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-TM-AS-Product-Ver: SMEX-14.0.0.3152-9.1.1006-23728.005 X-TM-AS-Result: No-10--3.824400-8.000000 X-TMASE-MatchedRID: Z2Vi4n4mYCQw2a0RxGqSGyVypP66BP0QNNuh+5zmS68Cgjr7b0ytGZas fA8Y/RCF8B1Eq8wDKAEQzM3Grt8RghUBkTmMruyZhK8o4aoss8pKPIx+MJF9o99RlPzeVuQQpc2 xgJAY6Lr/EdEp4HtkngHGJy9aPQkxszLAY5oHhBDd+fuf9kcapoLFgHaE9Li9myiLZetSf8mfop 0ytGwvXiq2rl3dzGQ1R0bQh4y98h8kZqDRPD5Y8NH+tTsqW+n+woyq4xxEb8Maqtm5/umcYdASM 74NIV6teXqdGP7hxdzcqVQKytsAjbUCbH0/l7fHQcFNZPwmQm10BNB20+SxH7f8mJY57oZddJaB DYald1lvF9+X2GEIHA== X-TM-AS-User-Approved-Sender: No X-TM-AS-User-Blocked-Sender: No X-TMASE-Result: 10--3.824400-8.000000 X-TMASE-Version: SMEX-14.0.0.3152-9.1.1006-23728.005 X-TM-SNTS-SMTP: 003A8E40B735E028E828893E79ADC94490F280830873D6C0A4C22AC54B79F9762000:8 X-MTK: N X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_PASS,SPF_PASS,UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net From: "Jerry Wang" Hypervisor might need to know the precise purpose of each memory region, so that it can provide specific memory protection. We add a new uapi to pass address and size of a memory region and its purpose. Signed-off-by: Jerry Wang Signed-off-by: Liju-clr Chen Signed-off-by: Yi-De Wu --- arch/arm64/geniezone/gzvm_arch_common.h | 2 ++ arch/arm64/geniezone/vm.c | 10 ++++++++++ drivers/virt/geniezone/gzvm_vm.c | 7 +++++++ include/linux/gzvm_drv.h | 3 +++ 4 files changed, 22 insertions(+) diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h index 051d8f49a1df..321f5dbcd616 100644 --- a/arch/arm64/geniezone/gzvm_arch_common.h +++ b/arch/arm64/geniezone/gzvm_arch_common.h @@ -22,6 +22,7 @@ enum { GZVM_FUNC_PROBE = 12, GZVM_FUNC_ENABLE_CAP = 13, GZVM_FUNC_INFORM_EXIT = 14, + GZVM_FUNC_MEMREGION_PURPOSE = 15, NR_GZVM_FUNC, }; @@ -44,6 +45,7 @@ enum { #define MT_HVC_GZVM_PROBE GZVM_HCALL_ID(GZVM_FUNC_PROBE) #define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP) #define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT) +#define MT_HVC_GZVM_MEMREGION_PURPOSE GZVM_HCALL_ID(GZVM_FUNC_MEMREGION_PURPOSE) #define GIC_V3_NR_LRS 16 diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c index 6db5e7f0cf2f..1198f32f4785 100644 --- a/arch/arm64/geniezone/vm.c +++ b/arch/arm64/geniezone/vm.c @@ -104,6 +104,16 @@ int gzvm_arch_destroy_vm(u16 vm_id) 0, 0, &res); } +int gzvm_arch_memregion_purpose(struct gzvm *gzvm, + struct gzvm_userspace_memory_region *mem) +{ + struct arm_smccc_res res; + + return gzvm_hypcall_wrapper(MT_HVC_GZVM_MEMREGION_PURPOSE, gzvm->vm_id, + mem->guest_phys_addr, mem->memory_size, + mem->flags, 0, 0, 0, &res); +} + static int gzvm_vm_arch_enable_cap(struct gzvm *gzvm, struct gzvm_enable_cap *cap, struct arm_smccc_res *res) diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index 7a935b1cf509..dfa03e850e8a 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -144,6 +144,7 @@ static int gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm, struct gzvm_userspace_memory_region *mem) { + int ret; struct vm_area_struct *vma; struct gzvm_memslot *memslot; unsigned long size; @@ -168,6 +169,12 @@ gzvm_vm_ioctl_set_memory_region(struct gzvm *gzvm, memslot->vma = vma; memslot->flags = mem->flags; memslot->slot_id = mem->slot; + + ret = gzvm_arch_memregion_purpose(gzvm, mem); + if (ret) { + pr_err("Failed to config memory region for the specified purpose\n"); + return -EFAULT; + } return register_memslot_addr_range(gzvm, memslot); } diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index 406bc9f821b2..d35ca179b688 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -152,6 +152,9 @@ void gzvm_drv_irqfd_exit(void); int gzvm_vm_irqfd_init(struct gzvm *gzvm); void gzvm_vm_irqfd_release(struct gzvm *gzvm); +int gzvm_arch_memregion_purpose(struct gzvm *gzvm, + struct gzvm_userspace_memory_region *mem); + int gzvm_init_ioeventfd(struct gzvm *gzvm); int gzvm_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args); bool gzvm_ioevent_write(struct gzvm_vcpu *vcpu, __u64 addr, int len, From patchwork Tue Sep 19 11:12:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 724452 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 93C4D3D3BC for ; Tue, 19 Sep 2023 11:12:56 +0000 (UTC) Received: from mailgw02.mediatek.com (unknown [210.61.82.184]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DEFEE114; Tue, 19 Sep 2023 04:12:53 -0700 (PDT) X-UUID: 73ffffb856dd11ee8051498923ad61e6-20230919 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=oMJnqTaqk0cC+W7BGuBZRReNA9mlb4C1cqyyl53azJE=; b=eKdRuiHCPlXN7SGYYoSk4nIVmFb1wVMqpUywFQZ9vLLZRtxMFIIwwhcSi++Nkceok29BPOIiA11I0GEr3jxEAARuoRmZvhKiKW1TtYaGhbGZ0ouWfMfTz8HT5oVEK3tZbcmmM1UiGmFoQiP41lcalkPpsHYCGzPvHarYSZ1o7mc=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.32, REQID:7f11a970-c3e4-4fc0-9d63-14a4a3e96676, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:0,FILE:0,BULK:0,RULE:Release_Ham,ACTIO N:release,TS:-25 X-CID-META: VersionHash:5f78ec9, CLOUDID:79581e14-4929-4845-9571-38c601e9c3c9, B ulkID:nil,BulkQuantity:0,Recheck:0,SF:102,TC:nil,Content:0,EDM:-3,IP:nil,U RL:11|1,File:nil,Bulk:nil,QS:nil,BEC:nil,COL:0,OSI:0,OSA:0,AV:0,LES:1,SPR: NO,DKR:0,DKP:0,BRR:0,BRE:0 X-CID-BVR: 0,NGT X-CID-BAS: 0,NGT,0,_ X-CID-FACTOR: TF_CID_SPAM_SNR,TF_CID_SPAM_ULN X-UUID: 73ffffb856dd11ee8051498923ad61e6-20230919 Received: from mtkmbs10n2.mediatek.inc [(172.21.101.183)] by mailgw02.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 1759978427; Tue, 19 Sep 2023 19:12:42 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by mtkmbs13n1.mediatek.inc (172.21.101.193) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Tue, 19 Sep 2023 19:12:41 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Tue, 19 Sep 2023 19:12:41 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Matthias Brugger , AngeloGioacchino Del Regno CC: Arnd Bergmann , , , , , , David Bradil , Trilok Soni , Jade Shih , Ivan Tseng , My Chuang , Kevenny Hsieh , Willix Yeh , Liju-clr Chen Subject: [PATCH v6 12/15] virt: geniezone: Add dtb config support Date: Tue, 19 Sep 2023 19:12:07 +0800 Message-ID: <20230919111210.19615-13-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230919111210.19615-1-yi-de.wu@mediatek.com> References: <20230919111210.19615-1-yi-de.wu@mediatek.com> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MTK: N X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,RDNS_NONE, SPF_HELO_PASS,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net From: "Jerry Wang" Hypervisor might need to know the accurate address and size of dtb passed from userspace. And then hypervisor would parse the dtb and get vm information. Signed-off-by: Jerry Wang Signed-off-by: Liju-clr Chen Signed-off-by: Yi-De Wu --- arch/arm64/geniezone/gzvm_arch_common.h | 2 ++ arch/arm64/geniezone/vm.c | 9 +++++++++ drivers/virt/geniezone/gzvm_vm.c | 10 ++++++++++ include/linux/gzvm_drv.h | 1 + include/uapi/linux/gzvm.h | 14 ++++++++++++++ 5 files changed, 36 insertions(+) diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h index 321f5dbcd616..82d2c44e819b 100644 --- a/arch/arm64/geniezone/gzvm_arch_common.h +++ b/arch/arm64/geniezone/gzvm_arch_common.h @@ -23,6 +23,7 @@ enum { GZVM_FUNC_ENABLE_CAP = 13, GZVM_FUNC_INFORM_EXIT = 14, GZVM_FUNC_MEMREGION_PURPOSE = 15, + GZVM_FUNC_SET_DTB_CONFIG = 16, NR_GZVM_FUNC, }; @@ -46,6 +47,7 @@ enum { #define MT_HVC_GZVM_ENABLE_CAP GZVM_HCALL_ID(GZVM_FUNC_ENABLE_CAP) #define MT_HVC_GZVM_INFORM_EXIT GZVM_HCALL_ID(GZVM_FUNC_INFORM_EXIT) #define MT_HVC_GZVM_MEMREGION_PURPOSE GZVM_HCALL_ID(GZVM_FUNC_MEMREGION_PURPOSE) +#define MT_HVC_GZVM_SET_DTB_CONFIG GZVM_HCALL_ID(GZVM_FUNC_SET_DTB_CONFIG) #define GIC_V3_NR_LRS 16 diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c index 1198f32f4785..cf18b607bc81 100644 --- a/arch/arm64/geniezone/vm.c +++ b/arch/arm64/geniezone/vm.c @@ -114,6 +114,15 @@ int gzvm_arch_memregion_purpose(struct gzvm *gzvm, mem->flags, 0, 0, 0, &res); } +int gzvm_arch_set_dtb_config(struct gzvm *gzvm, struct gzvm_dtb_config *cfg) +{ + struct arm_smccc_res res; + + return gzvm_hypcall_wrapper(MT_HVC_GZVM_SET_DTB_CONFIG, gzvm->vm_id, + cfg->dtb_addr, cfg->dtb_size, 0, 0, 0, 0, + &res); +} + static int gzvm_vm_arch_enable_cap(struct gzvm *gzvm, struct gzvm_enable_cap *cap, struct arm_smccc_res *res) diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index dfa03e850e8a..79fe6eb4196f 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -322,6 +322,16 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl, ret = gzvm_vm_ioctl_enable_cap(gzvm, &cap, argp); break; } + case GZVM_SET_DTB_CONFIG: { + struct gzvm_dtb_config cfg; + + if (copy_from_user(&cfg, argp, sizeof(cfg))) { + ret = -EFAULT; + goto out; + } + ret = gzvm_arch_set_dtb_config(gzvm, &cfg); + break; + } default: ret = -ENOTTY; } diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index d35ca179b688..43d85dc5d7c0 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -154,6 +154,7 @@ void gzvm_vm_irqfd_release(struct gzvm *gzvm); int gzvm_arch_memregion_purpose(struct gzvm *gzvm, struct gzvm_userspace_memory_region *mem); +int gzvm_arch_set_dtb_config(struct gzvm *gzvm, struct gzvm_dtb_config *args); int gzvm_init_ioeventfd(struct gzvm *gzvm); int gzvm_ioeventfd(struct gzvm *gzvm, struct gzvm_ioeventfd *args); diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h index ef433e311fa7..f4f04403d5b3 100644 --- a/include/uapi/linux/gzvm.h +++ b/include/uapi/linux/gzvm.h @@ -360,4 +360,18 @@ struct gzvm_ioeventfd { #define GZVM_IOEVENTFD _IOW(GZVM_IOC_MAGIC, 0x79, struct gzvm_ioeventfd) +/** + * struct gzvm_dtb_config: store address and size of dtb passed from userspace + * + * @dtb_addr: dtb address set by VMM (guset memory) + * @dtb_size: dtb size + */ +struct gzvm_dtb_config { + __u64 dtb_addr; + __u64 dtb_size; +}; + +#define GZVM_SET_DTB_CONFIG _IOW(GZVM_IOC_MAGIC, 0xff, \ + struct gzvm_dtb_config) + #endif /* __GZVM_H__ */ From patchwork Tue Sep 19 11:12:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 724453 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 68D063FB11 for ; Tue, 19 Sep 2023 11:12:57 +0000 (UTC) Received: from mailgw02.mediatek.com (unknown [210.61.82.184]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 91E7A118; Tue, 19 Sep 2023 04:12:54 -0700 (PDT) X-UUID: 743a7b5256dd11ee8051498923ad61e6-20230919 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=DUcoGtAYHoFI1ruAGw3mT00nT3wQL/qjXv5AAFn7SOs=; b=Ju9SvvvcdSn/4CyodrrSJZHUXR7ZVyIn1nNyoYN7F1qbhNizyf4l+aqinYhHRWyaT0BP4P6A93RAIAW6tgQwHLV+w44EzBPH48AQv2j/F8alJmmKYd+k2kIBPeqkimyKrBj7nznJdj+ivvBSEZuLUu/S4r74PGO6xxQuhtVE20k=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.32, REQID:c709b901-a9b7-4534-8b1a-3bd3f11c1f67, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:0,FILE:0,BULK:0,RULE:Release_Ham,ACTIO N:release,TS:-25 X-CID-META: VersionHash:5f78ec9, CLOUDID:372604bf-14cc-44ca-b657-2d2783296e72, B ulkID:nil,BulkQuantity:0,Recheck:0,SF:102,TC:nil,Content:0,EDM:-3,IP:nil,U RL:11|1,File:nil,Bulk:nil,QS:nil,BEC:nil,COL:0,OSI:0,OSA:0,AV:0,LES:1,SPR: NO,DKR:0,DKP:0,BRR:0,BRE:0 X-CID-BVR: 0,NGT X-CID-BAS: 0,NGT,0,_ X-CID-FACTOR: TF_CID_SPAM_SNR,TF_CID_SPAM_ULN X-UUID: 743a7b5256dd11ee8051498923ad61e6-20230919 Received: from mtkmbs14n1.mediatek.inc [(172.21.101.75)] by mailgw02.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 1034992928; Tue, 19 Sep 2023 19:12:43 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by mtkmbs13n2.mediatek.inc (172.21.101.108) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Tue, 19 Sep 2023 19:12:42 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Tue, 19 Sep 2023 19:12:41 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Matthias Brugger , AngeloGioacchino Del Regno CC: Arnd Bergmann , , , , , , David Bradil , Trilok Soni , Jade Shih , Ivan Tseng , My Chuang , Kevenny Hsieh , Willix Yeh , Liju Chen Subject: [PATCH v6 14/15] virt: geniezone: Add memory pin/unpin support Date: Tue, 19 Sep 2023 19:12:09 +0800 Message-ID: <20230919111210.19615-15-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230919111210.19615-1-yi-de.wu@mediatek.com> References: <20230919111210.19615-1-yi-de.wu@mediatek.com> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MTK: N X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,RDNS_NONE, SPF_HELO_PASS,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net From: "Jerry Wang" Protected VM's memory cannot be swapped out because the memory pages are protected from host access. Once host accesses to those protected pages, the hardware exception is triggered and may crash the host. So, we have to make those protected pages be ineligible for swapping or merging by the host kernel to avoid host access. To do so, we pin the page when it is assigned (donated) to VM and unpin when VM relinquish the pages or is destroyed. Besides, the protected VM’s memory requires hypervisor to clear the content before returning to host, but VMM may free those memory before clearing, it will result in those memory pages are reclaimed and reused before totally clearing. Using pin/unpin can also avoid the above problems. The implementation is described as follows. - Use rb_tree to store pinned memory pages. - Pin the page when handling page fault. - Unpin the pages when VM relinquish the pages or is destroyed. Signed-off-by: Jerry Wang Signed-off-by: Yingshiuan Pan Signed-off-by: Liju Chen Signed-off-by: Yi-De Wu --- drivers/virt/geniezone/gzvm_exception.c | 28 ++++++++ drivers/virt/geniezone/gzvm_mmu.c | 89 +++++++++++++++++++++++++ drivers/virt/geniezone/gzvm_vcpu.c | 7 +- drivers/virt/geniezone/gzvm_vm.c | 18 +++++ include/linux/gzvm_drv.h | 13 ++++ include/uapi/linux/gzvm.h | 5 ++ 6 files changed, 157 insertions(+), 3 deletions(-) diff --git a/drivers/virt/geniezone/gzvm_exception.c b/drivers/virt/geniezone/gzvm_exception.c index 31fdb4ae8db4..58d1ce305cf7 100644 --- a/drivers/virt/geniezone/gzvm_exception.c +++ b/drivers/virt/geniezone/gzvm_exception.c @@ -37,3 +37,31 @@ bool gzvm_handle_guest_exception(struct gzvm_vcpu *vcpu) else return false; } + +/** + * gzvm_handle_guest_hvc() - Handle guest hvc + * @vcpu: Pointer to struct gzvm_vcpu_run in userspace + * Return: + * * true - This hvc has been processed, no need to back to VMM. + * * false - This hvc has not been processed, require userspace. + */ +bool gzvm_handle_guest_hvc(struct gzvm_vcpu *vcpu) +{ + unsigned long ipa; + int ret; + + switch (vcpu->run->hypercall.args[0]) { + case GZVM_HVC_MEM_RELINQUISH: + ipa = vcpu->run->hypercall.args[1]; + ret = gzvm_handle_relinquish(vcpu, ipa); + break; + default: + ret = false; + break; + } + + if (!ret) + return true; + else + return false; +} diff --git a/drivers/virt/geniezone/gzvm_mmu.c b/drivers/virt/geniezone/gzvm_mmu.c index 4fdbdd8e809d..542599f104db 100644 --- a/drivers/virt/geniezone/gzvm_mmu.c +++ b/drivers/virt/geniezone/gzvm_mmu.c @@ -107,6 +107,78 @@ int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn, return 0; } +static int cmp_ppages(struct rb_node *node, const struct rb_node *parent) +{ + struct gzvm_pinned_page *a = container_of(node, + struct gzvm_pinned_page, + node); + struct gzvm_pinned_page *b = container_of(parent, + struct gzvm_pinned_page, + node); + + if (a->ipa < b->ipa) + return -1; + if (a->ipa > b->ipa) + return 1; + return 0; +} + +static int rb_ppage_cmp(const void *key, const struct rb_node *node) +{ + struct gzvm_pinned_page *p = container_of(node, + struct gzvm_pinned_page, + node); + phys_addr_t ipa = (phys_addr_t)key; + + return (ipa < p->ipa) ? -1 : (ipa > p->ipa); +} + +static int gzvm_insert_ppage(struct gzvm *vm, struct gzvm_pinned_page *ppage) +{ + if (rb_find_add(&ppage->node, &vm->pinned_pages, cmp_ppages)) + return -EEXIST; + return 0; +} + +static int pin_one_page(unsigned long hva, struct page **page) +{ + struct mm_struct *mm = current->mm; + unsigned int flags = FOLL_HWPOISON | FOLL_LONGTERM | FOLL_WRITE; + + mmap_read_lock(mm); + pin_user_pages(hva, 1, flags, page); + mmap_read_unlock(mm); + + return 0; +} + +/** + * gzvm_handle_relinquish() - Handle memory relinquish request from hypervisor + * + * @vcpu: Pointer to struct gzvm_vcpu_run in userspace + * @ipa: Start address(gpa) of a reclaimed page + * + * Return: Always return 0 because there are no cases of failure + */ +int gzvm_handle_relinquish(struct gzvm_vcpu *vcpu, phys_addr_t ipa) +{ + struct gzvm_pinned_page *ppage; + struct rb_node *node; + struct gzvm *vm = vcpu->gzvm; + + node = rb_find((void *)ipa, &vm->pinned_pages, rb_ppage_cmp); + + if (node) + rb_erase(node, &vm->pinned_pages); + else + return 0; + + ppage = container_of(node, struct gzvm_pinned_page, node); + unpin_user_pages_dirty_lock(&ppage->page, 1, true); + kfree(ppage); + return 0; +} + /** * gzvm_handle_page_fault() - Handle guest page fault, find corresponding page * for the faulting gpa @@ -118,7 +190,10 @@ int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn, */ int gzvm_handle_page_fault(struct gzvm_vcpu *vcpu) { + struct gzvm_pinned_page *ppage = NULL; struct gzvm *vm = vcpu->gzvm; + struct page *page = NULL; + unsigned long hva; u64 pfn, gfn; int memslot_id; int ret; @@ -136,5 +211,19 @@ int gzvm_handle_page_fault(struct gzvm_vcpu *vcpu) if (unlikely(ret)) return -EFAULT; + hva = gzvm_gfn_to_hva_memslot(&vm->memslot[memslot_id], gfn); + pin_one_page(hva, &page); + + if (!page) + return -EFAULT; + + ppage = kmalloc(sizeof(*ppage), GFP_KERNEL_ACCOUNT); + if (!ppage) + return -ENOMEM; + + ppage->page = page; + ppage->ipa = vcpu->run->exception.fault_gpa; + gzvm_insert_ppage(vm, ppage); + return 0; } diff --git a/drivers/virt/geniezone/gzvm_vcpu.c b/drivers/virt/geniezone/gzvm_vcpu.c index 0c62fe7f2c37..c3d84644c3eb 100644 --- a/drivers/virt/geniezone/gzvm_vcpu.c +++ b/drivers/virt/geniezone/gzvm_vcpu.c @@ -36,8 +36,7 @@ static long gzvm_vcpu_update_one_reg(struct gzvm_vcpu *vcpu, return -EINVAL; if (is_write) { - if (vcpu->vcpuid > 0) - return -EPERM; + /* GZ hypervisor would filter out invalid vcpu register access */ if (copy_from_user(&data, reg_addr, reg_size)) return -EFAULT; } else { @@ -119,7 +118,9 @@ static long gzvm_vcpu_run(struct gzvm_vcpu *vcpu, void * __user argp) need_userspace = true; break; case GZVM_EXIT_HYPERCALL: - fallthrough; + if (!gzvm_handle_guest_hvc(vcpu)) + need_userspace = true; + break; case GZVM_EXIT_DEBUG: fallthrough; case GZVM_EXIT_FAIL_ENTRY: diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index 9f7e44521de5..747e5cf523bf 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -292,6 +292,21 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl, return ret; } +static void gzvm_destroy_ppage(struct gzvm *gzvm) +{ + struct gzvm_pinned_page *ppage; + struct rb_node *node; + + node = rb_first(&gzvm->pinned_pages); + while (node) { + ppage = rb_entry(node, struct gzvm_pinned_page, node); + unpin_user_pages_dirty_lock(&ppage->page, 1, true); + node = rb_next(node); + rb_erase(&ppage->node, &gzvm->pinned_pages); + kfree(ppage); + } +} + static void gzvm_destroy_vm(struct gzvm *gzvm) { pr_debug("VM-%u is going to be destroyed\n", gzvm->vm_id); @@ -308,6 +323,8 @@ static void gzvm_destroy_vm(struct gzvm *gzvm) mutex_unlock(&gzvm->lock); + gzvm_destroy_ppage(gzvm); + kfree(gzvm); } @@ -343,6 +360,7 @@ static struct gzvm *gzvm_create_vm(unsigned long vm_type) gzvm->vm_id = ret; gzvm->mm = current->mm; mutex_init(&gzvm->lock); + gzvm->pinned_pages = RB_ROOT; ret = gzvm_vm_irqfd_init(gzvm); if (ret) { diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index b9e60fe5dcde..77c839c02b17 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -12,6 +12,8 @@ #include #include #include +#include +#include /* * For the normal physical address, the highest 12 bits should be zero, so we @@ -80,6 +82,12 @@ struct gzvm_vcpu { struct gzvm_vcpu_hwstate *hwstate; }; +struct gzvm_pinned_page { + struct rb_node node; + struct page *page; + u64 ipa; +}; + struct gzvm { struct gzvm_vcpu *vcpus[GZVM_MAX_VCPUS]; /* userspace tied to this vm */ @@ -106,6 +114,9 @@ struct gzvm { struct srcu_struct irq_srcu; /* lock for irq injection */ struct mutex irq_lock; + + /* Use rb-tree to record pin/unpin page */ + struct rb_root pinned_pages; }; long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args); @@ -147,6 +158,8 @@ int gzvm_arch_inform_exit(u16 vm_id); int gzvm_find_memslot(struct gzvm *vm, u64 gpa); int gzvm_handle_page_fault(struct gzvm_vcpu *vcpu); bool gzvm_handle_guest_exception(struct gzvm_vcpu *vcpu); +bool gzvm_handle_guest_hvc(struct gzvm_vcpu *vcpu); +int gzvm_handle_relinquish(struct gzvm_vcpu *vcpu, phys_addr_t ipa); int gzvm_arch_create_device(u16 vm_id, struct gzvm_create_device *gzvm_dev); int gzvm_arch_inject_irq(struct gzvm *gzvm, unsigned int vcpu_idx, diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h index 1f134c55ac2a..edac8e3ae790 100644 --- a/include/uapi/linux/gzvm.h +++ b/include/uapi/linux/gzvm.h @@ -191,6 +191,11 @@ enum { GZVM_EXCEPTION_PAGE_FAULT = 0x1, }; +/* hypercall definitions of GZVM_EXIT_HYPERCALL */ +enum { + GZVM_HVC_MEM_RELINQUISH = 0xc6000009, +}; + /** * struct gzvm_vcpu_run: Same purpose as kvm_run, this struct is * shared between userspace, kernel and From patchwork Tue Sep 19 11:12:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi-De Wu X-Patchwork-Id: 724454 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 728ED38BCA for ; Tue, 19 Sep 2023 11:12:56 +0000 (UTC) Received: from mailgw02.mediatek.com (unknown [210.61.82.184]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 23353A9; Tue, 19 Sep 2023 04:12:52 -0700 (PDT) X-UUID: 740394f256dd11ee8051498923ad61e6-20230919 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mediatek.com; s=dk; h=Content-Type:MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From; bh=VD1e+qXEqMFBPXW2F3jMvAoTbp/+vpAF8wQLl5umujY=; b=Rue72yjuYHUirBgDaMz6HrbxrPAOm4QNJuHppO8gQlzi84jAIy1V33b7vdINLoXTj6Z2vuIKWUEhkKKcqwdA1M0lQBn528Bng807N8m4uS9MQ3+18oZ3bplDs2jQtwJkyFBtvht/+7Cw+3AVpnfnERJrVLhtcUm0DmBo7fKXXkg=; X-CID-P-RULE: Release_Ham X-CID-O-INFO: VERSION:1.1.32, REQID:bf092059-1912-42d0-81b9-8893abf473f3, IP:0, U RL:0,TC:0,Content:-25,EDM:0,RT:0,SF:0,FILE:0,BULK:0,RULE:Release_Ham,ACTIO N:release,TS:-25 X-CID-META: VersionHash:5f78ec9, CLOUDID:382604bf-14cc-44ca-b657-2d2783296e72, B ulkID:nil,BulkQuantity:0,Recheck:0,SF:102,TC:nil,Content:0,EDM:-3,IP:nil,U RL:0,File:nil,Bulk:nil,QS:nil,BEC:nil,COL:0,OSI:0,OSA:0,AV:0,LES:1,SPR:NO, DKR:0,DKP:0,BRR:0,BRE:0 X-CID-BVR: 0 X-CID-BAS: 0,_,0,_ X-CID-FACTOR: TF_CID_SPAM_SNR X-UUID: 740394f256dd11ee8051498923ad61e6-20230919 Received: from mtkmbs10n2.mediatek.inc [(172.21.101.183)] by mailgw02.mediatek.com (envelope-from ) (Generic MTA with TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 256/256) with ESMTP id 1569459223; Tue, 19 Sep 2023 19:12:42 +0800 Received: from mtkmbs13n1.mediatek.inc (172.21.101.193) by mtkmbs13n1.mediatek.inc (172.21.101.193) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.26; Tue, 19 Sep 2023 19:12:41 +0800 Received: from mtksdccf07.mediatek.inc (172.21.84.99) by mtkmbs13n1.mediatek.inc (172.21.101.73) with Microsoft SMTP Server id 15.2.1118.26 via Frontend Transport; Tue, 19 Sep 2023 19:12:41 +0800 From: Yi-De Wu To: Yingshiuan Pan , Ze-Yu Wang , Yi-De Wu , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Jonathan Corbet , Catalin Marinas , Will Deacon , Matthias Brugger , AngeloGioacchino Del Regno CC: Arnd Bergmann , , , , , , David Bradil , Trilok Soni , Jade Shih , Ivan Tseng , My Chuang , Kevenny Hsieh , Willix Yeh , Liju Chen Subject: [PATCH v6 15/15] virt: geniezone: Add block-based demand paging support Date: Tue, 19 Sep 2023 19:12:10 +0800 Message-ID: <20230919111210.19615-16-yi-de.wu@mediatek.com> X-Mailer: git-send-email 2.18.0 In-Reply-To: <20230919111210.19615-1-yi-de.wu@mediatek.com> References: <20230919111210.19615-1-yi-de.wu@mediatek.com> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MTK: N X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,RDNS_NONE, SPF_HELO_PASS,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net From: "Yingshiuan Pan" To balance memory usage and performance, GenieZone supports larger granularity demand paging, called block-based demand paging. Gzvm driver uses enable_cap to query the hypervisor if it supports block-based demand paging and the given granularity or not. Meanwhile, the gzvm driver allocates a shared buffer for storing the physical pages later. If the hypervisor supports, every time the gzvm driver handles guest page faults, it allocates more memory in advance (default: 2MB) for demand paging. And fills those physical pages into the allocated shared memory, then calls the hypervisor to map to guest's memory. The physical pages allocated for block-based demand paging is not necessary to be contiguous because in many cases, 2MB block is not followed. 1st, the memory is allocated because of VMM's page fault (VMM loads kernel image to guest memory before running). In this case, the page is allocated by the host kernel and using PAGE_SIZE. 2nd is that guest may return memory to host via ballooning and that is still 4KB (or PAGE_SIZE) granularity. Therefore, we do not have to allocate physically contiguous 2MB pages. Signed-off-by: Yingshiuan Pan Signed-off-by: Liju Chen Signed-off-by: Yi-De Wu --- arch/arm64/geniezone/gzvm_arch_common.h | 2 + arch/arm64/geniezone/vm.c | 19 +++- drivers/virt/geniezone/gzvm_mmu.c | 110 +++++++++++++++++------- drivers/virt/geniezone/gzvm_vm.c | 49 +++++++++++ include/linux/gzvm_drv.h | 16 ++++ include/uapi/linux/gzvm.h | 2 + 6 files changed, 164 insertions(+), 34 deletions(-) diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h index c21eb8ca8d4b..f57f799bbbc9 100644 --- a/arch/arm64/geniezone/gzvm_arch_common.h +++ b/arch/arm64/geniezone/gzvm_arch_common.h @@ -25,6 +25,7 @@ enum { GZVM_FUNC_MEMREGION_PURPOSE = 15, GZVM_FUNC_SET_DTB_CONFIG = 16, GZVM_FUNC_MAP_GUEST = 17, + GZVM_FUNC_MAP_GUEST_BLOCK = 18, NR_GZVM_FUNC, }; @@ -50,6 +51,7 @@ enum { #define MT_HVC_GZVM_MEMREGION_PURPOSE GZVM_HCALL_ID(GZVM_FUNC_MEMREGION_PURPOSE) #define MT_HVC_GZVM_SET_DTB_CONFIG GZVM_HCALL_ID(GZVM_FUNC_SET_DTB_CONFIG) #define MT_HVC_GZVM_MAP_GUEST GZVM_HCALL_ID(GZVM_FUNC_MAP_GUEST) +#define MT_HVC_GZVM_MAP_GUEST_BLOCK GZVM_HCALL_ID(GZVM_FUNC_MAP_GUEST_BLOCK) #define GIC_V3_NR_LRS 16 diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c index d236b6cf84b3..d1948045ed72 100644 --- a/arch/arm64/geniezone/vm.c +++ b/arch/arm64/geniezone/vm.c @@ -299,10 +299,11 @@ static int gzvm_vm_ioctl_cap_pvm(struct gzvm *gzvm, fallthrough; case GZVM_CAP_PVM_SET_PROTECTED_VM: /* - * To improve performance for protected VM, we have to populate VM's memory - * before VM booting + * If the hypervisor doesn't support block-based demand paging, we + * populate memory in advance to improve performance for protected VM. */ - populate_mem_region(gzvm); + if (gzvm->demand_page_gran == PAGE_SIZE) + populate_mem_region(gzvm); ret = gzvm_vm_arch_enable_cap(gzvm, cap, &res); return ret; case GZVM_CAP_PVM_GET_PVMFW_SIZE: @@ -319,12 +320,16 @@ int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm, struct gzvm_enable_cap *cap, void __user *argp) { + struct arm_smccc_res res = {0}; int ret; switch (cap->cap) { case GZVM_CAP_PROTECTED_VM: ret = gzvm_vm_ioctl_cap_pvm(gzvm, cap, argp); return ret; + case GZVM_CAP_BLOCK_BASED_DEMAND_PAGING: + ret = gzvm_vm_arch_enable_cap(gzvm, cap, &res); + return ret; default: break; } @@ -365,3 +370,11 @@ int gzvm_arch_map_guest(u16 vm_id, int memslot_id, u64 pfn, u64 gfn, return gzvm_hypcall_wrapper(MT_HVC_GZVM_MAP_GUEST, vm_id, memslot_id, pfn, gfn, nr_pages, 0, 0, &res); } + +int gzvm_arch_map_guest_block(u16 vm_id, int memslot_id, u64 gfn) +{ + struct arm_smccc_res res; + + return gzvm_hypcall_wrapper(MT_HVC_GZVM_MAP_GUEST_BLOCK, vm_id, + memslot_id, gfn, 0, 0, 0, 0, &res); +} diff --git a/drivers/virt/geniezone/gzvm_mmu.c b/drivers/virt/geniezone/gzvm_mmu.c index 542599f104db..02b9d89d482d 100644 --- a/drivers/virt/geniezone/gzvm_mmu.c +++ b/drivers/virt/geniezone/gzvm_mmu.c @@ -140,15 +140,30 @@ static int gzvm_insert_ppage(struct gzvm *vm, struct gzvm_pinned_page *ppage) return 0; } -static int pin_one_page(unsigned long hva, struct page **page) +static int pin_one_page(struct gzvm *vm, unsigned long hva, u64 gpa) { - struct mm_struct *mm = current->mm; unsigned int flags = FOLL_HWPOISON | FOLL_LONGTERM | FOLL_WRITE; + struct gzvm_pinned_page *ppage = NULL; + struct mm_struct *mm = current->mm; + struct page *page = NULL; + + ppage = kmalloc(sizeof(*ppage), GFP_KERNEL_ACCOUNT); + if (!ppage) + return -ENOMEM; mmap_read_lock(mm); - pin_user_pages(hva, 1, flags, page); + pin_user_pages(hva, 1, flags, &page); mmap_read_unlock(mm); + if (!page) { + kfree(ppage); + return -EFAULT; + } + + ppage->page = page; + ppage->ipa = gpa; + gzvm_insert_ppage(vm, ppage); + return 0; } @@ -179,6 +194,62 @@ int gzvm_handle_relinquish(struct gzvm_vcpu *vcpu, phys_addr_t ipa) return 0; } +static int handle_block_demand_page(struct gzvm *vm, int memslot_id, u64 gfn) +{ + u64 pfn, __gfn, start_gfn; + u32 nr_entries; + unsigned long hva; + int ret, i; + + nr_entries = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE / PAGE_SIZE; + start_gfn = ALIGN_DOWN(gfn, nr_entries); + + mutex_lock(&vm->demand_paging_lock); + for (i = 0, __gfn = start_gfn; i < nr_entries; i++, __gfn++) { + ret = gzvm_gfn_to_pfn_memslot(&vm->memslot[memslot_id], __gfn, + &pfn); + if (unlikely(ret)) { + ret = -ERR_FAULT; + goto err_unlock; + } + vm->demand_page_buffer[i] = pfn; + + hva = gzvm_gfn_to_hva_memslot(&vm->memslot[memslot_id], __gfn); + ret = pin_one_page(vm, hva, PFN_PHYS(__gfn)); + if (ret) + goto err_unlock; + } + + ret = gzvm_arch_map_guest_block(vm->vm_id, memslot_id, start_gfn); + if (unlikely(ret)) { + ret = -EFAULT; + goto err_unlock; + } + +err_unlock: + mutex_unlock(&vm->demand_paging_lock); + + return ret; +} + +static int handle_single_demand_page(struct gzvm *vm, int memslot_id, u64 gfn) +{ + unsigned long hva; + int ret; + u64 pfn; + + ret = gzvm_gfn_to_pfn_memslot(&vm->memslot[memslot_id], gfn, &pfn); + if (unlikely(ret)) + return -EFAULT; + + ret = gzvm_arch_map_guest(vm->vm_id, memslot_id, pfn, gfn, 1); + if (unlikely(ret)) + return -EFAULT; + + hva = gzvm_gfn_to_hva_memslot(&vm->memslot[memslot_id], gfn); + return pin_one_page(vm, hva, PFN_PHYS(gfn)); +} + /** * gzvm_handle_page_fault() - Handle guest page fault, find corresponding page * for the faulting gpa @@ -190,40 +261,17 @@ int gzvm_handle_relinquish(struct gzvm_vcpu *vcpu, phys_addr_t ipa) */ int gzvm_handle_page_fault(struct gzvm_vcpu *vcpu) { - struct gzvm_pinned_page *ppage = NULL; struct gzvm *vm = vcpu->gzvm; - struct page *page = NULL; - unsigned long hva; - u64 pfn, gfn; int memslot_id; - int ret; + u64 gfn; gfn = PHYS_PFN(vcpu->run->exception.fault_gpa); memslot_id = gzvm_find_memslot(vm, gfn); if (unlikely(memslot_id < 0)) return -EFAULT; - ret = gzvm_gfn_to_pfn_memslot(&vm->memslot[memslot_id], gfn, &pfn); - if (unlikely(ret)) - return -EFAULT; - - ret = gzvm_arch_map_guest(vm->vm_id, memslot_id, pfn, gfn, 1); - if (unlikely(ret)) - return -EFAULT; - - hva = gzvm_gfn_to_hva_memslot(&vm->memslot[memslot_id], gfn); - pin_one_page(hva, &page); - - if (!page) - return -EFAULT; - - ppage = kmalloc(sizeof(*ppage), GFP_KERNEL_ACCOUNT); - if (!ppage) - return -ENOMEM; - - ppage->page = page; - ppage->ipa = vcpu->run->exception.fault_gpa; - gzvm_insert_ppage(vm, ppage); - - return 0; + if (vm->demand_page_gran == PAGE_SIZE) + return handle_single_demand_page(vm, memslot_id, gfn); + else + return handle_block_demand_page(vm, memslot_id, gfn); } diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c index 747e5cf523bf..a7d43bedfad0 100644 --- a/drivers/virt/geniezone/gzvm_vm.c +++ b/drivers/virt/geniezone/gzvm_vm.c @@ -309,6 +309,8 @@ static void gzvm_destroy_ppage(struct gzvm *gzvm) static void gzvm_destroy_vm(struct gzvm *gzvm) { + size_t allocated_size; + pr_debug("VM-%u is going to be destroyed\n", gzvm->vm_id); mutex_lock(&gzvm->lock); @@ -321,6 +323,11 @@ static void gzvm_destroy_vm(struct gzvm *gzvm) list_del(&gzvm->vm_list); mutex_unlock(&gzvm_list_lock); + if (gzvm->demand_page_buffer) { + allocated_size = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE / PAGE_SIZE * sizeof(u64); + free_pages_exact(gzvm->demand_page_buffer, allocated_size); + } + mutex_unlock(&gzvm->lock); gzvm_destroy_ppage(gzvm); @@ -342,6 +349,46 @@ static const struct file_operations gzvm_vm_fops = { .llseek = noop_llseek, }; +/** + * setup_vm_demand_paging - Query hypervisor suitable demand page size and set + * @vm: gzvm instance for setting up demand page size + * + * Return: void + */ +static void setup_vm_demand_paging(struct gzvm *vm) +{ + u32 buf_size = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE / PAGE_SIZE * sizeof(u64); + struct gzvm_enable_cap cap = {0}; + void *buffer; + int ret; + + mutex_init(&vm->demand_paging_lock); + buffer = alloc_pages_exact(buf_size, GFP_KERNEL); + if (!buffer) { + /* Fall back to use default page size for demand paging */ + vm->demand_page_gran = PAGE_SIZE; + vm->demand_page_buffer = NULL; + return; + } + + cap.cap = GZVM_CAP_BLOCK_BASED_DEMAND_PAGING; + cap.args[0] = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE; + cap.args[1] = (__u64)virt_to_phys(buffer); + /* demand_page_buffer is freed when destroy VM */ + vm->demand_page_buffer = buffer; + + ret = gzvm_vm_ioctl_enable_cap(vm, &cap, NULL); + if (ret == 0) { + vm->demand_page_gran = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE; + /* freed when destroy vm */ + vm->demand_page_buffer = buffer; + } else { + vm->demand_page_gran = PAGE_SIZE; + vm->demand_page_buffer = NULL; + free_pages_exact(buffer, buf_size); + } +} + static struct gzvm *gzvm_create_vm(unsigned long vm_type) { int ret; @@ -376,6 +423,8 @@ static struct gzvm *gzvm_create_vm(unsigned long vm_type) return ERR_PTR(ret); } + setup_vm_demand_paging(gzvm); + mutex_lock(&gzvm_list_lock); list_add(&gzvm->vm_list, &gzvm_list); mutex_unlock(&gzvm_list_lock); diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h index 77c839c02b17..a6994a1c391a 100644 --- a/include/linux/gzvm_drv.h +++ b/include/linux/gzvm_drv.h @@ -46,6 +46,8 @@ #define GZVM_VCPU_RUN_MAP_SIZE (PAGE_SIZE * 2) +#define GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE (2 * 1024 * 1024) /* 2MB */ + /* struct mem_region_addr_range - Identical to ffa memory constituent */ struct mem_region_addr_range { /* the base IPA of the constituent memory region, aligned to 4 kiB */ @@ -117,6 +119,19 @@ struct gzvm { /* Use rb-tree to record pin/unpin page */ struct rb_root pinned_pages; + + /* + * demand page granularity: how much memory we allocate for VM in a + * single page fault + */ + u32 demand_page_gran; + /* the mailbox for transferring large portion pages */ + u64 *demand_page_buffer; + /* + * lock for preventing multiple cpu using the same demand page mailbox + * at the same time + */ + struct mutex demand_paging_lock; }; long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args); @@ -137,6 +152,7 @@ int gzvm_arch_create_vm(unsigned long vm_type); int gzvm_arch_destroy_vm(u16 vm_id); int gzvm_arch_map_guest(u16 vm_id, int memslot_id, u64 pfn, u64 gfn, u64 nr_pages); +int gzvm_arch_map_guest_block(u16 vm_id, int memslot_id, u64 gfn); int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm, struct gzvm_enable_cap *cap, void __user *argp); diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h index edac8e3ae790..a98787a18c36 100644 --- a/include/uapi/linux/gzvm.h +++ b/include/uapi/linux/gzvm.h @@ -18,6 +18,8 @@ #define GZVM_CAP_VM_GPA_SIZE 0xa5 #define GZVM_CAP_PROTECTED_VM 0xffbadab1 +/* query hypervisor supported block-based demand page */ +#define GZVM_CAP_BLOCK_BASED_DEMAND_PAGING 0x9201 /* sub-commands put in args[0] for GZVM_CAP_PROTECTED_VM */ #define GZVM_CAP_PVM_SET_PVMFW_GPA 0