From patchwork Tue Mar 18 16:20:40 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 874484
Date: Tue, 18 Mar 2025 16:20:40 +0000
In-Reply-To: <20250318162046.4016367-1-tabba@google.com>
Message-ID: <20250318162046.4016367-2-tabba@google.com>
Subject: [PATCH v6 1/7] KVM: guest_memfd: Make guest mem use guest mem inodes
 instead of anonymous inodes
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
 vannapurve@google.com, ackerleytng@google.com, mail@maciej.szmigiero.name,
 david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
 liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
 james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
 maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com, tabba@google.com

From: Ackerley Tng

Using guest mem inodes allows us to store metadata for the backing memory
on the inode. Metadata will be added in a later patch to support HugeTLB
pages.

Metadata about backing memory should not be stored on the file, since the
file represents a guest_memfd's binding with a struct kvm, and metadata
about backing memory is not unique to a specific binding and struct kvm.

Signed-off-by: Fuad Tabba
Signed-off-by: Ackerley Tng
---
 include/uapi/linux/magic.h |   1 +
 virt/kvm/guest_memfd.c     | 130 +++++++++++++++++++++++++++++++------
 2 files changed, 111 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index bb575f3ab45e..169dba2a6920 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -103,5 +103,6 @@
 #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
 #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
 #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
+#define GUEST_MEMORY_MAGIC	0x474d454d	/* "GMEM" */
 
 #endif /* __LINUX_MAGIC_H__ */
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index fbf89e643add..844e70c82558 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1,12 +1,16 @@
 // SPDX-License-Identifier: GPL-2.0
+#include <linux/fs.h>
 #include <linux/backing-dev.h>
 #include <linux/falloc.h>
 #include <linux/kvm_host.h>
+#include <linux/pseudo_fs.h>
 #include <linux/pagemap.h>
 #include <linux/anon_inodes.h>
 
 #include "kvm_mm.h"
 
+static struct vfsmount *kvm_gmem_mnt;
+
 struct kvm_gmem {
 	struct kvm *kvm;
 	struct xarray bindings;
@@ -320,6 +324,38 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gfn - slot->base_gfn + slot->gmem.pgoff;
 }
 
+static const struct super_operations kvm_gmem_super_operations = {
+	.statfs		= simple_statfs,
+};
+
+static int kvm_gmem_init_fs_context(struct fs_context *fc)
+{
+	struct pseudo_fs_context *ctx;
+
+	if (!init_pseudo(fc, GUEST_MEMORY_MAGIC))
+		return -ENOMEM;
+
+	ctx = fc->fs_private;
+	ctx->ops = &kvm_gmem_super_operations;
+
+	return 0;
+}
+
+static struct file_system_type kvm_gmem_fs = {
+	.name		 = "kvm_guest_memory",
+	.init_fs_context = kvm_gmem_init_fs_context,
+	.kill_sb	 = kill_anon_super,
+};
+
+static void kvm_gmem_init_mount(void)
+{
+	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
+	BUG_ON(IS_ERR(kvm_gmem_mnt));
+
+	/* For giggles. Userspace can never map this anyways. */
+	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
+}
+
 #ifdef CONFIG_KVM_GMEM_SHARED_MEM
 static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
 {
@@ -430,6 +466,8 @@ static struct file_operations kvm_gmem_fops = {
 void kvm_gmem_init(struct module *module)
 {
 	kvm_gmem_fops.owner = module;
+
+	kvm_gmem_init_mount();
 }
 
 static int kvm_gmem_migrate_folio(struct address_space *mapping,
@@ -511,11 +549,79 @@ static const struct inode_operations kvm_gmem_iops = {
 	.setattr	= kvm_gmem_setattr,
 };
 
+static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
+						      loff_t size, u64 flags)
+{
+	const struct qstr qname = QSTR_INIT(name, strlen(name));
+	struct inode *inode;
+	int err;
+
+	inode = alloc_anon_inode(kvm_gmem_mnt->mnt_sb);
+	if (IS_ERR(inode))
+		return inode;
+
+	err = security_inode_init_security_anon(inode, &qname, NULL);
+	if (err) {
+		iput(inode);
+		return ERR_PTR(err);
+	}
+
+	inode->i_private = (void *)(unsigned long)flags;
+	inode->i_op = &kvm_gmem_iops;
+	inode->i_mapping->a_ops = &kvm_gmem_aops;
+	inode->i_mode |= S_IFREG;
+	inode->i_size = size;
+	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+	mapping_set_inaccessible(inode->i_mapping);
+	/* Unmovable mappings are supposed to be marked unevictable as well. */
+	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+	return inode;
+}
+
+static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
+						  u64 flags)
+{
+	static const char *name = "[kvm-gmem]";
+	struct inode *inode;
+	struct file *file;
+	int err;
+
+	err = -ENOENT;
+	if (!try_module_get(kvm_gmem_fops.owner))
+		goto err;
+
+	inode = kvm_gmem_inode_make_secure_inode(name, size, flags);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		goto err_put_module;
+	}
+
+	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
+				 &kvm_gmem_fops);
+	if (IS_ERR(file)) {
+		err = PTR_ERR(file);
+		goto err_put_inode;
+	}
+
+	file->f_flags |= O_LARGEFILE;
+	file->private_data = priv;
+
+out:
+	return file;
+
+err_put_inode:
+	iput(inode);
+err_put_module:
+	module_put(kvm_gmem_fops.owner);
+err:
+	file = ERR_PTR(err);
+	goto out;
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
-	const char *anon_name = "[kvm-gmem]";
 	struct kvm_gmem *gmem;
-	struct inode *inode;
 	struct file *file;
 	int fd, err;
 
@@ -529,32 +635,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_fd;
 	}
 
-	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
-					 O_RDWR, NULL);
+	file = kvm_gmem_inode_create_getfile(gmem, size, flags);
 	if (IS_ERR(file)) {
 		err = PTR_ERR(file);
 		goto err_gmem;
 	}
 
-	file->f_flags |= O_LARGEFILE;
-
-	inode = file->f_inode;
-	WARN_ON(file->f_mapping != inode->i_mapping);
-
-	inode->i_private = (void *)(unsigned long)flags;
-	inode->i_op = &kvm_gmem_iops;
-	inode->i_mapping->a_ops = &kvm_gmem_aops;
-	inode->i_mode |= S_IFREG;
-	inode->i_size = size;
-	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
-	mapping_set_inaccessible(inode->i_mapping);
-	/* Unmovable mappings are supposed to be marked unevictable as well. */
-	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
-
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
-	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
+	list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);
 
 	fd_install(fd, file);
 	return fd;
From patchwork Tue Mar 18 16:20:41 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 874483
Date: Tue, 18 Mar 2025 16:20:41 +0000
In-Reply-To: <20250318162046.4016367-1-tabba@google.com>
Message-ID: <20250318162046.4016367-3-tabba@google.com>
Subject: [PATCH v6 2/7] KVM: guest_memfd: Introduce kvm_gmem_get_pfn_locked(),
 which retains the folio lock
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Create a new variant of kvm_gmem_get_pfn(), which retains the folio lock if
it returns successfully. This is needed in subsequent patches to protect
against races when checking whether a folio can be shared with the host.

Signed-off-by: Fuad Tabba
---
 include/linux/kvm_host.h | 11 +++++++++++
 virt/kvm/guest_memfd.c   | 27 ++++++++++++++++++++-------
 2 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ec3bedc18eab..bc73d7426363 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2535,6 +2535,9 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
 		     int *max_order);
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+			    int *max_order);
 #else
 static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn,
@@ -2544,6 +2547,14 @@ static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 	KVM_BUG_ON(1, kvm);
 	return -EIO;
 }
+static inline int kvm_gmem_get_pfn_locked(struct kvm *kvm,
+					  struct kvm_memory_slot *slot,
+					  gfn_t gfn, kvm_pfn_t *pfn,
+					  struct page **page, int *max_order)
+{
+	KVM_BUG_ON(1, kvm);
+	return -EIO;
+}
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
 #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_PREPARE
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 844e70c82558..ac6b8853699d 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -802,9 +802,9 @@ static struct folio *__kvm_gmem_get_pfn(struct file *file,
 	return folio;
 }
 
-int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
-		     int *max_order)
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+			    int *max_order)
 {
 	pgoff_t index = kvm_gmem_get_index(slot, gfn);
 	struct file *file = kvm_gmem_get_file(slot);
@@ -824,17 +824,30 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 	if (!is_prepared)
 		r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
 
-	folio_unlock(folio);
-
-	if (!r)
+	if (!r) {
 		*page = folio_file_page(folio, index);
-	else
+	} else {
+		folio_unlock(folio);
 		folio_put(folio);
+	}
 
 out:
 	fput(file);
 	return r;
 }
+EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn_locked);
+
+int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+		     int *max_order)
+{
+	int r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, pfn, page, max_order);
+
+	if (!r)
+		unlock_page(*page);
+
+	return r;
+}
 EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn);
 
 #ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM
From patchwork Tue Mar 18 16:20:45 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 874482
Date: Tue, 18 Mar 2025 16:20:45 +0000
In-Reply-To: <20250318162046.4016367-1-tabba@google.com>
Message-ID: <20250318162046.4016367-7-tabba@google.com>
Subject: [PATCH v6 6/7] KVM: guest_memfd: Handle invalidation of shared memory
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

When guest_memfd-backed memory is invalidated, e.g., when punching holes or
on release, ensure that the sharing states are updated and that any folios
in a transient state are restored to an appropriate state.

Signed-off-by: Fuad Tabba
---
 virt/kvm/guest_memfd.c | 56 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 4fd9e5760503..0487a08615f0 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -117,6 +117,16 @@ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
 	return filemap_grab_folio(inode->i_mapping, index);
 }
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+static void kvm_gmem_offset_range_invalidate_shared(struct inode *inode,
+						    pgoff_t start, pgoff_t end);
+#else
+static inline void kvm_gmem_offset_range_invalidate_shared(struct inode *inode,
+							   pgoff_t start, pgoff_t end)
+{
+}
+#endif
+
 static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
 				      pgoff_t end)
 {
@@ -126,6 +136,7 @@ static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
 	unsigned long index;
 
 	xa_for_each_range(&gmem->bindings, index, slot, start, end - 1) {
+		struct file *file = READ_ONCE(slot->gmem.file);
 		pgoff_t pgoff = slot->gmem.pgoff;
 
 		struct kvm_gfn_range gfn_range = {
@@ -145,6 +156,16 @@ static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
 		}
 
 		flush |= kvm_mmu_unmap_gfn_range(kvm, &gfn_range);
+
+		/*
+		 * If this gets called after kvm_gmem_unbind() it means that all
+		 * in-flight operations are gone, and the file has been closed.
+		 */
+		if (file) {
+			kvm_gmem_offset_range_invalidate_shared(file_inode(file),
+								gfn_range.start,
+								gfn_range.end);
+		}
 	}
 
 	if (flush)
@@ -509,6 +530,41 @@ static int kvm_gmem_offset_clear_shared(struct inode *inode, pgoff_t index)
 	return r;
 }
 
+/*
+ * Callback when invalidating memory that is potentially shared.
+ *
+ * Must be called with the filemap (inode->i_mapping) invalidate_lock held.
+ */
+static void kvm_gmem_offset_range_invalidate_shared(struct inode *inode,
+						    pgoff_t start, pgoff_t end)
+{
+	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
+	pgoff_t i;
+
+	rwsem_assert_held_write_nolockdep(&inode->i_mapping->invalidate_lock);
+
+	for (i = start; i < end; i++) {
+		/*
+		 * If the folio is NONE_SHARED, it indicates that it is
+		 * transitioning to private (GUEST_SHARED). Transition it to
+		 * shared (ALL_SHARED) and remove the callback.
+		 */
+		if (xa_to_value(xa_load(shared_offsets, i)) == KVM_GMEM_NONE_SHARED) {
+			struct folio *folio = filemap_lock_folio(inode->i_mapping, i);
+
+			if (!WARN_ON_ONCE(IS_ERR(folio))) {
+				if (folio_test_guestmem(folio))
+					kvm_gmem_restore_pending_folio(folio, inode);
+
+				folio_unlock(folio);
+				folio_put(folio);
+			}
+		}
+
+		xa_erase(shared_offsets, i);
+	}
+}
+
 /*
  * Marks the range [start, end) as not shared with the host. If the host doesn't
  * have any references to a particular folio, then that folio is marked as
bh=PCszwYmlt7Nl7NqWG5DO7Cuq8LIT0s0e6rIMF1sSTTE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=QkcVU1EQi+R5afcAAiCMZJvlQtF5S+NChbOFEspqkhQQ0Spt059VAPIBbra15z/Pu7yBM150RNGhSNfLRqEQeDguIFXfAAMs9Z11cgDf1RDCPT66pOMBg8RlxJH6d6p/IVSRwMuXXG8LwNXtUcocc4acEaWIyA2/8Qi1vL/BVCE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--tabba.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=ScgpIAv8; arc=none smtp.client-ip=209.85.128.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--tabba.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="ScgpIAv8" Received: by mail-wm1-f73.google.com with SMTP id 5b1f17b1804b1-43cf44b66f7so25660155e9.1 for ; Tue, 18 Mar 2025 09:21:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1742314862; x=1742919662; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Rd6be5I6g0Af011AYVWod/OD1lPj9HqVBk4Grxnf+mY=; b=ScgpIAv8Oqew8fq/v23VIyePgocwc55aI+Ptreyeh69xAFx6SXahaV7WhQBvQt/Zon Hyz9XzwdSlu0Am0KdbJT2JAwxxhOU7eOr8Idm/dNV9LIQsZ17il3QIkcsS1m5gspAcvY bF4ittSUrFlizYthPSK8nGfiCELWW6JWNfEooNTGTUYmdqLs8qQs0pvuSGWXAinlSVUV +0v8BdQllH/aEhDds+HyeBNeyjNI7ZnT0nH3JPV1vNHIy34d1kuXrDLChlsARIAsGJs5 KqiqX+qHEBB4i438jd57j7d4vOtgRG7xshLZpTcfVASnHVfCK27PET1D9CcnIeZii3o8 17VQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1742314862; x=1742919662; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; 
bh=Rd6be5I6g0Af011AYVWod/OD1lPj9HqVBk4Grxnf+mY=; b=IA+ZM4HVDau9rozJLTYcDjxlJbwG3b22yYTJklAhfPvDvDGbkvK+A7f+/3wJEjd9j9 yDzWLc280a/grP/q61XbXcfpXLMhShdPwXyT+uxFPNdzbZ+jwhItdPPn2gzBECrClHak C4fIwYmgR3lxalAJcgPNhVLSF86tox2JoIuj/xTGbwkxTCS6O3GRBg5d54IX0bVMjtkF yNDqw/q3C29O0sX68fNqcMWksldIoam4viKVSv5EmeXEvgrF3ojn91DaSlrEfw9ADah7 Lf8CPQPAnr0du04ex+1ZXaqXgFz7ZnQ/L8YOltkpPyLCGe32t0HY+hTxp07TVVvuqLAk b2Nw== X-Forwarded-Encrypted: i=1; AJvYcCU1bzxfJKIaIwBgwY1RLD38iG2Dm1qSwodgChaMe4QELyXJ5OMLRh/Q98KNKNqA6mXCGcYIjZLlBDkltpOs@vger.kernel.org X-Gm-Message-State: AOJu0YyixWruynHhlXAVBh8ygCPR/AZx1L07QbqoWaC6Nf/nUfYw+Cgy CZ27zocuR9EnsJZM9m1Xzdvl4MwWTfpMsbsPtj01jwZ1shAf1+3i7aD5AjQdngkBZhnOZ2Q/8Q= = X-Google-Smtp-Source: AGHT+IFTQwZzOpfOU0EGgsy+KvdlZSjBRDdavIz7oAkVvTl07k3n0vOwWg2N3mGBwzwh7UY/m+wnPF3ykA== X-Received: from wmby26.prod.google.com ([2002:a05:600c:c05a:b0:43d:c77:3fd8]) (user=tabba job=prod-delivery.src-stubby-dispatcher) by 2002:a05:600c:19d2:b0:43d:5ec:b2f4 with SMTP id 5b1f17b1804b1-43d3b98d127mr28080345e9.10.1742314861661; Tue, 18 Mar 2025 09:21:01 -0700 (PDT) Date: Tue, 18 Mar 2025 16:20:46 +0000 In-Reply-To: <20250318162046.4016367-1-tabba@google.com> Precedence: bulk X-Mailing-List: linux-arm-msm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250318162046.4016367-1-tabba@google.com> X-Mailer: git-send-email 2.49.0.rc1.451.g8f38331e32-goog Message-ID: <20250318162046.4016367-8-tabba@google.com> Subject: [PATCH v6 7/7] KVM: guest_memfd: Add a guest_memfd() flag to initialize it as shared From: Fuad Tabba To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk, brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org, xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com, 
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
 vannapurve@google.com, ackerleytng@google.com, mail@maciej.szmigiero.name,
 david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
 liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
 james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
 maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com, tabba@google.com

Not all use cases require guest_memfd() to be shared with the host when
first created. Add a new flag, GUEST_MEMFD_FLAG_INIT_SHARED, which, when
set on KVM_CREATE_GUEST_MEMFD, initializes the memory as shared with the
host, and therefore mappable by it. Otherwise, memory is private until
explicitly shared by the guest with the host.

Signed-off-by: Fuad Tabba
---
 Documentation/virt/kvm/api.rst                 |  4 ++++
 include/uapi/linux/kvm.h                       |  1 +
 tools/testing/selftests/kvm/guest_memfd_test.c |  7 +++++--
 virt/kvm/guest_memfd.c                         | 12 ++++++++++++
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 2b52eb77e29c..a5496d7d323b 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6386,6 +6386,10 @@ most one mapping per page, i.e. binding multiple memory regions to a single
 guest_memfd range is not allowed (any number of memory regions can be bound to
 a single guest_memfd file, but the bound ranges must not overlap).
 
+If the capability KVM_CAP_GMEM_SHARED_MEM is supported, then the flags field
+supports GUEST_MEMFD_FLAG_INIT_SHARED, which initializes the memory as shared
+with the host, and thereby, mappable by it.
+
 See KVM_SET_USER_MEMORY_REGION2 for additional details.
 
 4.143 KVM_PRE_FAULT_MEMORY
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 117937a895da..22d7e33bf09c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1566,6 +1566,7 @@ struct kvm_memory_attributes {
 #define KVM_MEMORY_ATTRIBUTE_PRIVATE	(1ULL << 3)
 
 #define KVM_CREATE_GUEST_MEMFD	_IOWR(KVMIO,  0xd4, struct kvm_create_guest_memfd)
+#define GUEST_MEMFD_FLAG_INIT_SHARED	(1UL << 0)
 
 struct kvm_create_guest_memfd {
 	__u64 size;
diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index 38c501e49e0e..4a7fcd6aa372 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -159,7 +159,7 @@ static void test_invalid_punch_hole(int fd, size_t page_size, size_t total_size)
 static void test_create_guest_memfd_invalid(struct kvm_vm *vm)
 {
 	size_t page_size = getpagesize();
-	uint64_t flag;
+	uint64_t flag = BIT(0);
 	size_t size;
 	int fd;
 
@@ -170,7 +170,10 @@ static void test_create_guest_memfd_invalid(struct kvm_vm *vm)
 			    size);
 	}
 
-	for (flag = BIT(0); flag; flag <<= 1) {
+	if (kvm_has_cap(KVM_CAP_GMEM_SHARED_MEM))
+		flag = GUEST_MEMFD_FLAG_INIT_SHARED << 1;
+
+	for (; flag; flag <<= 1) {
 		fd = __vm_create_guest_memfd(vm, page_size, flag);
 		TEST_ASSERT(fd == -1 && errno == EINVAL,
 			    "guest_memfd() with flag '0x%lx' should fail with EINVAL",
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 0487a08615f0..d7313e11c2cb 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1045,6 +1045,15 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_gmem;
 	}
 
+	if (IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM) &&
+	    (flags & GUEST_MEMFD_FLAG_INIT_SHARED)) {
+		err = kvm_gmem_offset_range_set_shared(file_inode(file), 0, size >> PAGE_SHIFT);
+		if (err) {
+			fput(file);
+			goto err_gmem;
+		}
+	}
+
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
@@ -1066,6 +1075,9 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 	u64 flags = args->flags;
 	u64 valid_flags = 0;
 
+	if (IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM))
+		valid_flags |= GUEST_MEMFD_FLAG_INIT_SHARED;
+
 	if (flags & ~valid_flags)
 		return -EINVAL;