diff mbox series

[tip:,x86/boot] x86/boot/sev: Avoid shared GHCB page for early memory acceptance

Message ID 174448976513.31282.4012948519562214371.tip-bot2@tip-bot2
State New
Headers show
Series [tip:,x86/boot] x86/boot/sev: Avoid shared GHCB page for early memory acceptance | expand

Commit Message

tip-bot2 for Ard Biesheuvel April 12, 2025, 8:29 p.m. UTC
The following commit has been merged into the x86/boot branch of tip:

Commit-ID:     5871c9d650bc0ccb16ed8444f16045808ba936bf
Gitweb:        https://git.kernel.org/tip/5871c9d650bc0ccb16ed8444f16045808ba936bf
Author:        Ard Biesheuvel <ardb@kernel.org>
AuthorDate:    Thu, 10 Apr 2025 15:28:51 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 12 Apr 2025 20:43:41 +02:00

x86/boot/sev: Avoid shared GHCB page for early memory acceptance

Communicating with the hypervisor using the shared GHCB page requires
clearing the C bit in the mapping of that page. When executing in the
context of the EFI boot services, the page tables are owned by the
firmware, and this manipulation is not possible.

So switch to a different API for accepting memory in SEV-SNP guests, one
which is actually supported at the point during boot where the EFI stub
may need to accept memory, but the SEV-SNP init code has not executed
yet.

For simplicity, also switch the memory acceptance carried out by the
decompressor when not booting via EFI - this only involves the
allocation for the decompressed kernel, and is generally only called
after kexec, as normal boot will jump straight into the kernel from the
EFI stub.

Non-EFI stub boot will become slower if the memory that is used to
decompress the kernel has not been accepted yet. But given how heavily
SEV-SNP depends on EFI boot, this typically only happens on kexec, as
that is the only boot path that goes through the traditional
decompressor.

The GHCB shared page method never worked for accepting memory from the
EFI stub, but this is rarely needed in practice: when using the higher
level page allocation APIs, the firmware will make sure that memory is
accepted before it is returned. The only use case for explicit memory
acceptance by the EFI stub is when populating the 'unaccepted memory'
bitmap, which tracks unaccepted memory at a 2MB granularity, and so
chunks of unaccepted memory that are misaligned wrt that are accepted
without being allocated or used.

AFAICS this never worked correctly for SEV-SNP, we're just lucky
the firmware appears to accept memory in 2+ MB batches and so
these misaligned chunks are rare in practice. Tom did manage to
trigger it IIUC by giving a VM an amount of memory that is not a
multiple of 2M.

Co-developed-by: Tom Lendacky <thomas.lendacky@amd.com>

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250410132850.3708703-2-ardb+git@google.com
---
 arch/x86/boot/compressed/sev.c | 67 +++++++--------------------------
 1 file changed, 15 insertions(+), 52 deletions(-)

Comments

Ingo Molnar April 12, 2025, 8:45 p.m. UTC | #1
* Ard Biesheuvel <ardb@kernel.org> wrote:

> On Sat, 12 Apr 2025 at 22:29, tip-bot2 for Ard Biesheuvel
> <tip-bot2@linutronix.de> wrote:
> >
> > The following commit has been merged into the x86/boot branch of tip:
> >
> 
> This may be slightly premature. I took some of Tom's code, hence the
> co-developed-by, but the should really confirm that what I did is
> correct before we queue this up.

OK, I've zapped it again, especially as the rest of the series wasn't 
ready either, please include the latest version of this patch as part 
of the boot/setup/ series, which hard-relies upon it.

Thanks,
	
	Ingo
Ard Biesheuvel April 12, 2025, 8:50 p.m. UTC | #2
On Sat, 12 Apr 2025 at 22:45, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Ard Biesheuvel <ardb@kernel.org> wrote:
>
> > On Sat, 12 Apr 2025 at 22:29, tip-bot2 for Ard Biesheuvel
> > <tip-bot2@linutronix.de> wrote:
> > >
> > > The following commit has been merged into the x86/boot branch of tip:
> > >
> >
> > This may be slightly premature. I took some of Tom's code, hence the
> > co-developed-by, but the should really confirm that what I did is
> > correct before we queue this up.
>
> OK, I've zapped it again, especially as the rest of the series wasn't
> ready either, please include the latest version of this patch as part
> of the boot/setup/ series, which hard-relies upon it.
>

OK
diff mbox series

Patch

diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 6eadd79..478eca4 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -169,10 +169,7 @@  bool sev_snp_enabled(void)
 
 static void __page_state_change(unsigned long paddr, enum psc_op op)
 {
-	u64 val;
-
-	if (!sev_snp_enabled())
-		return;
+	u64 val, msr;
 
 	/*
 	 * If private -> shared then invalidate the page before requesting the
@@ -181,6 +178,9 @@  static void __page_state_change(unsigned long paddr, enum psc_op op)
 	if (op == SNP_PAGE_STATE_SHARED)
 		pvalidate_4k_page(paddr, paddr, false);
 
+	/* Save the current GHCB MSR value */
+	msr = sev_es_rd_ghcb_msr();
+
 	/* Issue VMGEXIT to change the page state in RMP table. */
 	sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
 	VMGEXIT();
@@ -190,6 +190,9 @@  static void __page_state_change(unsigned long paddr, enum psc_op op)
 	if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
 		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
 
+	/* Restore the GHCB MSR value */
+	sev_es_wr_ghcb_msr(msr);
+
 	/*
 	 * Now that page state is changed in the RMP table, validate it so that it is
 	 * consistent with the RMP entry.
@@ -200,11 +203,17 @@  static void __page_state_change(unsigned long paddr, enum psc_op op)
 
 void snp_set_page_private(unsigned long paddr)
 {
+	if (!sev_snp_enabled())
+		return;
+
 	__page_state_change(paddr, SNP_PAGE_STATE_PRIVATE);
 }
 
 void snp_set_page_shared(unsigned long paddr)
 {
+	if (!sev_snp_enabled())
+		return;
+
 	__page_state_change(paddr, SNP_PAGE_STATE_SHARED);
 }
 
@@ -228,56 +237,10 @@  static bool early_setup_ghcb(void)
 	return true;
 }
 
-static phys_addr_t __snp_accept_memory(struct snp_psc_desc *desc,
-				       phys_addr_t pa, phys_addr_t pa_end)
-{
-	struct psc_hdr *hdr;
-	struct psc_entry *e;
-	unsigned int i;
-
-	hdr = &desc->hdr;
-	memset(hdr, 0, sizeof(*hdr));
-
-	e = desc->entries;
-
-	i = 0;
-	while (pa < pa_end && i < VMGEXIT_PSC_MAX_ENTRY) {
-		hdr->end_entry = i;
-
-		e->gfn = pa >> PAGE_SHIFT;
-		e->operation = SNP_PAGE_STATE_PRIVATE;
-		if (IS_ALIGNED(pa, PMD_SIZE) && (pa_end - pa) >= PMD_SIZE) {
-			e->pagesize = RMP_PG_SIZE_2M;
-			pa += PMD_SIZE;
-		} else {
-			e->pagesize = RMP_PG_SIZE_4K;
-			pa += PAGE_SIZE;
-		}
-
-		e++;
-		i++;
-	}
-
-	if (vmgexit_psc(boot_ghcb, desc))
-		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
-
-	pvalidate_pages(desc);
-
-	return pa;
-}
-
 void snp_accept_memory(phys_addr_t start, phys_addr_t end)
 {
-	struct snp_psc_desc desc = {};
-	unsigned int i;
-	phys_addr_t pa;
-
-	if (!boot_ghcb && !early_setup_ghcb())
-		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
-
-	pa = start;
-	while (pa < end)
-		pa = __snp_accept_memory(&desc, pa, end);
+	for (phys_addr_t pa = start; pa < end; pa += PAGE_SIZE)
+		__page_state_change(pa, SNP_PAGE_STATE_PRIVATE);
 }
 
 void sev_es_shutdown_ghcb(void)