efi/x86-mixed: leave RET unmitigated but move it into .rodata

Message ID 20220722160612.2976-1-ardb@kernel.org
State New

Commit Message

Ard Biesheuvel July 22, 2022, 4:06 p.m. UTC
Thadeu reports that the retbleed mitigations have broken EFI runtime
services in mixed mode, as the RET macro now expands to a relative
branch that jumps to nowhere when executed from the 1:1 mapping of the
kernel text that the EFI mixed mode thunk uses on its return back to the
caller.

So as Thadeu suggested in [1], we should switch to a bare 'ret' opcode
followed by 'int3' (to limit straight line speculation). However, doing
so leaves an unmitigated RET in the kernel text that is always present,
even on non-EFI or non-mixed mode systems (which are quite rare these
days to begin with).

So let's take Thadeu's fix a bit further, and move the EFI mixed mode
return trampoline that contains the RET into .rodata, so it is normally
mapped without executable permissions. And given that this snippet of
code is really the only kernel code that we ever execute via this 1:1
mapping, let's make the 1:1 mapping of the kernel .text non-executable
as well, and only map the page that covers the return trampoline with
executable permissions.

Note that mapping .text and .rodata is still necessary, as otherwise,
they will be covered by the default 1:1 mapping of the RAM below 4 GB,
which uses read-write permissions. Also note that merging the mappings
of .text and .rodata is not possible, even if they now use the same
permissions, due to the fact that the hole in the middle may contain
read-write data (such as the mixed mode stack).

[1] https://lore.kernel.org/linux-efi/20220715194550.793957-1-cascardo@canonical.com/

Cc: tglx@linutronix.de
Cc: torvalds@linux-foundation.org
Cc: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/platform/efi/efi_64.c       | 15 ++++++++++++---
 arch/x86/platform/efi/efi_thunk_64.S |  9 +++++++--
 2 files changed, 19 insertions(+), 5 deletions(-)

Comments

Ard Biesheuvel July 24, 2022, 8:39 a.m. UTC | #1
On Sat, 23 Jul 2022 at 19:20, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, Jul 22, 2022 at 9:06 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > Thadeu reports that the retbleed mitigations have broken EFI runtime
> > services in mixed mode, as the RET macro now expands to a relative
> > branch that jumps to nowhere when executed from the 1:1 mapping of the
> > kernel text that the EFI mixed mode thunk uses on its return back to the
> > caller.
> >
> > So as Thadeu suggested in [1], we should switch to a bare 'ret' opcode
> > followed by 'int3' (to limit straight line speculation). However, doing
> > so leaves an unmitigated RET in the kernel text that is always present,
> > even on non-EFI or non-mixed mode systems (which are quite rare these
> > days to begin with)
> >
> > So let's take Thadeu's fix a bit further [..]
>
> Note that Thadeu's patch already made it into my kernel as commit
> 51a6fa0732d6 ("efi/x86: use naked RET on mixed mode call wrapper"), so
> that "take the fix further" should probably be done incrementally.
>
> I'm going to ignore this for 5.19, because I'm not sure how big of a
> problem that "unmitigated ret" is. Honestly, it's probably easy enough
> to find byte 0xc3 as part of other instructions and constants in the
> kernel data section anyway, so I wouldn't worry too much about "hey,
> we have a 'ret' instruction here that people could mis-use".
>

Fair enough. I still think it is better for general hygiene to apply
these changes, but if there is no urgency, I'll leave this for now and
revisit+rebase somewhere during the next cycle.
Borislav Petkov July 24, 2022, 5:27 p.m. UTC | #2
On Sun, Jul 24, 2022 at 10:39:45AM +0200, Ard Biesheuvel wrote:
> Fair enough. I still think it is better for general hygiene to apply
> these changes, but if there is no urgency, I'll leave this for now and
> revisit+rebase somewhere during the next cycle.

Best it would be if you do that early in the cycle so that it gets
maximum testing in linux-next.

Oh, and my compiler doesn't like it for whatever reason:

/tmp/ccLB2vIC.s: Assembler messages:
/tmp/ccLB2vIC.s: Error: invalid operands (.rodata and .text sections) for `-' when setting `.L__sym_size___efi64_thunk'
make[3]: *** [scripts/Makefile.build:322: arch/x86/platform/efi/efi_thunk_64.o] Error 1
make[3]: *** Waiting for unfinished jobs....

But otherwise, I like the direction where this is going, of us not
mapping as much into the EFI PGT. But I've said that already...

Thx.
Ard Biesheuvel July 24, 2022, 6:34 p.m. UTC | #3
On Sun, 24 Jul 2022 at 19:28, Borislav Petkov <bp@suse.de> wrote:
>
> On Sun, Jul 24, 2022 at 10:39:45AM +0200, Ard Biesheuvel wrote:
> > Fair enough. I still think it is better for general hygiene to apply
> > these changes, but if there is no urgency, I'll leave this for now and
> > revisit+rebase somewhere during the next cycle.
>
> Best it would be if you do that early in the cycle so that it gets
> maximum testing in linux-next.
>

Sure

> Oh, and my compiler doesn't like it for whatever reason:
>
> /tmp/ccLB2vIC.s: Assembler messages:
> /tmp/ccLB2vIC.s: Error: invalid operands (.rodata and .text sections) for `-' when setting `.L__sym_size___efi64_thunk'
> make[3]: *** [scripts/Makefile.build:322: arch/x86/platform/efi/efi_thunk_64.o] Error 1
> make[3]: *** Waiting for unfinished jobs....
>

Are you sure you fixed up the conflict correctly? It seems the
__efi64_thunk end marker ends up in .rodata in your case.


> But otherwise, I like the direction where this is going, of us not
> mapping as much into the EFI PGT. But I've said that already...
>

Yes. I'll update the patch to unmap text and rodata entirely, and only
leave the trampoline mapped.
Borislav Petkov July 24, 2022, 7:17 p.m. UTC | #4
On Sun, Jul 24, 2022 at 08:34:36PM +0200, Ard Biesheuvel wrote:
> Are you sure you fixed up the conflict correctly? It seems the
> __efi64_thunk end marker ends up in .rodata in your case.

Yep, I f*cked up the merge even if it was pretty easy in meld - sorry
about that.

Now it is correct but it complains differently:

vmlinux.o: warning: objtool: efi_thunk_query_variable_info_nonblocking+0x1ba: unreachable instruction

$ ./scripts/faddr2line vmlinux.o efi_thunk_query_variable_info_nonblocking+0x1ba
efi_thunk_query_variable_info_nonblocking+0x1ba/0x330:
efi_thunk_query_variable_info_nonblocking at /home/boris/kernel/linux/arch/x86/platform/efi/efi_64.c:787
(inlined by) efi_thunk_query_variable_info_nonblocking at /home/boris/kernel/linux/arch/x86/platform/efi/efi_64.c:769

and looking at the asm, it points to:

# 0 "" 2
#NO_APP
	movq	efi(%rip), %rax	# efi.runtime, efi.runtime
	movl	12(%rsp), %r8d	# %sfp, prephitmp_87
	leaq	16(%rsp), %r9	#,
	movl	%r15d, %ecx	# _104,
	movl	%r14d, %edx	# _95,
	movl	%ebp, %esi	# attr,
	movl	76(%rax), %edi	# _30->mixed_mode.query_variable_info, _30->mixed_mode.query_variable_info
	call	__efi64_thunk	#
#APP
# 787 "arch/x86/platform/efi/efi_64.c" 1

1:	movl %r12d,%ds			# __val		<---

this here, after the __efi64_thunk call, which is that segment restoring
after the __efi_thunk call:

	loadsegment(ds, __ds);

Weird, I don't see why though - that should be reachable.

Patch

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 1f3675453a57..d8661fb31c76 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -176,7 +176,8 @@  virt_to_phys_or_null_size(void *va, unsigned long size)
 
 int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 {
-	unsigned long pfn, text, pf, rodata;
+	extern const u8 __efi64_thunk_ret_tramp[];
+	unsigned long pfn, text, pf, rodata, tramp;
 	struct page *page;
 	unsigned npages;
 	pgd_t *pgd = efi_mm.pgd;
@@ -240,7 +241,7 @@  int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	text = __pa(_text);
 	pfn = text >> PAGE_SHIFT;
 
-	pf = _PAGE_ENC;
+	pf = _PAGE_NX | _PAGE_ENC;
 	if (kernel_map_pages_in_pgd(pgd, pfn, text, npages, pf)) {
 		pr_err("Failed to map kernel text 1:1\n");
 		return 1;
@@ -250,12 +251,20 @@  int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
 	rodata = __pa(__start_rodata);
 	pfn = rodata >> PAGE_SHIFT;
 
-	pf = _PAGE_NX | _PAGE_ENC;
 	if (kernel_map_pages_in_pgd(pgd, pfn, rodata, npages, pf)) {
 		pr_err("Failed to map kernel rodata 1:1\n");
 		return 1;
 	}
 
+	tramp = __pa(__efi64_thunk_ret_tramp);
+	pfn = tramp >> PAGE_SHIFT;
+
+	pf = _PAGE_ENC;
+	if (kernel_map_pages_in_pgd(pgd, pfn, tramp, 1, pf)) {
+		pr_err("Failed to map mixed mode return trampoline\n");
+		return 1;
+	}
+
 	return 0;
 }
 
diff --git a/arch/x86/platform/efi/efi_thunk_64.S b/arch/x86/platform/efi/efi_thunk_64.S
index 9ffe2bad27d5..e436ce03741e 100644
--- a/arch/x86/platform/efi/efi_thunk_64.S
+++ b/arch/x86/platform/efi/efi_thunk_64.S
@@ -71,17 +71,22 @@  STACK_FRAME_NON_STANDARD __efi64_thunk
 	pushq	$__KERNEL32_CS
 	pushq	%rdi			/* EFI runtime service address */
 	lretq
+SYM_FUNC_END(__efi64_thunk)
 
+	.section ".rodata", "a", @progbits
+	.balign	16
+SYM_DATA_START(__efi64_thunk_ret_tramp)
 1:	movq	0x20(%rsp), %rsp
 	pop	%rbx
 	pop	%rbp
-	RET
+	ret
+	int3
 
 	.code32
 2:	pushl	$__KERNEL_CS
 	pushl	%ebp
 	lret
-SYM_FUNC_END(__efi64_thunk)
+SYM_DATA_END(__efi64_thunk_ret_tramp)
 
 	.bss
 	.balign 8