Message ID | 20220722160612.2976-1-ardb@kernel.org |
---|---|
State | New |
Headers | show |
Series | efi/x86-mixed: leave RET unmitigated but move it into .rodata | expand |
On Sat, 23 Jul 2022 at 19:20, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Fri, Jul 22, 2022 at 9:06 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > Thadeu reports that the retbleed mitigations have broken EFI runtime > > services in mixed mode, as the RET macro now expands to a relative > > branch that jumps to nowhere when executed from the 1:1 mapping of the > > kernel text that the EFI mixed mode thunk uses on its return back to the > > caller. > > > > So as Thadeu suggested in [1], we should switch to a bare 'ret' opcode > > followed by 'int3' (to limit straight line speculation). However, doing > > so leaves an unmitigated RET in the kernel text that is always present, > > even on non-EFI or non-mixed mode systems (which are quite rare these > > days to begin with) > > > > So let's take Thadeu's fix a bit further [..] > > Note that Thadeu's patch already made it into my kernel as commit > 51a6fa0732d6 ("efi/x86: use naked RET on mixed mode call wrapper"), so > that "take the fix further" should probably be done incrementally. > > I'm going to ignore this for 5.19, because I'm not sure how big of a > problem that "unmitigated ret" is. Honestly, it's probably easy enough > to find byte 0xc3 as part of other instructions and constants in the > kernel data section anyway, so I wouldn't worry too much about "hey, > we have a 'ret' instruction here that people could mis-use". > Fair enough. I still think it is better for general hygiene to apply these changes, but if there is no urgency, I'll leave this for now and revisit+rebase somewhere during the next cycle.
On Sun, Jul 24, 2022 at 10:39:45AM +0200, Ard Biesheuvel wrote: > Fair enough. I still think it is better for general hygiene to apply > these changes, but if there is no urgency, I'll leave this for now and > revisit+rebase somewhere during the next cycle. Best it would be if you do that early in the cycle so that it gets maximum testing in linux-next. Oh, and my compiler doesn't like it for whatever reason: /tmp/ccLB2vIC.s: Assembler messages: /tmp/ccLB2vIC.s: Error: invalid operands (.rodata and .text sections) for `-' when setting `.L__sym_size___efi64_thunk' make[3]: *** [scripts/Makefile.build:322: arch/x86/platform/efi/efi_thunk_64.o] Error 1 make[3]: *** Waiting for unfinished jobs.... But otherwise, I like the direction where this is going, of us not mapping as much into the EFI PGT. But I've said that already... Thx.
On Sun, 24 Jul 2022 at 19:28, Borislav Petkov <bp@suse.de> wrote: > > On Sun, Jul 24, 2022 at 10:39:45AM +0200, Ard Biesheuvel wrote: > > Fair enough. I still think it is better for general hygiene to apply > > these changes, but if there is no urgency, I'll leave this for now and > > revisit+rebase somewhere during the next cycle. > > Best it would be if you do that early in the cycle so that it gets > maximum testing in linux-next. > Sure > Oh, and my compiler doesn't like it for whatever reason: > > /tmp/ccLB2vIC.s: Assembler messages: > /tmp/ccLB2vIC.s: Error: invalid operands (.rodata and .text sections) for `-' when setting `.L__sym_size___efi64_thunk' > make[3]: *** [scripts/Makefile.build:322: arch/x86/platform/efi/efi_thunk_64.o] Error 1 > make[3]: *** Waiting for unfinished jobs.... > Are you sure you fixed up the conflict correctly? It seems the __efi64_thunk end marker ends up in .rodata in your case. > But otherwise, I like the direction where this is going, of us not > mapping as much into the EFI PGT. But I've said that already... > Yes. I'll update the patch to unmap text and rodata entirely, and only leave the trampoline mapped.
On Sun, Jul 24, 2022 at 08:34:36PM +0200, Ard Biesheuvel wrote: > Are you sure you fixed up the conflict correctly? It seems the > __efi64_thunk end marker ends up in .rodata in your case. Yep, I f*cked up the merge even if it was pretty easy in meld - sorry about that. Now it is correct but it complains differently: vmlinux.o: warning: objtool: efi_thunk_query_variable_info_nonblocking+0x1ba: unreachable instruction $ ./scripts/faddr2line vmlinux.o efi_thunk_query_variable_info_nonblocking+0x1ba efi_thunk_query_variable_info_nonblocking+0x1ba/0x330: efi_thunk_query_variable_info_nonblocking at /home/boris/kernel/linux/arch/x86/platform/efi/efi_64.c:787 (inlined by) efi_thunk_query_variable_info_nonblocking at /home/boris/kernel/linux/arch/x86/platform/efi/efi_64.c:769 and looking at the asm, it points to: # 0 "" 2 #NO_APP movq efi(%rip), %rax # efi.runtime, efi.runtime movl 12(%rsp), %r8d # %sfp, prephitmp_87 leaq 16(%rsp), %r9 #, movl %r15d, %ecx # _104, movl %r14d, %edx # _95, movl %ebp, %esi # attr, movl 76(%rax), %edi # _30->mixed_mode.query_variable_info, _30->mixed_mode.query_variable_info call __efi64_thunk # #APP # 787 "arch/x86/platform/efi/efi_64.c" 1 1: movl %r12d,%ds # __val <--- this here, after the __efi64_thunk call, which is that segment restoring after the __efi_thunk call: loadsegment(ds, __ds); Weird, I don't see why though - that should be reachable.
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c index 1f3675453a57..d8661fb31c76 100644 --- a/arch/x86/platform/efi/efi_64.c +++ b/arch/x86/platform/efi/efi_64.c @@ -176,7 +176,8 @@ virt_to_phys_or_null_size(void *va, unsigned long size) int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) { - unsigned long pfn, text, pf, rodata; + extern const u8 __efi64_thunk_ret_tramp[]; + unsigned long pfn, text, pf, rodata, tramp; struct page *page; unsigned npages; pgd_t *pgd = efi_mm.pgd; @@ -240,7 +241,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) text = __pa(_text); pfn = text >> PAGE_SHIFT; - pf = _PAGE_ENC; + pf = _PAGE_NX | _PAGE_ENC; if (kernel_map_pages_in_pgd(pgd, pfn, text, npages, pf)) { pr_err("Failed to map kernel text 1:1\n"); return 1; @@ -250,12 +251,20 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages) rodata = __pa(__start_rodata); pfn = rodata >> PAGE_SHIFT; - pf = _PAGE_NX | _PAGE_ENC; if (kernel_map_pages_in_pgd(pgd, pfn, rodata, npages, pf)) { pr_err("Failed to map kernel rodata 1:1\n"); return 1; } + tramp = __pa(__efi64_thunk_ret_tramp); + pfn = tramp >> PAGE_SHIFT; + + pf = _PAGE_ENC; + if (kernel_map_pages_in_pgd(pgd, pfn, tramp, 1, pf)) { + pr_err("Failed to map kernel rodata 1:1\n"); + return 1; + } + return 0; } diff --git a/arch/x86/platform/efi/efi_thunk_64.S b/arch/x86/platform/efi/efi_thunk_64.S index 9ffe2bad27d5..e436ce03741e 100644 --- a/arch/x86/platform/efi/efi_thunk_64.S +++ b/arch/x86/platform/efi/efi_thunk_64.S @@ -71,17 +71,22 @@ STACK_FRAME_NON_STANDARD __efi64_thunk pushq $__KERNEL32_CS pushq %rdi /* EFI runtime service address */ lretq +SYM_FUNC_END(__efi64_thunk) + .section ".rodata", "a", @progbits + .balign 16 +SYM_DATA_START(__efi64_thunk_ret_tramp) 1: movq 0x20(%rsp), %rsp pop %rbx pop %rbp - RET + ret + int3 .code32 2: pushl $__KERNEL_CS pushl %ebp lret -SYM_FUNC_END(__efi64_thunk) +SYM_DATA_END(__efi64_thunk_ret_tramp) .bss .balign 8
Thadeu reports that the retbleed mitigations have broken EFI runtime services in mixed mode, as the RET macro now expands to a relative branch that jumps to nowhere when executed from the 1:1 mapping of the kernel text that the EFI mixed mode thunk uses on its return back to the caller. So as Thadeu suggested in [1], we should switch to a bare 'ret' opcode followed by 'int3' (to limit straight line speculation). However, doing so leaves an unmitigated RET in the kernel text that is always present, even on non-EFI or non-mixed mode systems (which are quite rare these days to begin with) So let's take Thadeu's fix a bit further, and move the EFI mixed mode return trampoline that contains the RET into .rodata, so it is normally mapped without executable permissions. And given that this snippet of code is really the only kernel code that we ever execute via this 1:1 mapping, let's make the 1:1 mapping of the kernel .text non-executable as well, and only map the page that covers the return trampoline with executable permissions. Note that mapping .text and .rodata is still necessary, as otherwise, they will be covered by the default 1:1 mapping of the RAM below 4 GB, which uses read-write permissions. Also note that merging the mappings of .text and .rodata is not possible, even if they now use the same permissions, due to the fact that the hole in the middle may contain read-write data (such as the mixed mode stack) [1] https://lore.kernel.org/linux-efi/20220715194550.793957-1-cascardo@canonical.com/ Cc: tglx@linutronix.de Cc: torvalds@linux-foundation.org Cc: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> --- arch/x86/platform/efi/efi_64.c | 15 ++++++++++++--- arch/x86/platform/efi/efi_thunk_64.S | 9 +++++++-- 2 files changed, 19 insertions(+), 5 deletions(-)