Message ID | 20231011192528.262425-1-nik.borisov@suse.com |
---|---|
State | Accepted |
Commit | ff07186b4d774ac22a5345d30763045af4569416 |
Headers | show |
Series | x86/efistub: Don't try to print after ExitBootService() | expand |
On Wed, Oct 11, 2023 at 10:25:28PM +0300, Nikolay Borisov wrote: > setup_e820() is executed after UEFI's ExitBootService has been called. > This causes the firmware to throw an exception because Console IO > protocol handler is supposed to work only during boot service > environment. As per UEFI 2.9, section 12.1: > > "This protocol isused to handle input and output of text-based > information intended for the system user during the operation of code > in the boot services environment." > > Running a TDX guest with TDVF with unaccepted memory disabled results in > the following output: Oh. My bad. But there's other codepath that does the same. If setup_e820() fails with EFI_BUFFER_TOO_SMALL, efi_stub_entry() would try to print "exit_boot() failed\n". I wouldner if it is feasible to hook up earlyprintk console into efi_printk() machinery for after ExitBootService() case? Silent boot failure is not the best UX.
On 12.10.23 г. 13:14 ч., kirill.shutemov@linux.intel.com wrote: > On Wed, Oct 11, 2023 at 10:25:28PM +0300, Nikolay Borisov wrote: >> setup_e820() is executed after UEFI's ExitBootService has been called. >> This causes the firmware to throw an exception because Console IO >> protocol handler is supposed to work only during boot service >> environment. As per UEFI 2.9, section 12.1: >> >> "This protocol isused to handle input and output of text-based >> information intended for the system user during the operation of code >> in the boot services environment." >> >> Running a TDX guest with TDVF with unaccepted memory disabled results in >> the following output: > > Oh. My bad. > > But there's other codepath that does the same. If setup_e820() fails with > EFI_BUFFER_TOO_SMALL, efi_stub_entry() would try to print "exit_boot() > failed\n". > > I wouldner if it is feasible to hook up earlyprintk console into > efi_printk() machinery for after ExitBootService() case? Silent boot > failure is not the best UX. > In my testing I was able to transpose setup_e820 and efi exit_boot_service by calling exit_boot_func before setup_e820 which ensures the various memory variables are populated. Is there any specific reason why ExitBootServices is called before setting up the e820 table? AFAIU this is an arbitrary choice?
On Thu, Oct 12, 2023 at 01:51:13PM +0300, Nikolay Borisov wrote: > > > On 12.10.23 г. 13:14 ч., kirill.shutemov@linux.intel.com wrote: > > On Wed, Oct 11, 2023 at 10:25:28PM +0300, Nikolay Borisov wrote: > > > setup_e820() is executed after UEFI's ExitBootService has been called. > > > This causes the firmware to throw an exception because Console IO > > > protocol handler is supposed to work only during boot service > > > environment. As per UEFI 2.9, section 12.1: > > > > > > "This protocol isused to handle input and output of text-based > > > information intended for the system user during the operation of code > > > in the boot services environment." > > > > > > Running a TDX guest with TDVF with unaccepted memory disabled results in > > > the following output: > > > > Oh. My bad. > > > > But there's other codepath that does the same. If setup_e820() fails with > > EFI_BUFFER_TOO_SMALL, efi_stub_entry() would try to print "exit_boot() > > failed\n". > > > > I wouldner if it is feasible to hook up earlyprintk console into > > efi_printk() machinery for after ExitBootService() case? Silent boot > > failure is not the best UX. > > > > > In my testing I was able to transpose setup_e820 and efi exit_boot_service > by calling exit_boot_func before setup_e820 which ensures the various memory > variables are populated. Is there any specific reason why ExitBootServices > is called before setting up the e820 table? AFAIU this is an arbitrary > choice? Because if you allocate memory with EFI service it can alter EFI memory map and we need the last version to convert it to e820.
On Thu, 12 Oct 2023 at 13:25, <kirill.shutemov@linux.intel.com> wrote: > > On Thu, Oct 12, 2023 at 01:51:13PM +0300, Nikolay Borisov wrote: > > > > > > On 12.10.23 г. 13:14 ч., kirill.shutemov@linux.intel.com wrote: > > > On Wed, Oct 11, 2023 at 10:25:28PM +0300, Nikolay Borisov wrote: > > > > setup_e820() is executed after UEFI's ExitBootService has been called. > > > > This causes the firmware to throw an exception because Console IO > > > > protocol handler is supposed to work only during boot service > > > > environment. As per UEFI 2.9, section 12.1: > > > > > > > > "This protocol isused to handle input and output of text-based > > > > information intended for the system user during the operation of code > > > > in the boot services environment." > > > > > > > > Running a TDX guest with TDVF with unaccepted memory disabled results in > > > > the following output: > > > > > > Oh. My bad. > > > > > > But there's other codepath that does the same. If setup_e820() fails with > > > EFI_BUFFER_TOO_SMALL, efi_stub_entry() would try to print "exit_boot() > > > failed\n". > > > > > > I wouldner if it is feasible to hook up earlyprintk console into > > > efi_printk() machinery for after ExitBootService() case? Silent boot > > > failure is not the best UX. > > > > > > > > > In my testing I was able to transpose setup_e820 and efi exit_boot_service > > by calling exit_boot_func before setup_e820 which ensures the various memory > > variables are populated. Is there any specific reason why ExitBootServices > > is called before setting up the e820 table? AFAIU this is an arbitrary > > choice? > > Because if you allocate memory with EFI service it can alter EFI memory > map and we need the last version to convert it to e820. > Indeed, and note that the memory map may change due to asynchronous events, which only get shut down when ExitBootServices() is called. This is the reason for this complicated dance around ExitBootServices() with the callback etc
On 12.10.23 г. 13:14 ч., kirill.shutemov@linux.intel.com wrote: > On Wed, Oct 11, 2023 at 10:25:28PM +0300, Nikolay Borisov wrote: >> setup_e820() is executed after UEFI's ExitBootService has been called. >> This causes the firmware to throw an exception because Console IO >> protocol handler is supposed to work only during boot service >> environment. As per UEFI 2.9, section 12.1: >> >> "This protocol isused to handle input and output of text-based >> information intended for the system user during the operation of code >> in the boot services environment." >> >> Running a TDX guest with TDVF with unaccepted memory disabled results in >> the following output: > > Oh. My bad. > > But there's other codepath that does the same. If setup_e820() fails with > EFI_BUFFER_TOO_SMALL, efi_stub_entry() would try to print "exit_boot() > failed\n". > > I wouldner if it is feasible to hook up earlyprintk console into > efi_printk() machinery for after ExitBootService() case? Silent boot > failure is not the best UX. > So looking at the code the only thing which would prevent refactoring to exit logic to directly call exit_boot_func etc and setup_e820 before calling efi_exit_boot_services is if the memory map changes. The current code ensures that we really have the latest memory map version and so setup_e820 is called with the latest version. Ard, how likely it is that the memory map can indeed change between the calls to getmemorymap and exitbootservice?
(cc Matthew) On Thu, 12 Oct 2023 at 13:28, Nikolay Borisov <nik.borisov@suse.com> wrote: > > > > On 12.10.23 г. 13:14 ч., kirill.shutemov@linux.intel.com wrote: > > On Wed, Oct 11, 2023 at 10:25:28PM +0300, Nikolay Borisov wrote: > >> setup_e820() is executed after UEFI's ExitBootService has been called. > >> This causes the firmware to throw an exception because Console IO > >> protocol handler is supposed to work only during boot service > >> environment. As per UEFI 2.9, section 12.1: > >> > >> "This protocol isused to handle input and output of text-based > >> information intended for the system user during the operation of code > >> in the boot services environment." > >> > >> Running a TDX guest with TDVF with unaccepted memory disabled results in > >> the following output: > > > > Oh. My bad. > > > > But there's other codepath that does the same. If setup_e820() fails with > > EFI_BUFFER_TOO_SMALL, efi_stub_entry() would try to print "exit_boot() > > failed\n". > > > > I wouldner if it is feasible to hook up earlyprintk console into > > efi_printk() machinery for after ExitBootService() case? Silent boot > > failure is not the best UX. > > > > So looking at the code the only thing which would prevent refactoring to > exit logic to directly call exit_boot_func etc and setup_e820 before > calling efi_exit_boot_services is if the memory map changes. The current > code ensures that we really have the latest memory map version and so > setup_e820 is called with the latest version. > > Ard, how likely it is that the memory map can indeed change between the > calls to getmemorymap and exitbootservice? Very likely. Matthew mentioned this to me not too long ago, i.e., that on real-world platforms, ExitBootServices() typically fails the first time because of this, and only succeeds the second time (note that the first call disables async event delivery so the second call is guaranteed to succeed unless the caller modifies the memory map themselves)
On Thu, 12 Oct 2023 at 12:15, <kirill.shutemov@linux.intel.com> wrote: > > On Wed, Oct 11, 2023 at 10:25:28PM +0300, Nikolay Borisov wrote: > > setup_e820() is executed after UEFI's ExitBootService has been called. > > This causes the firmware to throw an exception because Console IO > > protocol handler is supposed to work only during boot service > > environment. As per UEFI 2.9, section 12.1: > > > > "This protocol isused to handle input and output of text-based > > information intended for the system user during the operation of code > > in the boot services environment." > > Thanks. I've queued this up as a fix. > > Running a TDX guest with TDVF with unaccepted memory disabled results in > > the following output: > > Oh. My bad. > > But there's other codepath that does the same. If setup_e820() fails with > EFI_BUFFER_TOO_SMALL, efi_stub_entry() would try to print "exit_boot() > failed\n". > > I wouldner if it is feasible to hook up earlyprintk console into > efi_printk() machinery for after ExitBootService() case? Silent boot > failure is not the best UX. > I don't disagree with that in principle, but wiring this up seems non-trivial, and will be x86-only. The EFI output is not recorded in the kernel log, and this particular issue is something we can warn about later on, when it is much more likely that someone will notice. So if we want to keep this functionality, I'd prefer it if we could add something to the generic EFI memmap code that warns_once about unaccepted memory being present and CONFIG_UNACCEPTED_MEMORY being disabled.
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index 2fee52ed335d..3b8bccd7c216 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -605,11 +605,8 @@ setup_e820(struct boot_params *params, struct setup_data *e820ext, u32 e820ext_s break; case EFI_UNACCEPTED_MEMORY: - if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY)) { - efi_warn_once( -"The system has unaccepted memory, but kernel does not support it\nConsider enabling CONFIG_UNACCEPTED_MEMORY\n"); + if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY)) continue; - } e820_type = E820_TYPE_RAM; process_unaccepted_memory(d->phys_addr, d->phys_addr + PAGE_SIZE * d->num_pages);
setup_e820() is executed after UEFI's ExitBootService has been called. This causes the firmware to throw an exception because Console IO protocol handler is supposed to work only during boot service environment. As per UEFI 2.9, section 12.1: "This protocol isused to handle input and output of text-based information intended for the system user during the operation of code in the boot services environment." Running a TDX guest with TDVF with unaccepted memory disabled results in the following output: !!!! X64 Exception Type - 06(#UD - Invalid Opcode) CPU Apic ID - 00000000 !!!! RIP - 0000000000603D51, CS - 0000000000000038, RFLAGS - 0000000000010046 RAX - 0000000000000000, RCX - 0000000000000000, RDX - 000000007EC27530 RBX - 0000000001C227A1, RSP - 000000007EC274D8, RBP - 000000007EC27530 RSI - 000000000000000A, RDI - 000000007EC27530 R8 - 00000000AC1C4720, R9 - 000000007D2C5F18, R10 - 0000000000400000 R11 - 0000000000000000, R12 - 0000000001C22B0E, R13 - 0000000000000000 R14 - 0000000000000032, R15 - 000000007C6022D0 DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030 GS - 0000000000000030, SS - 0000000000000030 CR0 - 0000000080010031, CR2 - 0000000000000000, CR3 - 000000007EA01000 CR4 - 0000000000000268, CR8 - 0000000000000000 DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000 DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400 GDTR - 000000007E7E6000 0000000000000047, LDTR - 0000000000000000 IDTR - 000000007D2BD018 0000000000000FFF, TR - 0000000000000000 FXSAVE_STATE - 000000007EC27130 !!!! Can't find image information. !!!! Signed-off-by: Nikolay Borisov <nik.borisov@suse.com> --- drivers/firmware/efi/libstub/x86-stub.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)