Message ID | 20220918213544.2176249-12-ardb@kernel.org |
---|---|
State | Superseded |
Headers | show |
Series | efi: disentangle the generic EFI stub from FDT | expand |
Hi, Ard, I think the parameters passed to the core kernel need to be discussed. The old way (so-called old world): a0=argc, a1=argv, a1=bpi The current way (so-called new world): a0=efi boot flag, a1=fdt pointer The new way (in this patchset): a0=efi boot flag, a1=systemtable, a2=cmdline I prefer to use the current way, there are 3 reasons: 1, both acpi system and dt system can use the same parameters passing method; 2, arm64, riscv and loongarch can use similar logics (light FDT); 3, out-of-tree patches can make compatibility with the old world easier by just judging whether a2 is zero. Huacai On Mon, Sep 19, 2022 at 5:36 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > LoongArch does not use FDT or DT natively [yet], and the only reason it > currently uses it is so that it can reuse the existing EFI stub code. > > Overloading the DT with data passed between the EFI stub and the core > kernel has been a source of problems: there is the overlap between > information provided by EFI which DT can also provide (initrd base/size, > command line, memory descriptions), requiring us to reason about which > is which and what to prioritize. It has also resulted in ABI leaks, > i.e., internal ABI being promoted to external ABI inadvertently because > the bootloader can set the EFI stub's DT properties as well (e.g., > "kaslr-seed"). This has become especially problematic with boot > environments that want to pretend that EFI boot is being done (to access > ACPI and SMBIOS tables, for instance) but have no ability to execute the > EFI stub, and so the environment that the EFI stub creates is emulated > [poorly, in some cases]. > > Another downside of treating DT like this is that the DT binary that the > kernel receives is different from the one created by the firmware, which > is undesirable in the context of secure and measured boot. > > Given that LoongArch support in Linux is brand new, we can avoid these > pitfalls, and treat the DT strictly as a hardware description, and use a > separate handover method between the EFI stub and the kernel. Now that > initrd loading and passing the EFI memory map have been refactored into > pure EFI routines that use EFI configuration tables, the only thing we > need to pass directly is the kernel command line (even if we could pass > this via a config table as well, it is used extremely early, so passing > it directly is preferred in this case.) > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > --- > arch/loongarch/Kconfig | 3 -- > arch/loongarch/include/asm/bootinfo.h | 2 +- > arch/loongarch/kernel/efi.c | 30 ++++++++++- > arch/loongarch/kernel/env.c | 22 ++++---- > arch/loongarch/kernel/head.S | 2 + > arch/loongarch/kernel/setup.c | 4 +- > drivers/firmware/efi/libstub/Makefile | 13 +++-- > drivers/firmware/efi/libstub/loongarch-stub.c | 56 +++++++++++++++++--- > 8 files changed, 100 insertions(+), 32 deletions(-) > > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig > index fca106a8b8af..14a2a1ec8561 100644 > --- a/arch/loongarch/Kconfig > +++ b/arch/loongarch/Kconfig > @@ -104,8 +104,6 @@ config LOONGARCH > select MODULES_USE_ELF_RELA if MODULES > select NEED_PER_CPU_EMBED_FIRST_CHUNK > select NEED_PER_CPU_PAGE_FIRST_CHUNK > - select OF > - select OF_EARLY_FLATTREE > select PCI > select PCI_DOMAINS_GENERIC > select PCI_ECAM if ACPI > @@ -311,7 +309,6 @@ config DMI > config EFI > bool "EFI runtime service support" > select UCS2_STRING > - select EFI_PARAMS_FROM_FDT > select EFI_RUNTIME_WRAPPERS > help > This enables the kernel to use EFI runtime services that are > diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h > index e02ac4af7f6e..8e5881bc5ad1 100644 > --- a/arch/loongarch/include/asm/bootinfo.h > +++ b/arch/loongarch/include/asm/bootinfo.h > @@ -36,7 +36,7 @@ struct loongson_system_configuration { > }; > > extern u64 efi_system_table; > -extern unsigned long fw_arg0, fw_arg1; > +extern unsigned long fw_arg0, fw_arg1, fw_arg2; > extern struct loongson_board_info b_info; > extern struct loongson_system_configuration loongson_sysconf; > > diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c > index 1f1f755fb425..3b80675726ec 100644 > --- a/arch/loongarch/kernel/efi.c > +++ b/arch/loongarch/kernel/efi.c > @@ -27,8 +27,13 @@ > static unsigned long efi_nr_tables; > static unsigned long efi_config_table; > > +static unsigned long __initdata boot_memmap = EFI_INVALID_TABLE_ADDR; > + > static efi_system_table_t *efi_systab; > -static efi_config_table_type_t arch_tables[] __initdata = {{},}; > +static efi_config_table_type_t arch_tables[] __initdata = { > + {LINUX_EFI_BOOT_MEMMAP_GUID, &boot_memmap, "MEMMAP" }, > + {}, > +}; > > void __init efi_runtime_init(void) > { > @@ -61,6 +66,8 @@ void __init efi_init(void) > return; > } > > + efi_systab_report_header(&efi_systab->hdr, efi_systab->fw_vendor); > + > set_bit(EFI_64BIT, &efi.flags); > efi_nr_tables = efi_systab->nr_tables; > efi_config_table = (unsigned long)efi_systab->tables; > @@ -72,4 +79,25 @@ void __init efi_init(void) > > if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI) > memblock_reserve(screen_info.lfb_base, screen_info.lfb_size); > + > + if (boot_memmap != EFI_INVALID_TABLE_ADDR) { > + struct efi_memory_map_data data; > + struct efi_boot_memmap *tbl; > + > + tbl = early_memremap_ro(boot_memmap, sizeof(*tbl)); > + if (tbl) { > + data.phys_map = boot_memmap + sizeof(*tbl); > + data.size = tbl->map_size; > + data.desc_size = tbl->desc_size; > + data.desc_version = tbl->desc_ver; > + > + if (efi_memmap_init_early(&data) < 0) > + panic("Unable to map EFI memory map.\n"); > + > + memblock_reserve(data.phys_map & PAGE_MASK, > + PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); > + > + early_memunmap(tbl, sizeof(*tbl)); > + } > + } > } > diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c > index 82b478a5c665..05c38d28476e 100644 > --- a/arch/loongarch/kernel/env.c > +++ b/arch/loongarch/kernel/env.c > @@ -8,7 +8,6 @@ > #include <linux/efi.h> > #include <linux/export.h> > #include <linux/memblock.h> > -#include <linux/of_fdt.h> > #include <asm/early_ioremap.h> > #include <asm/bootinfo.h> > #include <asm/loongson.h> > @@ -20,21 +19,18 @@ EXPORT_SYMBOL(loongson_sysconf); > void __init init_environ(void) > { > int efi_boot = fw_arg0; > - struct efi_memory_map_data data; > - void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > - if (efi_boot) > - set_bit(EFI_BOOT, &efi.flags); > - else > - clear_bit(EFI_BOOT, &efi.flags); > + if (efi_boot) { > + char *cmdline = early_memremap_ro(fw_arg2, COMMAND_LINE_SIZE); > > - early_init_dt_scan(fdt_ptr); > - early_init_fdt_reserve_self(); > - efi_system_table = efi_get_fdt_params(&data); > + strscpy(boot_command_line, cmdline, COMMAND_LINE_SIZE); > + early_memunmap(cmdline, COMMAND_LINE_SIZE); > > - efi_memmap_init_early(&data); > - memblock_reserve(data.phys_map & PAGE_MASK, > - PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); > + efi_system_table = fw_arg1; > + set_bit(EFI_BOOT, &efi.flags); > + } else { > + clear_bit(EFI_BOOT, &efi.flags); > + } > } > > static int __init init_cpu_fullname(void) > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S > index 01bac62a6442..8f89f39fd31b 100644 > --- a/arch/loongarch/kernel/head.S > +++ b/arch/loongarch/kernel/head.S > @@ -67,6 +67,8 @@ SYM_CODE_START(kernel_entry) # kernel entry point > st.d a0, t0, 0 # firmware arguments > la t0, fw_arg1 > st.d a1, t0, 0 > + la t0, fw_arg2 > + st.d a2, t0, 0 > > /* KSave3 used for percpu base, initialized as 0 */ > csrwr zero, PERCPU_BASE_KS > diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c > index e8714b1d94c8..7fabf2306e80 100644 > --- a/arch/loongarch/kernel/setup.c > +++ b/arch/loongarch/kernel/setup.c > @@ -51,7 +51,7 @@ > > struct screen_info screen_info __section(".data"); > > -unsigned long fw_arg0, fw_arg1; > +unsigned long fw_arg0, fw_arg1, fw_arg2; > DEFINE_PER_CPU(unsigned long, kernelsp); > struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly; > > @@ -187,7 +187,6 @@ early_param("mem", early_parse_mem); > > void __init platform_init(void) > { > - efi_init(); > #ifdef CONFIG_ACPI_TABLE_UPGRADE > acpi_table_upgrade(); > #endif > @@ -347,6 +346,7 @@ void __init setup_arch(char **cmdline_p) > *cmdline_p = boot_command_line; > > init_environ(); > + efi_init(); > memblock_init(); > parse_early_param(); > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile > index 0dbc6d93f2e6..d8d6657e6277 100644 > --- a/drivers/firmware/efi/libstub/Makefile > +++ b/drivers/firmware/efi/libstub/Makefile > @@ -29,7 +29,7 @@ cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ > -fpie > > -cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt > +cflags-$(CONFIG_EFI_PARAMS_FROM_FDT) += -I$(srctree)/scripts/dtc/libfdt > > KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \ > -include $(srctree)/include/linux/hidden.h \ > @@ -60,14 +60,17 @@ lib-y := efi-stub-helper.o gop.o secureboot.o tpm.o \ > alignedmem.o relocate.o vsprintf.o \ > systable.o > > -# include the stub's generic dependencies from lib/ when building for ARM/arm64 > -efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c > +# include the stub's libfdt dependencies from lib/ when needed > +libfdt-deps := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \ > + fdt_empty_tree.c fdt_sw.c > + > +lib-$(CONFIG_EFI_PARAMS_FROM_FDT) += fdt.o \ > + $(patsubst %.c,lib-%.o,$(libfdt-deps)) > > $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE > $(call if_changed_rule,cc_o_c) > > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o intrinsics.o \ > - $(patsubst %.c,lib-%.o,$(efi-deps-y)) > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o string.o intrinsics.o > > lib-$(CONFIG_ARM) += arm32-stub.o > lib-$(CONFIG_ARM64) += arm64-stub.o > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c > index b7ef8d2df59e..7c684d10f936 100644 > --- a/drivers/firmware/efi/libstub/loongarch-stub.c > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c > @@ -9,7 +9,8 @@ > #include <asm/addrspace.h> > #include "efistub.h" > > -typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long fdt); > +typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long systab, > + unsigned long cmdline); > > extern int kernel_asize; > extern int kernel_fsize; > @@ -42,19 +43,60 @@ efi_status_t handle_kernel_image(unsigned long *image_addr, > return status; > } > > -void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt, unsigned long fdt_size) > +struct exit_boot_struct { > + efi_memory_desc_t *runtime_map; > + int runtime_entry_count; > +}; > + > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv) > +{ > + struct exit_boot_struct *p = priv; > + > + /* > + * Update the memory map with virtual addresses. The function will also > + * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME > + * entries so that we can pass it straight to SetVirtualAddressMap() > + */ > + efi_get_virtmap(map->map, map->map_size, map->desc_size, > + p->runtime_map, &p->runtime_entry_count); > + > + return EFI_SUCCESS; > +} > + > +efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image, > + unsigned long image_addr, char *cmdline_ptr) > { > kernel_entry_t real_kernel_entry; > + struct exit_boot_struct priv; > + unsigned long desc_size; > + efi_status_t status; > + u32 desc_ver; > + > + status = efi_alloc_virtmap(&priv.runtime_map, &desc_size, &desc_ver); > + if (status != EFI_SUCCESS) { > + efi_err("Unable to retrieve UEFI memory map.\n"); > + return status; > + } > + > + efi_info("Exiting boot services\n"); > + > + efi_novamap = false; > + status = efi_exit_boot_services(handle, &priv, exit_boot_func); > + if (status != EFI_SUCCESS) > + return status; > + > + /* Install the new virtual address map */ > + efi_rt_call(set_virtual_address_map, > + priv.runtime_entry_count * desc_size, desc_size, > + desc_ver, priv.runtime_map); > > /* Config Direct Mapping */ > csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0); > csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1); > > real_kernel_entry = (kernel_entry_t) > - ((unsigned long)&kernel_entry - entrypoint + VMLINUX_LOAD_ADDRESS); > + ((unsigned long)&kernel_entry - image_addr + VMLINUX_LOAD_ADDRESS); > > - if (!efi_novamap) > - real_kernel_entry(true, fdt); > - else > - real_kernel_entry(false, fdt); > + real_kernel_entry(true, (unsigned long)efi_system_table, > + (unsigned long)cmdline_ptr); > } > -- > 2.35.1 > >
On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > Hi, Ard, > > I think the parameters passed to the core kernel need to be discussed. > The old way (so-called old world): > a0=argc, a1=argv, a1=bpi > > The current way (so-called new world): > a0=efi boot flag, a1=fdt pointer > > The new way (in this patchset): > a0=efi boot flag, a1=systemtable, a2=cmdline > > I prefer to use the current way, there are 3 reasons: > 1, both acpi system and dt system can use the same parameters passing method; DT systems will use this too. The distinction is between EFI boot and non-EFI boot. We *really* should keep these separate, given the experience on ARM, where other projects invent ways to pass those values to the kernel without going through the stub. > 2, arm64, riscv and loongarch can use similar logics (light FDT); No need to repeat a mistake. I intend to fix RISC-V next. > 3, out-of-tree patches can make compatibility with the old world > easier by just judging whether a2 is zero. > The whole point of this series is that the EFI stub should not be touching the DT at all. In other words, there is no DT pointer, so the current method needs to be revised. What we might do is a0=systemtable, a1=cmdline as any non-zero value is treated as logic true. That way, your logic to test a2 is zero will still work. > > On Mon, Sep 19, 2022 at 5:36 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > LoongArch does not use FDT or DT natively [yet], and the only reason it > > currently uses it is so that it can reuse the existing EFI stub code. > > > > Overloading the DT with data passed between the EFI stub and the core > > kernel has been a source of problems: there is the overlap between > > information provided by EFI which DT can also provide (initrd base/size, > > command line, memory descriptions), requiring us to reason about which > > is which and what to prioritize. It has also resulted in ABI leaks, > > i.e., internal ABI being promoted to external ABI inadvertently because > > the bootloader can set the EFI stub's DT properties as well (e.g., > > "kaslr-seed"). This has become especially problematic with boot > > environments that want to pretend that EFI boot is being done (to access > > ACPI and SMBIOS tables, for instance) but have no ability to execute the > > EFI stub, and so the environment that the EFI stub creates is emulated > > [poorly, in some cases]. > > > > Another downside of treating DT like this is that the DT binary that the > > kernel receives is different from the one created by the firmware, which > > is undesirable in the context of secure and measured boot. > > > > Given that LoongArch support in Linux is brand new, we can avoid these > > pitfalls, and treat the DT strictly as a hardware description, and use a > > separate handover method between the EFI stub and the kernel. Now that > > initrd loading and passing the EFI memory map have been refactored into > > pure EFI routines that use EFI configuration tables, the only thing we > > need to pass directly is the kernel command line (even if we could pass > > this via a config table as well, it is used extremely early, so passing > > it directly is preferred in this case.) > > > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > > --- > > arch/loongarch/Kconfig | 3 -- > > arch/loongarch/include/asm/bootinfo.h | 2 +- > > arch/loongarch/kernel/efi.c | 30 ++++++++++- > > arch/loongarch/kernel/env.c | 22 ++++---- > > arch/loongarch/kernel/head.S | 2 + > > arch/loongarch/kernel/setup.c | 4 +- > > drivers/firmware/efi/libstub/Makefile | 13 +++-- > > drivers/firmware/efi/libstub/loongarch-stub.c | 56 +++++++++++++++++--- > > 8 files changed, 100 insertions(+), 32 deletions(-) > > > > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig > > index fca106a8b8af..14a2a1ec8561 100644 > > --- a/arch/loongarch/Kconfig > > +++ b/arch/loongarch/Kconfig > > @@ -104,8 +104,6 @@ config LOONGARCH > > select MODULES_USE_ELF_RELA if MODULES > > select NEED_PER_CPU_EMBED_FIRST_CHUNK > > select NEED_PER_CPU_PAGE_FIRST_CHUNK > > - select OF > > - select OF_EARLY_FLATTREE > > select PCI > > select PCI_DOMAINS_GENERIC > > select PCI_ECAM if ACPI > > @@ -311,7 +309,6 @@ config DMI > > config EFI > > bool "EFI runtime service support" > > select UCS2_STRING > > - select EFI_PARAMS_FROM_FDT > > select EFI_RUNTIME_WRAPPERS > > help > > This enables the kernel to use EFI runtime services that are > > diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h > > index e02ac4af7f6e..8e5881bc5ad1 100644 > > --- a/arch/loongarch/include/asm/bootinfo.h > > +++ b/arch/loongarch/include/asm/bootinfo.h > > @@ -36,7 +36,7 @@ struct loongson_system_configuration { > > }; > > > > extern u64 efi_system_table; > > -extern unsigned long fw_arg0, fw_arg1; > > +extern unsigned long fw_arg0, fw_arg1, fw_arg2; > > extern struct loongson_board_info b_info; > > extern struct loongson_system_configuration loongson_sysconf; > > > > diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c > > index 1f1f755fb425..3b80675726ec 100644 > > --- a/arch/loongarch/kernel/efi.c > > +++ b/arch/loongarch/kernel/efi.c > > @@ -27,8 +27,13 @@ > > static unsigned long efi_nr_tables; > > static unsigned long efi_config_table; > > > > +static unsigned long __initdata boot_memmap = EFI_INVALID_TABLE_ADDR; > > + > > static efi_system_table_t *efi_systab; > > -static efi_config_table_type_t arch_tables[] __initdata = {{},}; > > +static efi_config_table_type_t arch_tables[] __initdata = { > > + {LINUX_EFI_BOOT_MEMMAP_GUID, &boot_memmap, "MEMMAP" }, > > + {}, > > +}; > > > > void __init efi_runtime_init(void) > > { > > @@ -61,6 +66,8 @@ void __init efi_init(void) > > return; > > } > > > > + efi_systab_report_header(&efi_systab->hdr, efi_systab->fw_vendor); > > + > > set_bit(EFI_64BIT, &efi.flags); > > efi_nr_tables = efi_systab->nr_tables; > > efi_config_table = (unsigned long)efi_systab->tables; > > @@ -72,4 +79,25 @@ void __init efi_init(void) > > > > if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI) > > memblock_reserve(screen_info.lfb_base, screen_info.lfb_size); > > + > > + if (boot_memmap != EFI_INVALID_TABLE_ADDR) { > > + struct efi_memory_map_data data; > > + struct efi_boot_memmap *tbl; > > + > > + tbl = early_memremap_ro(boot_memmap, sizeof(*tbl)); > > + if (tbl) { > > + data.phys_map = boot_memmap + sizeof(*tbl); > > + data.size = tbl->map_size; > > + data.desc_size = tbl->desc_size; > > + data.desc_version = tbl->desc_ver; > > + > > + if (efi_memmap_init_early(&data) < 0) > > + panic("Unable to map EFI memory map.\n"); > > + > > + memblock_reserve(data.phys_map & PAGE_MASK, > > + PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); > > + > > + early_memunmap(tbl, sizeof(*tbl)); > > + } > > + } > > } > > diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c > > index 82b478a5c665..05c38d28476e 100644 > > --- a/arch/loongarch/kernel/env.c > > +++ b/arch/loongarch/kernel/env.c > > @@ -8,7 +8,6 @@ > > #include <linux/efi.h> > > #include <linux/export.h> > > #include <linux/memblock.h> > > -#include <linux/of_fdt.h> > > #include <asm/early_ioremap.h> > > #include <asm/bootinfo.h> > > #include <asm/loongson.h> > > @@ -20,21 +19,18 @@ EXPORT_SYMBOL(loongson_sysconf); > > void __init init_environ(void) > > { > > int efi_boot = fw_arg0; > > - struct efi_memory_map_data data; > > - void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > - if (efi_boot) > > - set_bit(EFI_BOOT, &efi.flags); > > - else > > - clear_bit(EFI_BOOT, &efi.flags); > > + if (efi_boot) { > > + char *cmdline = early_memremap_ro(fw_arg2, COMMAND_LINE_SIZE); > > > > - early_init_dt_scan(fdt_ptr); > > - early_init_fdt_reserve_self(); > > - efi_system_table = efi_get_fdt_params(&data); > > + strscpy(boot_command_line, cmdline, COMMAND_LINE_SIZE); > > + early_memunmap(cmdline, COMMAND_LINE_SIZE); > > > > - efi_memmap_init_early(&data); > > - memblock_reserve(data.phys_map & PAGE_MASK, > > - PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); > > + efi_system_table = fw_arg1; > > + set_bit(EFI_BOOT, &efi.flags); > > + } else { > > + clear_bit(EFI_BOOT, &efi.flags); > > + } > > } > > > > static int __init init_cpu_fullname(void) > > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S > > index 01bac62a6442..8f89f39fd31b 100644 > > --- a/arch/loongarch/kernel/head.S > > +++ b/arch/loongarch/kernel/head.S > > @@ -67,6 +67,8 @@ SYM_CODE_START(kernel_entry) # kernel entry point > > st.d a0, t0, 0 # firmware arguments > > la t0, fw_arg1 > > st.d a1, t0, 0 > > + la t0, fw_arg2 > > + st.d a2, t0, 0 > > > > /* KSave3 used for percpu base, initialized as 0 */ > > csrwr zero, PERCPU_BASE_KS > > diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c > > index e8714b1d94c8..7fabf2306e80 100644 > > --- a/arch/loongarch/kernel/setup.c > > +++ b/arch/loongarch/kernel/setup.c > > @@ -51,7 +51,7 @@ > > > > struct screen_info screen_info __section(".data"); > > > > -unsigned long fw_arg0, fw_arg1; > > +unsigned long fw_arg0, fw_arg1, fw_arg2; > > DEFINE_PER_CPU(unsigned long, kernelsp); > > struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly; > > > > @@ -187,7 +187,6 @@ early_param("mem", early_parse_mem); > > > > void __init platform_init(void) > > { > > - efi_init(); > > #ifdef CONFIG_ACPI_TABLE_UPGRADE > > acpi_table_upgrade(); > > #endif > > @@ -347,6 +346,7 @@ void __init setup_arch(char **cmdline_p) > > *cmdline_p = boot_command_line; > > > > init_environ(); > > + efi_init(); > > memblock_init(); > > parse_early_param(); > > > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile > > index 0dbc6d93f2e6..d8d6657e6277 100644 > > --- a/drivers/firmware/efi/libstub/Makefile > > +++ b/drivers/firmware/efi/libstub/Makefile > > @@ -29,7 +29,7 @@ cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ > > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ > > -fpie > > > > -cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt > > +cflags-$(CONFIG_EFI_PARAMS_FROM_FDT) += -I$(srctree)/scripts/dtc/libfdt > > > > KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \ > > -include $(srctree)/include/linux/hidden.h \ > > @@ -60,14 +60,17 @@ lib-y := efi-stub-helper.o gop.o secureboot.o tpm.o \ > > alignedmem.o relocate.o vsprintf.o \ > > systable.o > > > > -# include the stub's generic dependencies from lib/ when building for ARM/arm64 > > -efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c > > +# include the stub's libfdt dependencies from lib/ when needed > > +libfdt-deps := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \ > > + fdt_empty_tree.c fdt_sw.c > > + > > +lib-$(CONFIG_EFI_PARAMS_FROM_FDT) += fdt.o \ > > + $(patsubst %.c,lib-%.o,$(libfdt-deps)) > > > > $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE > > $(call if_changed_rule,cc_o_c) > > > > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o intrinsics.o \ > > - $(patsubst %.c,lib-%.o,$(efi-deps-y)) > > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o string.o intrinsics.o > > > > lib-$(CONFIG_ARM) += arm32-stub.o > > lib-$(CONFIG_ARM64) += arm64-stub.o > > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c > > index b7ef8d2df59e..7c684d10f936 100644 > > --- a/drivers/firmware/efi/libstub/loongarch-stub.c > > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c > > @@ -9,7 +9,8 @@ > > #include <asm/addrspace.h> > > #include "efistub.h" > > > > -typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long fdt); > > +typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long systab, > > + unsigned long cmdline); > > > > extern int kernel_asize; > > extern int kernel_fsize; > > @@ -42,19 +43,60 @@ efi_status_t handle_kernel_image(unsigned long *image_addr, > > return status; > > } > > > > -void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt, unsigned long fdt_size) > > +struct exit_boot_struct { > > + efi_memory_desc_t *runtime_map; > > + int runtime_entry_count; > > +}; > > + > > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv) > > +{ > > + struct exit_boot_struct *p = priv; > > + > > + /* > > + * Update the memory map with virtual addresses. The function will also > > + * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME > > + * entries so that we can pass it straight to SetVirtualAddressMap() > > + */ > > + efi_get_virtmap(map->map, map->map_size, map->desc_size, > > + p->runtime_map, &p->runtime_entry_count); > > + > > + return EFI_SUCCESS; > > +} > > + > > +efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image, > > + unsigned long image_addr, char *cmdline_ptr) > > { > > kernel_entry_t real_kernel_entry; > > + struct exit_boot_struct priv; > > + unsigned long desc_size; > > + efi_status_t status; > > + u32 desc_ver; > > + > > + status = efi_alloc_virtmap(&priv.runtime_map, &desc_size, &desc_ver); > > + if (status != EFI_SUCCESS) { > > + efi_err("Unable to retrieve UEFI memory map.\n"); > > + return status; > > + } > > + > > + efi_info("Exiting boot services\n"); > > + > > + efi_novamap = false; > > + status = efi_exit_boot_services(handle, &priv, exit_boot_func); > > + if (status != EFI_SUCCESS) > > + return status; > > + > > + /* Install the new virtual address map */ > > + efi_rt_call(set_virtual_address_map, > > + priv.runtime_entry_count * desc_size, desc_size, > > + desc_ver, priv.runtime_map); > > > > /* Config Direct Mapping */ > > csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0); > > csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1); > > > > real_kernel_entry = (kernel_entry_t) > > - ((unsigned long)&kernel_entry - entrypoint + VMLINUX_LOAD_ADDRESS); > > + ((unsigned long)&kernel_entry - image_addr + VMLINUX_LOAD_ADDRESS); > > > > - if (!efi_novamap) > > - real_kernel_entry(true, fdt); > > - else > > - real_kernel_entry(false, fdt); > > + real_kernel_entry(true, (unsigned long)efi_system_table, > > + (unsigned long)cmdline_ptr); > > } > > -- > > 2.35.1 > > > >
Hi, Ard, On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > Hi, Ard, > > > > I think the parameters passed to the core kernel need to be discussed. > > The old way (so-called old world): > > a0=argc, a1=argv, a1=bpi > > > > The current way (so-called new world): > > a0=efi boot flag, a1=fdt pointer > > > > The new way (in this patchset): > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > I prefer to use the current way, there are 3 reasons: > > 1, both acpi system and dt system can use the same parameters passing method; > > DT systems will use this too. The distinction is between EFI boot and > non-EFI boot. We *really* should keep these separate, given the > experience on ARM, where other projects invent ways to pass those > values to the kernel without going through the stub. In the last patch I see: + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); + + early_init_dt_scan(fdt_ptr); + early_init_fdt_reserve_self(); + clear_bit(EFI_BOOT, &efi.flags); So I suppose for the DT system that means a0=efi boot flag, a1=fdt pointer, a2=cmdline? Then it is not exactly the same as the ACPI system, but similar. > > > 2, arm64, riscv and loongarch can use similar logics (light FDT); > > No need to repeat a mistake. I intend to fix RISC-V next. > > > 3, out-of-tree patches can make compatibility with the old world > > easier by just judging whether a2 is zero. > > > > The whole point of this series is that the EFI stub should not be > touching the DT at all. In other words, there is no DT pointer, so the > current method needs to be revised. > > What we might do is > > a0=systemtable, a1=cmdline > > as any non-zero value is treated as logic true. That way, your logic > to test a2 is zero will still work. I think the efi boot flag is still needed, even boot from efistub. Because if boot with "efi=novamap", the efi runtime should be disabled. Then we need efi_enabled(EFI_BOOT) to be false in efi_init(). Huacai > > > > > > On Mon, Sep 19, 2022 at 5:36 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > LoongArch does not use FDT or DT natively [yet], and the only reason it > > > currently uses it is so that it can reuse the existing EFI stub code. > > > > > > Overloading the DT with data passed between the EFI stub and the core > > > kernel has been a source of problems: there is the overlap between > > > information provided by EFI which DT can also provide (initrd base/size, > > > command line, memory descriptions), requiring us to reason about which > > > is which and what to prioritize. It has also resulted in ABI leaks, > > > i.e., internal ABI being promoted to external ABI inadvertently because > > > the bootloader can set the EFI stub's DT properties as well (e.g., > > > "kaslr-seed"). This has become especially problematic with boot > > > environments that want to pretend that EFI boot is being done (to access > > > ACPI and SMBIOS tables, for instance) but have no ability to execute the > > > EFI stub, and so the environment that the EFI stub creates is emulated > > > [poorly, in some cases]. > > > > > > Another downside of treating DT like this is that the DT binary that the > > > kernel receives is different from the one created by the firmware, which > > > is undesirable in the context of secure and measured boot. > > > > > > Given that LoongArch support in Linux is brand new, we can avoid these > > > pitfalls, and treat the DT strictly as a hardware description, and use a > > > separate handover method between the EFI stub and the kernel. Now that > > > initrd loading and passing the EFI memory map have been refactored into > > > pure EFI routines that use EFI configuration tables, the only thing we > > > need to pass directly is the kernel command line (even if we could pass > > > this via a config table as well, it is used extremely early, so passing > > > it directly is preferred in this case.) > > > > > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > > > --- > > > arch/loongarch/Kconfig | 3 -- > > > arch/loongarch/include/asm/bootinfo.h | 2 +- > > > arch/loongarch/kernel/efi.c | 30 ++++++++++- > > > arch/loongarch/kernel/env.c | 22 ++++---- > > > arch/loongarch/kernel/head.S | 2 + > > > arch/loongarch/kernel/setup.c | 4 +- > > > drivers/firmware/efi/libstub/Makefile | 13 +++-- > > > drivers/firmware/efi/libstub/loongarch-stub.c | 56 +++++++++++++++++--- > > > 8 files changed, 100 insertions(+), 32 deletions(-) > > > > > > diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig > > > index fca106a8b8af..14a2a1ec8561 100644 > > > --- a/arch/loongarch/Kconfig > > > +++ b/arch/loongarch/Kconfig > > > @@ -104,8 +104,6 @@ config LOONGARCH > > > select MODULES_USE_ELF_RELA if MODULES > > > select NEED_PER_CPU_EMBED_FIRST_CHUNK > > > select NEED_PER_CPU_PAGE_FIRST_CHUNK > > > - select OF > > > - select OF_EARLY_FLATTREE > > > select PCI > > > select PCI_DOMAINS_GENERIC > > > select PCI_ECAM if ACPI > > > @@ -311,7 +309,6 @@ config DMI > > > config EFI > > > bool "EFI runtime service support" > > > select UCS2_STRING > > > - select EFI_PARAMS_FROM_FDT > > > select EFI_RUNTIME_WRAPPERS > > > help > > > This enables the kernel to use EFI runtime services that are > > > diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h > > > index e02ac4af7f6e..8e5881bc5ad1 100644 > > > --- a/arch/loongarch/include/asm/bootinfo.h > > > +++ b/arch/loongarch/include/asm/bootinfo.h > > > @@ -36,7 +36,7 @@ struct loongson_system_configuration { > > > }; > > > > > > extern u64 efi_system_table; > > > -extern unsigned long fw_arg0, fw_arg1; > > > +extern unsigned long fw_arg0, fw_arg1, fw_arg2; > > > extern struct loongson_board_info b_info; > > > extern struct loongson_system_configuration loongson_sysconf; > > > > > > diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c > > > index 1f1f755fb425..3b80675726ec 100644 > > > --- a/arch/loongarch/kernel/efi.c > > > +++ b/arch/loongarch/kernel/efi.c > > > @@ -27,8 +27,13 @@ > > > static unsigned long efi_nr_tables; > > > static unsigned long efi_config_table; > > > > > > +static unsigned long __initdata boot_memmap = EFI_INVALID_TABLE_ADDR; > > > + > > > static efi_system_table_t *efi_systab; > > > -static efi_config_table_type_t arch_tables[] __initdata = {{},}; > > > +static efi_config_table_type_t arch_tables[] __initdata = { > > > + {LINUX_EFI_BOOT_MEMMAP_GUID, &boot_memmap, "MEMMAP" }, > > > + {}, > > > +}; > > > > > > void __init efi_runtime_init(void) > > > { > > > @@ -61,6 +66,8 @@ void __init efi_init(void) > > > return; > > > } > > > > > > + efi_systab_report_header(&efi_systab->hdr, efi_systab->fw_vendor); > > > + > > > set_bit(EFI_64BIT, &efi.flags); > > > efi_nr_tables = efi_systab->nr_tables; > > > efi_config_table = (unsigned long)efi_systab->tables; > > > @@ -72,4 +79,25 @@ void __init efi_init(void) > > > > > > if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI) > > > memblock_reserve(screen_info.lfb_base, screen_info.lfb_size); > > > + > > > + if (boot_memmap != EFI_INVALID_TABLE_ADDR) { > > > + struct efi_memory_map_data data; > > > + struct efi_boot_memmap *tbl; > > > + > > > + tbl = early_memremap_ro(boot_memmap, sizeof(*tbl)); > > > + if (tbl) { > > > + data.phys_map = boot_memmap + sizeof(*tbl); > > > + data.size = tbl->map_size; > > > + data.desc_size = tbl->desc_size; > > > + data.desc_version = tbl->desc_ver; > > > + > > > + if (efi_memmap_init_early(&data) < 0) > > > + panic("Unable to map EFI memory map.\n"); > > > + > > > + memblock_reserve(data.phys_map & PAGE_MASK, > > > + PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); > > > + > > > + early_memunmap(tbl, sizeof(*tbl)); > > > + } > > > + } > > > } > > > diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c > > > index 82b478a5c665..05c38d28476e 100644 > > > --- a/arch/loongarch/kernel/env.c > > > +++ b/arch/loongarch/kernel/env.c > > > @@ -8,7 +8,6 @@ > > > #include <linux/efi.h> > > > #include <linux/export.h> > > > #include <linux/memblock.h> > > > -#include <linux/of_fdt.h> > > > #include <asm/early_ioremap.h> > > > #include <asm/bootinfo.h> > > > #include <asm/loongson.h> > > > @@ -20,21 +19,18 @@ EXPORT_SYMBOL(loongson_sysconf); > > > void __init init_environ(void) > > > { > > > int efi_boot = fw_arg0; > > > - struct efi_memory_map_data data; > > > - void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > - if (efi_boot) > > > - set_bit(EFI_BOOT, &efi.flags); > > > - else > > > - clear_bit(EFI_BOOT, &efi.flags); > > > + if (efi_boot) { > > > + char *cmdline = early_memremap_ro(fw_arg2, COMMAND_LINE_SIZE); > > > > > > - early_init_dt_scan(fdt_ptr); > > > - early_init_fdt_reserve_self(); > > > - efi_system_table = efi_get_fdt_params(&data); > > > + strscpy(boot_command_line, cmdline, COMMAND_LINE_SIZE); > > > + early_memunmap(cmdline, COMMAND_LINE_SIZE); > > > > > > - efi_memmap_init_early(&data); > > > - memblock_reserve(data.phys_map & PAGE_MASK, > > > - PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); > > > + efi_system_table = fw_arg1; > > > + set_bit(EFI_BOOT, &efi.flags); > > > + } else { > > > + clear_bit(EFI_BOOT, &efi.flags); > > > + } > > > } > > > > > > static int __init init_cpu_fullname(void) > > > diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S > > > index 01bac62a6442..8f89f39fd31b 100644 > > > --- a/arch/loongarch/kernel/head.S > > > +++ b/arch/loongarch/kernel/head.S > > > @@ -67,6 +67,8 @@ SYM_CODE_START(kernel_entry) # kernel entry point > > > st.d a0, t0, 0 # firmware arguments > > > la t0, fw_arg1 > > > st.d a1, t0, 0 > > > + la t0, fw_arg2 > > > + st.d a2, t0, 0 > > > > > > /* KSave3 used for percpu base, initialized as 0 */ > > > csrwr zero, PERCPU_BASE_KS > > > diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c > > > index e8714b1d94c8..7fabf2306e80 100644 > > > --- a/arch/loongarch/kernel/setup.c > > > +++ b/arch/loongarch/kernel/setup.c > > > @@ -51,7 +51,7 @@ > > > > > > struct screen_info screen_info __section(".data"); > > > > > > -unsigned long fw_arg0, fw_arg1; > > > +unsigned long fw_arg0, fw_arg1, fw_arg2; > > > DEFINE_PER_CPU(unsigned long, kernelsp); > > > struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly; > > > > > > @@ -187,7 +187,6 @@ early_param("mem", early_parse_mem); > > > > > > void __init platform_init(void) > > > { > > > - efi_init(); > > > #ifdef CONFIG_ACPI_TABLE_UPGRADE > > > acpi_table_upgrade(); > > > #endif > > > @@ -347,6 +346,7 @@ void __init setup_arch(char **cmdline_p) > > > *cmdline_p = boot_command_line; > > > > > > init_environ(); > > > + efi_init(); > > > memblock_init(); > > > parse_early_param(); > > > > > > diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile > > > index 0dbc6d93f2e6..d8d6657e6277 100644 > > > --- a/drivers/firmware/efi/libstub/Makefile > > > +++ b/drivers/firmware/efi/libstub/Makefile > > > @@ -29,7 +29,7 @@ cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ > > > cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ > > > -fpie > > > > > > -cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt > > > +cflags-$(CONFIG_EFI_PARAMS_FROM_FDT) += -I$(srctree)/scripts/dtc/libfdt > > > > > > KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \ > > > -include $(srctree)/include/linux/hidden.h \ > > > @@ -60,14 +60,17 @@ lib-y := efi-stub-helper.o gop.o secureboot.o tpm.o \ > > > alignedmem.o relocate.o vsprintf.o \ > > > systable.o > > > > > > -# include the stub's generic dependencies from lib/ when building for ARM/arm64 > > > -efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c > > > +# include the stub's libfdt dependencies from lib/ when needed > > > +libfdt-deps := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \ > > > + fdt_empty_tree.c fdt_sw.c > > > + > > > +lib-$(CONFIG_EFI_PARAMS_FROM_FDT) += fdt.o \ > > > + $(patsubst %.c,lib-%.o,$(libfdt-deps)) > > > > > > $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE > > > $(call if_changed_rule,cc_o_c) > > > > > > -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o intrinsics.o \ > > > - $(patsubst %.c,lib-%.o,$(efi-deps-y)) > > > +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o string.o intrinsics.o > > > > > > lib-$(CONFIG_ARM) += arm32-stub.o > > > lib-$(CONFIG_ARM64) += arm64-stub.o > > > diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c > > > index b7ef8d2df59e..7c684d10f936 100644 > > > --- a/drivers/firmware/efi/libstub/loongarch-stub.c > > > +++ b/drivers/firmware/efi/libstub/loongarch-stub.c > > > @@ -9,7 +9,8 @@ > > > #include <asm/addrspace.h> > > > #include "efistub.h" > > > > > > -typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long fdt); > > > +typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long systab, > > > + unsigned long cmdline); > > > > > > extern int kernel_asize; > > > extern int kernel_fsize; > > > @@ -42,19 +43,60 @@ efi_status_t handle_kernel_image(unsigned long *image_addr, > > > return status; > > > } > > > > > > -void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt, unsigned long fdt_size) > > > +struct exit_boot_struct { > > > + efi_memory_desc_t *runtime_map; > > > + int runtime_entry_count; > > > +}; > > > + > > > +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv) > > > +{ > > > + struct exit_boot_struct *p = priv; > > > + > > > + /* > > > + * Update the memory map with virtual addresses. The function will also > > > + * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME > > > + * entries so that we can pass it straight to SetVirtualAddressMap() > > > + */ > > > + efi_get_virtmap(map->map, map->map_size, map->desc_size, > > > + p->runtime_map, &p->runtime_entry_count); > > > + > > > + return EFI_SUCCESS; > > > +} > > > + > > > +efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image, > > > + unsigned long image_addr, char *cmdline_ptr) > > > { > > > kernel_entry_t real_kernel_entry; > > > + struct exit_boot_struct priv; > > > + unsigned long desc_size; > > > + efi_status_t status; > > > + u32 desc_ver; > > > + > > > + status = efi_alloc_virtmap(&priv.runtime_map, &desc_size, &desc_ver); > > > + if (status != EFI_SUCCESS) { > > > + efi_err("Unable to retrieve UEFI memory map.\n"); > > > + return status; > > > + } > > > + > > > + efi_info("Exiting boot services\n"); > > > + > > > + efi_novamap = false; > > > + status = efi_exit_boot_services(handle, &priv, exit_boot_func); > > > + if (status != EFI_SUCCESS) > > > + return status; > > > + > > > + /* Install the new virtual address map */ > > > + efi_rt_call(set_virtual_address_map, > > > + priv.runtime_entry_count * desc_size, desc_size, > > > + desc_ver, priv.runtime_map); > > > > > > /* Config Direct Mapping */ > > > csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0); > > > csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1); > > > > > > real_kernel_entry = (kernel_entry_t) > > > - ((unsigned long)&kernel_entry - entrypoint + VMLINUX_LOAD_ADDRESS); > > > + ((unsigned long)&kernel_entry - image_addr + VMLINUX_LOAD_ADDRESS); > > > > > > - if (!efi_novamap) > > > - real_kernel_entry(true, fdt); > > > - else > > > - real_kernel_entry(false, fdt); > > > + real_kernel_entry(true, (unsigned long)efi_system_table, > > > + (unsigned long)cmdline_ptr); > > > } > > > -- > > > 2.35.1 > > > > > >
On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > Hi, Ard, > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > Hi, Ard, > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > The old way (so-called old world): > > > a0=argc, a1=argv, a1=bpi > > > > > > The current way (so-called new world): > > > a0=efi boot flag, a1=fdt pointer > > > > > > The new way (in this patchset): > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > I prefer to use the current way, there are 3 reasons: > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > DT systems will use this too. The distinction is between EFI boot and > > non-EFI boot. We *really* should keep these separate, given the > > experience on ARM, where other projects invent ways to pass those > > values to the kernel without going through the stub. > In the last patch I see: > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > + > + early_init_dt_scan(fdt_ptr); > + early_init_fdt_reserve_self(); > + > clear_bit(EFI_BOOT, &efi.flags); > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > system, but similar. > No, for non-EFI DT boot, the command line is passed via the DT, so a0=0x0 (non-efi), a1=DT, a2=0x0 Do you intend to support non-EFI DT boot by the way? So a2 != 0x0 means old world a0 != 0x0 means EFI boot, a1 is the command line a0 == 0x0 means !EFI boot, a1 is the DT > > > > > 2, arm64, riscv and loongarch can use similar logics (light FDT); > > > > No need to repeat a mistake. I intend to fix RISC-V next. > > > > > 3, out-of-tree patches can make compatibility with the old world > > > easier by just judging whether a2 is zero. > > > > > > > The whole point of this series is that the EFI stub should not be > > touching the DT at all. In other words, there is no DT pointer, so the > > current method needs to be revised. > > > > What we might do is > > > > a0=systemtable, a1=cmdline > > > > as any non-zero value is treated as logic true. That way, your logic > > to test a2 is zero will still work. > I think the efi boot flag is still needed, even boot from efistub. > Because if boot with "efi=novamap", the efi runtime should be > disabled. Then we need efi_enabled(EFI_BOOT) to be false in > efi_init(). > I don't think it makes sense to allow efi=novamap on LoongArch, given that we cannot make use of the runtime services that way. So in my code, efi_novamap is set to false unconditionally.
On Mon, 19 Sept 2022 at 08:22, Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > Hi, Ard, > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > Hi, Ard, > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > The old way (so-called old world): > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > The current way (so-called new world): > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > The new way (in this patchset): > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > non-EFI boot. We *really* should keep these separate, given the > > > experience on ARM, where other projects invent ways to pass those > > > values to the kernel without going through the stub. > > In the last patch I see: > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > + > > + early_init_dt_scan(fdt_ptr); > > + early_init_fdt_reserve_self(); > > + > > clear_bit(EFI_BOOT, &efi.flags); > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > system, but similar. > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > a0=0x0 (non-efi), a1=DT, a2=0x0 > > Do you intend to support non-EFI DT boot by the way? > > So > > a2 != 0x0 means old world > a0 != 0x0 means EFI boot, a1 is the command line > a0 == 0x0 means !EFI boot, a1 is the DT > Note: the above applies if we decide to merge the EFI boolean and the system table pointer into register #0.
Hi, Ard, On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > Hi, Ard, > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > Hi, Ard, > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > The old way (so-called old world): > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > The current way (so-called new world): > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > The new way (in this patchset): > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > non-EFI boot. We *really* should keep these separate, given the > > > experience on ARM, where other projects invent ways to pass those > > > values to the kernel without going through the stub. > > In the last patch I see: > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > + > > + early_init_dt_scan(fdt_ptr); > > + early_init_fdt_reserve_self(); > > + > > clear_bit(EFI_BOOT, &efi.flags); > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > system, but similar. > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > a0=0x0 (non-efi), a1=DT, a2=0x0 > > Do you intend to support non-EFI DT boot by the way? I think we needn't support non-EFI DT boot, so a0=efi boot flag, a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, which looks similar to the old-world). But I have another question: is it early enough to get DT from systemtable for DT boot (in the current way DT is the earliest thing)? > > So > > a2 != 0x0 means old world > a0 != 0x0 means EFI boot, a1 is the command line > a0 == 0x0 means !EFI boot, a1 is the DT > > > > > > > > 2, arm64, riscv and loongarch can use similar logics (light FDT); > > > > > > No need to repeat a mistake. I intend to fix RISC-V next. > > > > > > > 3, out-of-tree patches can make compatibility with the old world > > > > easier by just judging whether a2 is zero. > > > > > > > > > > The whole point of this series is that the EFI stub should not be > > > touching the DT at all. In other words, there is no DT pointer, so the > > > current method needs to be revised. > > > > > > What we might do is > > > > > > a0=systemtable, a1=cmdline > > > > > > as any non-zero value is treated as logic true. That way, your logic > > > to test a2 is zero will still work. > > I think the efi boot flag is still needed, even boot from efistub. > > Because if boot with "efi=novamap", the efi runtime should be > > disabled. Then we need efi_enabled(EFI_BOOT) to be false in > > efi_init(). > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > that we cannot make use of the runtime services that way. So in my > code, efi_novamap is set to false unconditionally. Emm, I prefer to support "efi=novamap", the core kernel has already supported "noefi" to disable runtime, I don't want to hack efi_novamap. So please keep the efi boot flag, thanks. Huacai
On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > Hi, Ard, > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > Hi, Ard, > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > Hi, Ard, > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > The old way (so-called old world): > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > The current way (so-called new world): > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > The new way (in this patchset): > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > non-EFI boot. We *really* should keep these separate, given the > > > > experience on ARM, where other projects invent ways to pass those > > > > values to the kernel without going through the stub. > > > In the last patch I see: > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > + > > > + early_init_dt_scan(fdt_ptr); > > > + early_init_fdt_reserve_self(); > > > + > > > clear_bit(EFI_BOOT, &efi.flags); > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > system, but similar. > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > Do you intend to support non-EFI DT boot by the way? > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > which looks similar to the old-world). Excellent. If non-EFI boot is not supported, we can drop the code that deals with that. > But I have another question: is > it early enough to get DT from systemtable for DT boot (in the current > way DT is the earliest thing)? > If you rely on DT only for the hardware description, then loading it from efi_init() should be fine. > > > > So > > > > a2 != 0x0 means old world > > a0 != 0x0 means EFI boot, a1 is the command line > > a0 == 0x0 means !EFI boot, a1 is the DT > > > > > > > > > > > 2, arm64, riscv and loongarch can use similar logics (light FDT); > > > > > > > > No need to repeat a mistake. I intend to fix RISC-V next. > > > > > > > > > 3, out-of-tree patches can make compatibility with the old world > > > > > easier by just judging whether a2 is zero. > > > > > > > > > > > > > The whole point of this series is that the EFI stub should not be > > > > touching the DT at all. In other words, there is no DT pointer, so the > > > > current method needs to be revised. > > > > > > > > What we might do is > > > > > > > > a0=systemtable, a1=cmdline > > > > > > > > as any non-zero value is treated as logic true. That way, your logic > > > > to test a2 is zero will still work. > > > I think the efi boot flag is still needed, even boot from efistub. > > > Because if boot with "efi=novamap", the efi runtime should be > > > disabled. Then we need efi_enabled(EFI_BOOT) to be false in > > > efi_init(). > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > that we cannot make use of the runtime services that way. So in my > > code, efi_novamap is set to false unconditionally. > Emm, I prefer to support "efi=novamap", the core kernel has already > supported "noefi" to disable runtime, I don't want to hack > efi_novamap. So please keep the efi boot flag, thanks. > Fair enough. Do you have any uses for efi_novamap in mind?
On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > Hi, Ard, > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > Hi, Ard, > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > The old way (so-called old world): > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > The current way (so-called new world): > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > The new way (in this patchset): > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > experience on ARM, where other projects invent ways to pass those > > > > > values to the kernel without going through the stub. > > > > In the last patch I see: > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > + > > > > + early_init_dt_scan(fdt_ptr); > > > > + early_init_fdt_reserve_self(); > > > > + > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > system, but similar. > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > Do you intend to support non-EFI DT boot by the way? > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > which looks similar to the old-world). > > Excellent. If non-EFI boot is not supported, we can drop the code that > deals with that. > > > But I have another question: is > > it early enough to get DT from systemtable for DT boot (in the current > > way DT is the earliest thing)? > > > > If you rely on DT only for the hardware description, then loading it > from efi_init() should be fine. OK, then please drop patch #12 at this time. It can be added when we add Loongson-2K support. > > > > > > > So > > > > > > a2 != 0x0 means old world > > > a0 != 0x0 means EFI boot, a1 is the command line > > > a0 == 0x0 means !EFI boot, a1 is the DT > > > > > > > > > > > > > > 2, arm64, riscv and loongarch can use similar logics (light FDT); > > > > > > > > > > No need to repeat a mistake. I intend to fix RISC-V next. > > > > > > > > > > > 3, out-of-tree patches can make compatibility with the old world > > > > > > easier by just judging whether a2 is zero. > > > > > > > > > > > > > > > > The whole point of this series is that the EFI stub should not be > > > > > touching the DT at all. In other words, there is no DT pointer, so the > > > > > current method needs to be revised. > > > > > > > > > > What we might do is > > > > > > > > > > a0=systemtable, a1=cmdline > > > > > > > > > > as any non-zero value is treated as logic true. That way, your logic > > > > > to test a2 is zero will still work. > > > > I think the efi boot flag is still needed, even boot from efistub. > > > > Because if boot with "efi=novamap", the efi runtime should be > > > > disabled. Then we need efi_enabled(EFI_BOOT) to be false in > > > > efi_init(). > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > that we cannot make use of the runtime services that way. So in my > > > code, efi_novamap is set to false unconditionally. > > Emm, I prefer to support "efi=novamap", the core kernel has already > > supported "noefi" to disable runtime, I don't want to hack > > efi_novamap. So please keep the efi boot flag, thanks. > > > > Fair enough. Do you have any uses for efi_novamap in mind? I usually use "efi=novamap" in EFI shell to see whether it can work well without runtime. And I think I will modify this patch these days because in another thread you said that this series cannot boot. Huacai
On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > Hi, Ard, > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > Hi, Ard, > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > The old way (so-called old world): > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > values to the kernel without going through the stub. > > > > > In the last patch I see: > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > + > > > > > + early_init_dt_scan(fdt_ptr); > > > > > + early_init_fdt_reserve_self(); > > > > > + > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > system, but similar. > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > which looks similar to the old-world). > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > deals with that. > > > > > But I have another question: is > > > it early enough to get DT from systemtable for DT boot (in the current > > > way DT is the earliest thing)? > > > > > > > If you rely on DT only for the hardware description, then loading it > > from efi_init() should be fine. > OK, then please drop patch #12 at this time. It can be added when we > add Loongson-2K support. > OK > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > that we cannot make use of the runtime services that way. So in my > > > > code, efi_novamap is set to false unconditionally. > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > supported "noefi" to disable runtime, I don't want to hack > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > I usually use "efi=novamap" in EFI shell to see whether it can work > well without runtime. And I think I will modify this patch these days > because in another thread you said that this series cannot boot. It works fine now. However, clearing the EFI_BOOT flag is not the correct way to disable runtime services only. And note that we also have efi=noruntime - does that currently work on LoongArch?
On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > Hi, Ard, > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > The old way (so-called old world): > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > values to the kernel without going through the stub. > > > > > > In the last patch I see: > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > + > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > + early_init_fdt_reserve_self(); > > > > > > + > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > system, but similar. > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > which looks similar to the old-world). > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > deals with that. > > > > > > > But I have another question: is > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > way DT is the earliest thing)? > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > from efi_init() should be fine. > > OK, then please drop patch #12 at this time. It can be added when we > > add Loongson-2K support. > > > > OK > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > that we cannot make use of the runtime services that way. So in my > > > > > code, efi_novamap is set to false unconditionally. > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > supported "noefi" to disable runtime, I don't want to hack > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > I usually use "efi=novamap" in EFI shell to see whether it can work > > well without runtime. And I think I will modify this patch these days > > because in another thread you said that this series cannot boot. > > It works fine now. > > However, clearing the EFI_BOOT flag is not the correct way to disable > runtime services only. And note that we also have efi=noruntime - does > that currently work on LoongArch? OK, but we don't need to add a new "efi=noruntime" parameter, I can use "noefi" instead. But I think support "efi=novamap" is not a bad idea (maybe I misunderstood its usage). Huacai
On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > Hi, Ard, > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > The old way (so-called old world): > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > values to the kernel without going through the stub. > > > > > > > In the last patch I see: > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > + > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > + > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > which looks similar to the old-world). > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > deals with that. > > > > > > > > > But I have another question: is > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > from efi_init() should be fine. > > > OK, then please drop patch #12 at this time. It can be added when we > > > add Loongson-2K support. > > > > > > > OK > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > code, efi_novamap is set to false unconditionally. > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > well without runtime. And I think I will modify this patch these days > > > because in another thread you said that this series cannot boot. > > > > It works fine now. > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > runtime services only. And note that we also have efi=noruntime - does > > that currently work on LoongArch? > OK, but we don't need to add a new "efi=noruntime" parameter, I can > use "noefi" instead. OK > But I think support "efi=novamap" is not a bad > idea (maybe I misunderstood its usage). > It basically exists to deal with broken EFI firmware on ARM laptops that were only intended to run Windows. Windows calls SetVirtualAddressMap() *after* creating the new mappings, and some broken firmware drivers will access those regions too early. On Linux, we install the mapping first, and then much later, we actually create the mappings and only activate them on a single CPU while the EFI runtime service call is in progress. In any case, I will reinstate the efi_novamap logic for now - we can always revisit it later.
On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > In the last patch I see: > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > + > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > + > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > which looks similar to the old-world). > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > deals with that. > > > > > > > > > > > But I have another question: is > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > from efi_init() should be fine. > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > add Loongson-2K support. > > > > > > > > > > OK > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > well without runtime. And I think I will modify this patch these days > > > > because in another thread you said that this series cannot boot. > > > > > > It works fine now. > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > runtime services only. And note that we also have efi=noruntime - does > > > that currently work on LoongArch? > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > use "noefi" instead. > > OK > > > But I think support "efi=novamap" is not a bad > > idea (maybe I misunderstood its usage). > > > > It basically exists to deal with broken EFI firmware on ARM laptops > that were only intended to run Windows. Windows calls > SetVirtualAddressMap() *after* creating the new mappings, and some > broken firmware drivers will access those regions too early. > > On Linux, we install the mapping first, and then much later, we > actually create the mappings and only activate them on a single CPU > while the EFI runtime service call is in progress. > > In any case, I will reinstate the efi_novamap logic for now - we can > always revisit it later. OK, now I think there are no big problems. Only some bikesheddings: 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass boot information, it looks most similar to the old-world, and we can distinguish old-world/new-world by judging whether a0 is greater than 1. 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == EFI_INVALID_TABLE_ADDR) then return immediately, this makes indentation look better. 3, Declare "cmdline" out of the if() condition in init_environ() looks better. 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). 5, It seems that one line is enough for the last statement in efi_boot_kernel(). Hope this series will be stable as soon as possible, our kexec/kdump work needs to adjust because of this change. :) Huacai >
On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > In the last patch I see: > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > + > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > + > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > deals with that. > > > > > > > > > > > > > But I have another question: is > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > from efi_init() should be fine. > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > add Loongson-2K support. > > > > > > > > > > > > > OK > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > well without runtime. And I think I will modify this patch these days > > > > > because in another thread you said that this series cannot boot. > > > > > > > > It works fine now. > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > runtime services only. And note that we also have efi=noruntime - does > > > > that currently work on LoongArch? > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > use "noefi" instead. > > > > OK > > > > > But I think support "efi=novamap" is not a bad > > > idea (maybe I misunderstood its usage). > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > that were only intended to run Windows. Windows calls > > SetVirtualAddressMap() *after* creating the new mappings, and some > > broken firmware drivers will access those regions too early. > > > > On Linux, we install the mapping first, and then much later, we > > actually create the mappings and only activate them on a single CPU > > while the EFI runtime service call is in progress. > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > always revisit it later. > OK, now I think there are no big problems. Only some bikesheddings: > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > boot information, it looks most similar to the old-world, and we can > distinguish old-world/new-world by judging whether a0 is greater than > 1. > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > indentation look better. > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > OK > Hope this series will be stable as soon as possible, our kexec/kdump > work needs to adjust because of this change. :) > Do you have a link to those patches?
On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > In the last patch I see: > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > + > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > + > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > deals with that. > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > from efi_init() should be fine. > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > It works fine now. > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > that currently work on LoongArch? > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > use "noefi" instead. > > > > > > OK > > > > > > > But I think support "efi=novamap" is not a bad > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > that were only intended to run Windows. Windows calls > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > broken firmware drivers will access those regions too early. > > > > > > On Linux, we install the mapping first, and then much later, we > > > actually create the mappings and only activate them on a single CPU > > > while the EFI runtime service call is in progress. > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > always revisit it later. > > OK, now I think there are no big problems. Only some bikesheddings: > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > boot information, it looks most similar to the old-world, and we can > > distinguish old-world/new-world by judging whether a0 is greater than > > 1. > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > indentation look better. > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > OK > > > Hope this series will be stable as soon as possible, our kexec/kdump > > work needs to adjust because of this change. :) > > > > Do you have a link to those patches? There is a snapshot for patches pending for 6.1: https://github.com/loongson/linux/commits/loongarch-next
On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > + > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > + > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > from efi_init() should be fine. > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > that currently work on LoongArch? > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > use "noefi" instead. > > > > > > > > OK > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > that were only intended to run Windows. Windows calls > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > broken firmware drivers will access those regions too early. > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > actually create the mappings and only activate them on a single CPU > > > > while the EFI runtime service call is in progress. > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > always revisit it later. > > > OK, now I think there are no big problems. Only some bikesheddings: > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > boot information, it looks most similar to the old-world, and we can > > > distinguish old-world/new-world by judging whether a0 is greater than > > > 1. > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > indentation look better. > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > OK > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > work needs to adjust because of this change. :) > > > > > > > Do you have a link to those patches? > There is a snapshot for patches pending for 6.1: > https://github.com/loongson/linux/commits/loongarch-next Thanks. I already spotted an issue there which is exactly the kind of thing I am trying to avoid with this series. diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c index 7423361b0ebc9b..c68c88e40cc0b9 100644 --- a/arch/loongarch/kernel/mem.c +++ b/arch/loongarch/kernel/mem.c @@ -61,4 +62,9 @@ void __init memblock_init(void) /* Reserve the initrd */ reserve_initrd_mem(); + + /* Main reserved memory for the elf core head */ + early_init_fdt_scan_reserved_mem(); + /* Parse linux,usable-memory-range for crash dump kernel */ + early_init_dt_check_for_usable_mem_range(); } So here, we are adding support for DT memory reservations, which kdump apparently needs. However, this parsing of the DT not only happens on kexec boot: all ACPI and DT kernels will now honour FDT memory reservations passed in via a DT when booting the first kernel, and external projects may use this and start to depend on this. Once something like this hits the upstream kernel, it is *very* difficult to change or remove. If you only need to pass the usable memory range for kcrash/kdump, it's probably better to use a command line option for that, instead of relying on DT memory reservations.
Hi, Ard, On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > > + > > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > > + > > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > > from efi_init() should be fine. > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > > that currently work on LoongArch? > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > > use "noefi" instead. > > > > > > > > > > OK > > > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > > that were only intended to run Windows. Windows calls > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > > broken firmware drivers will access those regions too early. > > > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > > actually create the mappings and only activate them on a single CPU > > > > > while the EFI runtime service call is in progress. > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > > always revisit it later. > > > > OK, now I think there are no big problems. Only some bikesheddings: > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > > boot information, it looks most similar to the old-world, and we can > > > > distinguish old-world/new-world by judging whether a0 is greater than > > > > 1. > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > > indentation look better. > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > > > > OK > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > > work needs to adjust because of this change. :) > > > > > > > > > > Do you have a link to those patches? > > There is a snapshot for patches pending for 6.1: > > https://github.com/loongson/linux/commits/loongarch-next > > Thanks. I already spotted an issue there which is exactly the kind of > thing I am trying to avoid with this series. > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c > index 7423361b0ebc9b..c68c88e40cc0b9 100644 > --- a/arch/loongarch/kernel/mem.c > +++ b/arch/loongarch/kernel/mem.c > @@ -61,4 +62,9 @@ void __init memblock_init(void) > > /* Reserve the initrd */ > reserve_initrd_mem(); > + > + /* Main reserved memory for the elf core head */ > + early_init_fdt_scan_reserved_mem(); > + /* Parse linux,usable-memory-range for crash dump kernel */ > + early_init_dt_check_for_usable_mem_range(); > } > > So here, we are adding support for DT memory reservations, which kdump > apparently needs. > > However, this parsing of the DT not only happens on kexec boot: all > ACPI and DT kernels will now honour FDT memory reservations passed in > via a DT when booting the first kernel, and external projects may use > this and start to depend on this. > > Once something like this hits the upstream kernel, it is *very* > difficult to change or remove. > > If you only need to pass the usable memory range for kcrash/kdump, > it's probably better to use a command line option for that, instead of > relying on DT memory reservations. Thank you for your suggestion, but we found some trouble when handle initrd in kexec. In current implementation, we generate a fdt in kexec-tools, then fill "linux,usable-memory-range", "linux,elfcorehdr" and initrd information in it. After this series, we can use "mem=xxx" "elfcorehdr=" in cmdline, but how to handle initrd information which is stored in a system table? Huacai
On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote: > > Hi, Ard, > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > > > + > > > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > > > + > > > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > > > from efi_init() should be fine. > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > > > that currently work on LoongArch? > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > > > use "noefi" instead. > > > > > > > > > > > > OK > > > > > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > > > that were only intended to run Windows. Windows calls > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > > > broken firmware drivers will access those regions too early. > > > > > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > > > actually create the mappings and only activate them on a single CPU > > > > > > while the EFI runtime service call is in progress. > > > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > > > always revisit it later. > > > > > OK, now I think there are no big problems. Only some bikesheddings: > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > > > boot information, it looks most similar to the old-world, and we can > > > > > distinguish old-world/new-world by judging whether a0 is greater than > > > > > 1. > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > > > indentation look better. > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > > > > > > > OK > > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > > > work needs to adjust because of this change. :) > > > > > > > > > > > > > Do you have a link to those patches? > > > There is a snapshot for patches pending for 6.1: > > > https://github.com/loongson/linux/commits/loongarch-next > > > > Thanks. I already spotted an issue there which is exactly the kind of > > thing I am trying to avoid with this series. > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c > > index 7423361b0ebc9b..c68c88e40cc0b9 100644 > > --- a/arch/loongarch/kernel/mem.c > > +++ b/arch/loongarch/kernel/mem.c > > @@ -61,4 +62,9 @@ void __init memblock_init(void) > > > > /* Reserve the initrd */ > > reserve_initrd_mem(); > > + > > + /* Main reserved memory for the elf core head */ > > + early_init_fdt_scan_reserved_mem(); > > + /* Parse linux,usable-memory-range for crash dump kernel */ > > + early_init_dt_check_for_usable_mem_range(); > > } > > > > So here, we are adding support for DT memory reservations, which kdump > > apparently needs. > > > > However, this parsing of the DT not only happens on kexec boot: all > > ACPI and DT kernels will now honour FDT memory reservations passed in > > via a DT when booting the first kernel, and external projects may use > > this and start to depend on this. > > > > Once something like this hits the upstream kernel, it is *very* > > difficult to change or remove. > > > > If you only need to pass the usable memory range for kcrash/kdump, > > it's probably better to use a command line option for that, instead of > > relying on DT memory reservations. > Thank you for your suggestion, but we found some trouble when handle > initrd in kexec. > In current implementation, we generate a fdt in kexec-tools, then fill > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in > cmdline, but how to handle initrd information which is stored in a > system table? > There are two options: - use initrdmem= on the command line - update the INITRD config table in memory (i.e., update the base and size to refer to the new initrd image)
On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > Hi, Ard, > > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > > > > + > > > > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > > > > + > > > > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > > > > from efi_init() should be fine. > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > > > > that currently work on LoongArch? > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > > > > use "noefi" instead. > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > > > > that were only intended to run Windows. Windows calls > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > > > > broken firmware drivers will access those regions too early. > > > > > > > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > > > > actually create the mappings and only activate them on a single CPU > > > > > > > while the EFI runtime service call is in progress. > > > > > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > > > > always revisit it later. > > > > > > OK, now I think there are no big problems. Only some bikesheddings: > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > > > > boot information, it looks most similar to the old-world, and we can > > > > > > distinguish old-world/new-world by judging whether a0 is greater than > > > > > > 1. > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > > > > indentation look better. > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > > > > > > > > > > OK > > > > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > > > > work needs to adjust because of this change. :) > > > > > > > > > > > > > > > > Do you have a link to those patches? > > > > There is a snapshot for patches pending for 6.1: > > > > https://github.com/loongson/linux/commits/loongarch-next > > > > > > Thanks. I already spotted an issue there which is exactly the kind of > > > thing I am trying to avoid with this series. > > > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644 > > > --- a/arch/loongarch/kernel/mem.c > > > +++ b/arch/loongarch/kernel/mem.c > > > @@ -61,4 +62,9 @@ void __init memblock_init(void) > > > > > > /* Reserve the initrd */ > > > reserve_initrd_mem(); > > > + > > > + /* Main reserved memory for the elf core head */ > > > + early_init_fdt_scan_reserved_mem(); > > > + /* Parse linux,usable-memory-range for crash dump kernel */ > > > + early_init_dt_check_for_usable_mem_range(); > > > } > > > > > > So here, we are adding support for DT memory reservations, which kdump > > > apparently needs. > > > > > > However, this parsing of the DT not only happens on kexec boot: all > > > ACPI and DT kernels will now honour FDT memory reservations passed in > > > via a DT when booting the first kernel, and external projects may use > > > this and start to depend on this. > > > > > > Once something like this hits the upstream kernel, it is *very* > > > difficult to change or remove. > > > > > > If you only need to pass the usable memory range for kcrash/kdump, > > > it's probably better to use a command line option for that, instead of > > > relying on DT memory reservations. > > Thank you for your suggestion, but we found some trouble when handle > > initrd in kexec. > > In current implementation, we generate a fdt in kexec-tools, then fill > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in > > cmdline, but how to handle initrd information which is stored in a > > system table? > > > > There are two options: > - use initrdmem= on the command line This is not a good way, even if the second kernel handle "initrdmem=", it will conflict with the config table. > - update the INITRD config table in memory (i.e., update the base and > size to refer to the new initrd image) This way seems practicable, but we don't know how to handle it in kexec-tools. :( Huacai
On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@kernel.org> wrote: > > On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > Hi, Ard, > > > > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > > > > > from efi_init() should be fine. > > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > > > > > that currently work on LoongArch? > > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > > > > > use "noefi" instead. > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > > > > > that were only intended to run Windows. Windows calls > > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > > > > > broken firmware drivers will access those regions too early. > > > > > > > > > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > > > > > actually create the mappings and only activate them on a single CPU > > > > > > > > while the EFI runtime service call is in progress. > > > > > > > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > > > > > always revisit it later. > > > > > > > OK, now I think there are no big problems. Only some bikesheddings: > > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > > > > > boot information, it looks most similar to the old-world, and we can > > > > > > > distinguish old-world/new-world by judging whether a0 is greater than > > > > > > > 1. > > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > > > > > indentation look better. > > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > > > > > work needs to adjust because of this change. :) > > > > > > > > > > > > > > > > > > > Do you have a link to those patches? > > > > > There is a snapshot for patches pending for 6.1: > > > > > https://github.com/loongson/linux/commits/loongarch-next > > > > > > > > Thanks. I already spotted an issue there which is exactly the kind of > > > > thing I am trying to avoid with this series. > > > > > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c > > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644 > > > > --- a/arch/loongarch/kernel/mem.c > > > > +++ b/arch/loongarch/kernel/mem.c > > > > @@ -61,4 +62,9 @@ void __init memblock_init(void) > > > > > > > > /* Reserve the initrd */ > > > > reserve_initrd_mem(); > > > > + > > > > + /* Main reserved memory for the elf core head */ > > > > + early_init_fdt_scan_reserved_mem(); > > > > + /* Parse linux,usable-memory-range for crash dump kernel */ > > > > + early_init_dt_check_for_usable_mem_range(); > > > > } > > > > > > > > So here, we are adding support for DT memory reservations, which kdump > > > > apparently needs. > > > > > > > > However, this parsing of the DT not only happens on kexec boot: all > > > > ACPI and DT kernels will now honour FDT memory reservations passed in > > > > via a DT when booting the first kernel, and external projects may use > > > > this and start to depend on this. > > > > > > > > Once something like this hits the upstream kernel, it is *very* > > > > difficult to change or remove. > > > > > > > > If you only need to pass the usable memory range for kcrash/kdump, > > > > it's probably better to use a command line option for that, instead of > > > > relying on DT memory reservations. > > > Thank you for your suggestion, but we found some trouble when handle > > > initrd in kexec. > > > In current implementation, we generate a fdt in kexec-tools, then fill > > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information > > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in > > > cmdline, but how to handle initrd information which is stored in a > > > system table? > > > > > > > There are two options: > > - use initrdmem= on the command line > This is not a good way, even if the second kernel handle "initrdmem=", > it will conflict with the config table. > It will not conflict - initrdmem= supersedes the INITRD table because early param passing happens after efi_init(). > > - update the INITRD config table in memory (i.e., update the base and > > size to refer to the new initrd image) > This way seems practicable, but we don't know how to handle it in > kexec-tools. :( > Good point. Let me think a bit about this.
On Mon, Sep 19, 2022 at 10:44 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > Hi, Ard, > > > > > > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > > > > > > from efi_init() should be fine. > > > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > > > > > > that currently work on LoongArch? > > > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > > > > > > use "noefi" instead. > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > > > > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > > > > > > that were only intended to run Windows. Windows calls > > > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > > > > > > broken firmware drivers will access those regions too early. > > > > > > > > > > > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > > > > > > actually create the mappings and only activate them on a single CPU > > > > > > > > > while the EFI runtime service call is in progress. > > > > > > > > > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > > > > > > always revisit it later. > > > > > > > > OK, now I think there are no big problems. Only some bikesheddings: > > > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > > > > > > boot information, it looks most similar to the old-world, and we can > > > > > > > > distinguish old-world/new-world by judging whether a0 is greater than > > > > > > > > 1. > > > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > > > > > > indentation look better. > > > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > > > > > > work needs to adjust because of this change. :) > > > > > > > > > > > > > > > > > > > > > > Do you have a link to those patches? > > > > > > There is a snapshot for patches pending for 6.1: > > > > > > https://github.com/loongson/linux/commits/loongarch-next > > > > > > > > > > Thanks. I already spotted an issue there which is exactly the kind of > > > > > thing I am trying to avoid with this series. > > > > > > > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c > > > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644 > > > > > --- a/arch/loongarch/kernel/mem.c > > > > > +++ b/arch/loongarch/kernel/mem.c > > > > > @@ -61,4 +62,9 @@ void __init memblock_init(void) > > > > > > > > > > /* Reserve the initrd */ > > > > > reserve_initrd_mem(); > > > > > + > > > > > + /* Main reserved memory for the elf core head */ > > > > > + early_init_fdt_scan_reserved_mem(); > > > > > + /* Parse linux,usable-memory-range for crash dump kernel */ > > > > > + early_init_dt_check_for_usable_mem_range(); > > > > > } > > > > > > > > > > So here, we are adding support for DT memory reservations, which kdump > > > > > apparently needs. > > > > > > > > > > However, this parsing of the DT not only happens on kexec boot: all > > > > > ACPI and DT kernels will now honour FDT memory reservations passed in > > > > > via a DT when booting the first kernel, and external projects may use > > > > > this and start to depend on this. > > > > > > > > > > Once something like this hits the upstream kernel, it is *very* > > > > > difficult to change or remove. > > > > > > > > > > If you only need to pass the usable memory range for kcrash/kdump, > > > > > it's probably better to use a command line option for that, instead of > > > > > relying on DT memory reservations. > > > > Thank you for your suggestion, but we found some trouble when handle > > > > initrd in kexec. > > > > In current implementation, we generate a fdt in kexec-tools, then fill > > > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information > > > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in > > > > cmdline, but how to handle initrd information which is stored in a > > > > system table? > > > > > > > > > > There are two options: > > > - use initrdmem= on the command line > > This is not a good way, even if the second kernel handle "initrdmem=", > > it will conflict with the config table. > > > > It will not conflict - initrdmem= supersedes the INITRD table because > early param passing happens after efi_init(). > > > > - update the INITRD config table in memory (i.e., update the base and > > > size to refer to the new initrd image) > > This way seems practicable, but we don't know how to handle it in > > kexec-tools. :( > > > > Good point. Let me think a bit about this. OK, then let's go back to this series itself. :) I have seen the latest code in your git repo, I don't think we need efi_disable_rt. Instead, setting/clearing EFI_BOOT according to a0 as before seems reasonable. If efi_novamap breaks something, I can accept "set efi_novamap to false" in the previous version, or just ignore its value in the efistub, but a0 should clearly be the indicator of EFI_BOOT. Huacai
On Mon, 19 Sept 2022 at 17:09, Huacai Chen <chenhuacai@kernel.org> wrote: > > On Mon, Sep 19, 2022 at 10:44 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > Hi, Ard, > > > > > > > > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > > > > > > > from efi_init() should be fine. > > > > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > > > > > > > that currently work on LoongArch? > > > > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > > > > > > > use "noefi" instead. > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > > > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > > > > > > > that were only intended to run Windows. Windows calls > > > > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > > > > > > > broken firmware drivers will access those regions too early. > > > > > > > > > > > > > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > > > > > > > actually create the mappings and only activate them on a single CPU > > > > > > > > > > while the EFI runtime service call is in progress. > > > > > > > > > > > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > > > > > > > always revisit it later. > > > > > > > > > OK, now I think there are no big problems. Only some bikesheddings: > > > > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > > > > > > > boot information, it looks most similar to the old-world, and we can > > > > > > > > > distinguish old-world/new-world by judging whether a0 is greater than > > > > > > > > > 1. > > > > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > > > > > > > indentation look better. > > > > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > > > > > > > work needs to adjust because of this change. :) > > > > > > > > > > > > > > > > > > > > > > > > > Do you have a link to those patches? > > > > > > > There is a snapshot for patches pending for 6.1: > > > > > > > https://github.com/loongson/linux/commits/loongarch-next > > > > > > > > > > > > Thanks. I already spotted an issue there which is exactly the kind of > > > > > > thing I am trying to avoid with this series. > > > > > > > > > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c > > > > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644 > > > > > > --- a/arch/loongarch/kernel/mem.c > > > > > > +++ b/arch/loongarch/kernel/mem.c > > > > > > @@ -61,4 +62,9 @@ void __init memblock_init(void) > > > > > > > > > > > > /* Reserve the initrd */ > > > > > > reserve_initrd_mem(); > > > > > > + > > > > > > + /* Main reserved memory for the elf core head */ > > > > > > + early_init_fdt_scan_reserved_mem(); > > > > > > + /* Parse linux,usable-memory-range for crash dump kernel */ > > > > > > + early_init_dt_check_for_usable_mem_range(); > > > > > > } > > > > > > > > > > > > So here, we are adding support for DT memory reservations, which kdump > > > > > > apparently needs. > > > > > > > > > > > > However, this parsing of the DT not only happens on kexec boot: all > > > > > > ACPI and DT kernels will now honour FDT memory reservations passed in > > > > > > via a DT when booting the first kernel, and external projects may use > > > > > > this and start to depend on this. > > > > > > > > > > > > Once something like this hits the upstream kernel, it is *very* > > > > > > difficult to change or remove. > > > > > > > > > > > > If you only need to pass the usable memory range for kcrash/kdump, > > > > > > it's probably better to use a command line option for that, instead of > > > > > > relying on DT memory reservations. > > > > > Thank you for your suggestion, but we found some trouble when handle > > > > > initrd in kexec. > > > > > In current implementation, we generate a fdt in kexec-tools, then fill > > > > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information > > > > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in > > > > > cmdline, but how to handle initrd information which is stored in a > > > > > system table? > > > > > > > > > > > > > There are two options: > > > > - use initrdmem= on the command line > > > This is not a good way, even if the second kernel handle "initrdmem=", > > > it will conflict with the config table. > > > > > > > It will not conflict - initrdmem= supersedes the INITRD table because > > early param passing happens after efi_init(). > > > > > > - update the INITRD config table in memory (i.e., update the base and > > > > size to refer to the new initrd image) > > > This way seems practicable, but we don't know how to handle it in > > > kexec-tools. :( > > > > > > > Good point. Let me think a bit about this. > OK, then let's go back to this series itself. :) > > I have seen the latest code in your git repo, I don't think we need > efi_disable_rt. Instead, setting/clearing EFI_BOOT according to a0 as > before seems reasonable. > No, not really. EFI_BOOT basically means 'did I boot via EFI?' and strictly, you should not be attempting to access the EFI system table or parse EFI config tables unless EFI_BOOT is true. Deviating from this makes it more difficult for generic code to reason about what parts of EFI are active on a given system. > If efi_novamap breaks something, I can accept "set efi_novamap to > false" in the previous version, or just ignore its value in the > efistub, but a0 should clearly be the indicator of EFI_BOOT. > I'm fine with keeping efi_novamap if you think it has a value. To me, it seems rather pointless, because it prevents you from using the runtime services. If you want a0 to control the EFI boot flag, you should find another way to control whether runtime services can be used, because they are really two different things.
Hi, Ard, On Mon, Sep 19, 2022 at 11:50 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 19 Sept 2022 at 17:09, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > On Mon, Sep 19, 2022 at 10:44 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Mon, 19 Sept 2022 at 16:43, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > On Mon, Sep 19, 2022 at 10:32 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Mon, 19 Sept 2022 at 16:25, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > On Mon, Sep 19, 2022 at 8:27 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 14:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 8:11 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 7:21 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 13:15, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:49 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:47, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 6:37 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 12:33, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 2:22 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 08:06, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Sep 19, 2022 at 1:15 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 19 Sept 2022 at 03:58, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, Ard, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think the parameters passed to the core kernel need to be discussed. > > > > > > > > > > > > > > > > > > > > The old way (so-called old world): > > > > > > > > > > > > > > > > > > > > a0=argc, a1=argv, a1=bpi > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The current way (so-called new world): > > > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=fdt pointer > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The new way (in this patchset): > > > > > > > > > > > > > > > > > > > > a0=efi boot flag, a1=systemtable, a2=cmdline > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I prefer to use the current way, there are 3 reasons: > > > > > > > > > > > > > > > > > > > > 1, both acpi system and dt system can use the same parameters passing method; > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DT systems will use this too. The distinction is between EFI boot and > > > > > > > > > > > > > > > > > > > non-EFI boot. We *really* should keep these separate, given the > > > > > > > > > > > > > > > > > > > experience on ARM, where other projects invent ways to pass those > > > > > > > > > > > > > > > > > > > values to the kernel without going through the stub. > > > > > > > > > > > > > > > > > > In the last patch I see: > > > > > > > > > > > > > > > > > > + void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); > > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > > + early_init_dt_scan(fdt_ptr); > > > > > > > > > > > > > > > > > > + early_init_fdt_reserve_self(); > > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > > clear_bit(EFI_BOOT, &efi.flags); > > > > > > > > > > > > > > > > > > So I suppose for the DT system that means a0=efi boot flag, a1=fdt > > > > > > > > > > > > > > > > > > pointer, a2=cmdline? Then it is not exactly the same as the ACPI > > > > > > > > > > > > > > > > > > system, but similar. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > No, for non-EFI DT boot, the command line is passed via the DT, so > > > > > > > > > > > > > > > > > a0=0x0 (non-efi), a1=DT, a2=0x0 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Do you intend to support non-EFI DT boot by the way? > > > > > > > > > > > > > > > > I think we needn't support non-EFI DT boot, so a0=efi boot flag, > > > > > > > > > > > > > > > > a1=systemtable, a2=cmdline is just OK (or maybe we can exchange a1/a2, > > > > > > > > > > > > > > > > which looks similar to the old-world). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Excellent. If non-EFI boot is not supported, we can drop the code that > > > > > > > > > > > > > > > deals with that. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > But I have another question: is > > > > > > > > > > > > > > > > it early enough to get DT from systemtable for DT boot (in the current > > > > > > > > > > > > > > > > way DT is the earliest thing)? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If you rely on DT only for the hardware description, then loading it > > > > > > > > > > > > > > > from efi_init() should be fine. > > > > > > > > > > > > > > OK, then please drop patch #12 at this time. It can be added when we > > > > > > > > > > > > > > add Loongson-2K support. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't think it makes sense to allow efi=novamap on LoongArch, given > > > > > > > > > > > > > > > > > that we cannot make use of the runtime services that way. So in my > > > > > > > > > > > > > > > > > code, efi_novamap is set to false unconditionally. > > > > > > > > > > > > > > > > Emm, I prefer to support "efi=novamap", the core kernel has already > > > > > > > > > > > > > > > > supported "noefi" to disable runtime, I don't want to hack > > > > > > > > > > > > > > > > efi_novamap. So please keep the efi boot flag, thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fair enough. Do you have any uses for efi_novamap in mind? > > > > > > > > > > > > > > I usually use "efi=novamap" in EFI shell to see whether it can work > > > > > > > > > > > > > > well without runtime. And I think I will modify this patch these days > > > > > > > > > > > > > > because in another thread you said that this series cannot boot. > > > > > > > > > > > > > > > > > > > > > > > > > > It works fine now. > > > > > > > > > > > > > > > > > > > > > > > > > > However, clearing the EFI_BOOT flag is not the correct way to disable > > > > > > > > > > > > > runtime services only. And note that we also have efi=noruntime - does > > > > > > > > > > > > > that currently work on LoongArch? > > > > > > > > > > > > OK, but we don't need to add a new "efi=noruntime" parameter, I can > > > > > > > > > > > > use "noefi" instead. > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > > > > > But I think support "efi=novamap" is not a bad > > > > > > > > > > > > idea (maybe I misunderstood its usage). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It basically exists to deal with broken EFI firmware on ARM laptops > > > > > > > > > > > that were only intended to run Windows. Windows calls > > > > > > > > > > > SetVirtualAddressMap() *after* creating the new mappings, and some > > > > > > > > > > > broken firmware drivers will access those regions too early. > > > > > > > > > > > > > > > > > > > > > > On Linux, we install the mapping first, and then much later, we > > > > > > > > > > > actually create the mappings and only activate them on a single CPU > > > > > > > > > > > while the EFI runtime service call is in progress. > > > > > > > > > > > > > > > > > > > > > > In any case, I will reinstate the efi_novamap logic for now - we can > > > > > > > > > > > always revisit it later. > > > > > > > > > > OK, now I think there are no big problems. Only some bikesheddings: > > > > > > > > > > 1, Please use "a0=efi boot flag, a1=cmdline a2=systemtable" to pass > > > > > > > > > > boot information, it looks most similar to the old-world, and we can > > > > > > > > > > distinguish old-world/new-world by judging whether a0 is greater than > > > > > > > > > > 1. > > > > > > > > > > 2, I prefer "early return" in efi_init, i.e., if (boot_memmap == > > > > > > > > > > EFI_INVALID_TABLE_ADDR) then return immediately, this makes > > > > > > > > > > indentation look better. > > > > > > > > > > 3, Declare "cmdline" out of the if() condition in init_environ() looks better. > > > > > > > > > > 4, I prefer "kernel_addr" rather than "image_addr" in efi_boot_kernel(). > > > > > > > > > > 5, It seems that one line is enough for the last statement in efi_boot_kernel(). > > > > > > > > > > > > > > > > > > > > > > > > > > > > OK > > > > > > > > > > > > > > > > > > > Hope this series will be stable as soon as possible, our kexec/kdump > > > > > > > > > > work needs to adjust because of this change. :) > > > > > > > > > > > > > > > > > > > > > > > > > > > > Do you have a link to those patches? > > > > > > > > There is a snapshot for patches pending for 6.1: > > > > > > > > https://github.com/loongson/linux/commits/loongarch-next > > > > > > > > > > > > > > Thanks. I already spotted an issue there which is exactly the kind of > > > > > > > thing I am trying to avoid with this series. > > > > > > > > > > > > > > diff --git a/arch/loongarch/kernel/mem.c b/arch/loongarch/kernel/mem.c > > > > > > > index 7423361b0ebc9b..c68c88e40cc0b9 100644 > > > > > > > --- a/arch/loongarch/kernel/mem.c > > > > > > > +++ b/arch/loongarch/kernel/mem.c > > > > > > > @@ -61,4 +62,9 @@ void __init memblock_init(void) > > > > > > > > > > > > > > /* Reserve the initrd */ > > > > > > > reserve_initrd_mem(); > > > > > > > + > > > > > > > + /* Main reserved memory for the elf core head */ > > > > > > > + early_init_fdt_scan_reserved_mem(); > > > > > > > + /* Parse linux,usable-memory-range for crash dump kernel */ > > > > > > > + early_init_dt_check_for_usable_mem_range(); > > > > > > > } > > > > > > > > > > > > > > So here, we are adding support for DT memory reservations, which kdump > > > > > > > apparently needs. > > > > > > > > > > > > > > However, this parsing of the DT not only happens on kexec boot: all > > > > > > > ACPI and DT kernels will now honour FDT memory reservations passed in > > > > > > > via a DT when booting the first kernel, and external projects may use > > > > > > > this and start to depend on this. > > > > > > > > > > > > > > Once something like this hits the upstream kernel, it is *very* > > > > > > > difficult to change or remove. > > > > > > > > > > > > > > If you only need to pass the usable memory range for kcrash/kdump, > > > > > > > it's probably better to use a command line option for that, instead of > > > > > > > relying on DT memory reservations. > > > > > > Thank you for your suggestion, but we found some trouble when handle > > > > > > initrd in kexec. > > > > > > In current implementation, we generate a fdt in kexec-tools, then fill > > > > > > "linux,usable-memory-range", "linux,elfcorehdr" and initrd information > > > > > > in it. After this series, we can use "mem=xxx" "elfcorehdr=" in > > > > > > cmdline, but how to handle initrd information which is stored in a > > > > > > system table? > > > > > > > > > > > > > > > > There are two options: > > > > > - use initrdmem= on the command line > > > > This is not a good way, even if the second kernel handle "initrdmem=", > > > > it will conflict with the config table. > > > > > > > > > > It will not conflict - initrdmem= supersedes the INITRD table because > > > early param passing happens after efi_init(). > > > > > > > > - update the INITRD config table in memory (i.e., update the base and > > > > > size to refer to the new initrd image) > > > > This way seems practicable, but we don't know how to handle it in > > > > kexec-tools. :( > > > > > > > > > > Good point. Let me think a bit about this. > > OK, then let's go back to this series itself. :) > > > > I have seen the latest code in your git repo, I don't think we need > > efi_disable_rt. Instead, setting/clearing EFI_BOOT according to a0 as > > before seems reasonable. > > > > No, not really. EFI_BOOT basically means 'did I boot via EFI?' and > strictly, you should not be attempting to access the EFI system table > or parse EFI config tables unless EFI_BOOT is true. > > Deviating from this makes it more difficult for generic code to reason > about what parts of EFI are active on a given system. > > > If efi_novamap breaks something, I can accept "set efi_novamap to > > false" in the previous version, or just ignore its value in the > > efistub, but a0 should clearly be the indicator of EFI_BOOT. > > > > I'm fine with keeping efi_novamap if you think it has a value. To me, > it seems rather pointless, because it prevents you from using the > runtime services. > > If you want a0 to control the EFI boot flag, you should find another > way to control whether runtime services can be used, because they are > really two different things. I'm very sorry, after an offline discussion with my colleagues, non-EFI DT boot is still needed (very sadly, we want to drop non-EFI firmware but we can't do that). However, for non-EFI DT boot we will use the same parameter passing method (a0=efi boot flag, a1=cmdline, a2=systemtable), firmware will generate a simple systemtable only for DT. In this way all boot methods share the same logic, and also make kexec easy to implement. So, let's make a0 the real "efi boot flag" and let it control EFI_BOOT, for efistub, we can just pass "true" unconditionally (whether support efi_novamap is not as important as the efi boot flag for us, as you said, efi_novamap is just for broken firmware). Huacai
On Tue, 20 Sept 2022 at 03:45, Huacai Chen <chenhuacai@kernel.org> wrote: > > Hi, Ard, > ... > I'm very sorry, after an offline discussion with my colleagues, > non-EFI DT boot is still needed (very sadly, we want to drop non-EFI > firmware but we can't do that). However, for non-EFI DT boot we will > use the same parameter passing method (a0=efi boot flag, a1=cmdline, > a2=systemtable), firmware will generate a simple systemtable only for > DT. In this way all boot methods share the same logic, and also make > kexec easy to implement. > OK, that should work. So I suppose you create a EFI system table along with EFI configuration tables for DT, SMBIOS, etc? In this case, I suggest you omit the MEMMAP config table that I am adding here, so that there is no ambiguity between the EFI provided memory map and the one provided by DT. I think that should be a clean way to implement this. > So, let's make a0 the real "efi boot flag" and let it control > EFI_BOOT, for efistub, we can just pass "true" unconditionally > (whether support efi_novamap is not as important as the efi boot flag > for us, as you said, efi_novamap is just for broken firmware). Indeed. So as before, I will just set efi_novamap to false. You can still use noefi or efi=noruntime to turn off the EFI runtime pieces if needed (e.g., PREEMPT_RT tends to disable those by default to preserve their bounded worst case latency)
Hi, Ard, On Tue, Sep 20, 2022 at 4:04 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Tue, 20 Sept 2022 at 03:45, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > Hi, Ard, > > > ... > > I'm very sorry, after an offline discussion with my colleagues, > > non-EFI DT boot is still needed (very sadly, we want to drop non-EFI > > firmware but we can't do that). However, for non-EFI DT boot we will > > use the same parameter passing method (a0=efi boot flag, a1=cmdline, > > a2=systemtable), firmware will generate a simple systemtable only for > > DT. In this way all boot methods share the same logic, and also make > > kexec easy to implement. > > > > OK, that should work. So I suppose you create a EFI system table along > with EFI configuration tables for DT, SMBIOS, etc? In this case, I > suggest you omit the MEMMAP config table that I am adding here, so > that there is no ambiguity between the EFI provided memory map and the > one provided by DT. > > I think that should be a clean way to implement this. OK, thanks. I have merged efi-cleanups-for-v6.1-v2 to https://github.com/loongson/linux/commits/loongarch-next. It seems everything work well except kexec. Huacai > > > So, let's make a0 the real "efi boot flag" and let it control > > EFI_BOOT, for efistub, we can just pass "true" unconditionally > > (whether support efi_novamap is not as important as the efi boot flag > > for us, as you said, efi_novamap is just for broken firmware). > > Indeed. So as before, I will just set efi_novamap to false. You can > still use noefi or efi=noruntime to turn off the EFI runtime pieces if > needed (e.g., PREEMPT_RT tends to disable those by default to preserve > their bounded worst case latency)
On Tue, 20 Sept 2022 at 15:12, Huacai Chen <chenhuacai@kernel.org> wrote: > > Hi, Ard, > > On Tue, Sep 20, 2022 at 4:04 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Tue, 20 Sept 2022 at 03:45, Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > Hi, Ard, > > > > > ... > > > I'm very sorry, after an offline discussion with my colleagues, > > > non-EFI DT boot is still needed (very sadly, we want to drop non-EFI > > > firmware but we can't do that). However, for non-EFI DT boot we will > > > use the same parameter passing method (a0=efi boot flag, a1=cmdline, > > > a2=systemtable), firmware will generate a simple systemtable only for > > > DT. In this way all boot methods share the same logic, and also make > > > kexec easy to implement. > > > > > > > OK, that should work. So I suppose you create a EFI system table along > > with EFI configuration tables for DT, SMBIOS, etc? In this case, I > > suggest you omit the MEMMAP config table that I am adding here, so > > that there is no ambiguity between the EFI provided memory map and the > > one provided by DT. > > > > I think that should be a clean way to implement this. > OK, thanks. > > I have merged efi-cleanups-for-v6.1-v2 to > https://github.com/loongson/linux/commits/loongarch-next. It seems > everything work well except kexec. > OK thanks for testing. I will send out a v2 today and merge the changes into efi/next tomorrow.
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig index fca106a8b8af..14a2a1ec8561 100644 --- a/arch/loongarch/Kconfig +++ b/arch/loongarch/Kconfig @@ -104,8 +104,6 @@ config LOONGARCH select MODULES_USE_ELF_RELA if MODULES select NEED_PER_CPU_EMBED_FIRST_CHUNK select NEED_PER_CPU_PAGE_FIRST_CHUNK - select OF - select OF_EARLY_FLATTREE select PCI select PCI_DOMAINS_GENERIC select PCI_ECAM if ACPI @@ -311,7 +309,6 @@ config DMI config EFI bool "EFI runtime service support" select UCS2_STRING - select EFI_PARAMS_FROM_FDT select EFI_RUNTIME_WRAPPERS help This enables the kernel to use EFI runtime services that are diff --git a/arch/loongarch/include/asm/bootinfo.h b/arch/loongarch/include/asm/bootinfo.h index e02ac4af7f6e..8e5881bc5ad1 100644 --- a/arch/loongarch/include/asm/bootinfo.h +++ b/arch/loongarch/include/asm/bootinfo.h @@ -36,7 +36,7 @@ struct loongson_system_configuration { }; extern u64 efi_system_table; -extern unsigned long fw_arg0, fw_arg1; +extern unsigned long fw_arg0, fw_arg1, fw_arg2; extern struct loongson_board_info b_info; extern struct loongson_system_configuration loongson_sysconf; diff --git a/arch/loongarch/kernel/efi.c b/arch/loongarch/kernel/efi.c index 1f1f755fb425..3b80675726ec 100644 --- a/arch/loongarch/kernel/efi.c +++ b/arch/loongarch/kernel/efi.c @@ -27,8 +27,13 @@ static unsigned long efi_nr_tables; static unsigned long efi_config_table; +static unsigned long __initdata boot_memmap = EFI_INVALID_TABLE_ADDR; + static efi_system_table_t *efi_systab; -static efi_config_table_type_t arch_tables[] __initdata = {{},}; +static efi_config_table_type_t arch_tables[] __initdata = { + {LINUX_EFI_BOOT_MEMMAP_GUID, &boot_memmap, "MEMMAP" }, + {}, +}; void __init efi_runtime_init(void) { @@ -61,6 +66,8 @@ void __init efi_init(void) return; } + efi_systab_report_header(&efi_systab->hdr, efi_systab->fw_vendor); + set_bit(EFI_64BIT, &efi.flags); efi_nr_tables = efi_systab->nr_tables; efi_config_table = (unsigned long)efi_systab->tables; @@ -72,4 +79,25 @@ void __init efi_init(void) if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI) memblock_reserve(screen_info.lfb_base, screen_info.lfb_size); + + if (boot_memmap != EFI_INVALID_TABLE_ADDR) { + struct efi_memory_map_data data; + struct efi_boot_memmap *tbl; + + tbl = early_memremap_ro(boot_memmap, sizeof(*tbl)); + if (tbl) { + data.phys_map = boot_memmap + sizeof(*tbl); + data.size = tbl->map_size; + data.desc_size = tbl->desc_size; + data.desc_version = tbl->desc_ver; + + if (efi_memmap_init_early(&data) < 0) + panic("Unable to map EFI memory map.\n"); + + memblock_reserve(data.phys_map & PAGE_MASK, + PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); + + early_memunmap(tbl, sizeof(*tbl)); + } + } } diff --git a/arch/loongarch/kernel/env.c b/arch/loongarch/kernel/env.c index 82b478a5c665..05c38d28476e 100644 --- a/arch/loongarch/kernel/env.c +++ b/arch/loongarch/kernel/env.c @@ -8,7 +8,6 @@ #include <linux/efi.h> #include <linux/export.h> #include <linux/memblock.h> -#include <linux/of_fdt.h> #include <asm/early_ioremap.h> #include <asm/bootinfo.h> #include <asm/loongson.h> @@ -20,21 +19,18 @@ EXPORT_SYMBOL(loongson_sysconf); void __init init_environ(void) { int efi_boot = fw_arg0; - struct efi_memory_map_data data; - void *fdt_ptr = early_memremap_ro(fw_arg1, SZ_64K); - if (efi_boot) - set_bit(EFI_BOOT, &efi.flags); - else - clear_bit(EFI_BOOT, &efi.flags); + if (efi_boot) { + char *cmdline = early_memremap_ro(fw_arg2, COMMAND_LINE_SIZE); - early_init_dt_scan(fdt_ptr); - early_init_fdt_reserve_self(); - efi_system_table = efi_get_fdt_params(&data); + strscpy(boot_command_line, cmdline, COMMAND_LINE_SIZE); + early_memunmap(cmdline, COMMAND_LINE_SIZE); - efi_memmap_init_early(&data); - memblock_reserve(data.phys_map & PAGE_MASK, - PAGE_ALIGN(data.size + (data.phys_map & ~PAGE_MASK))); + efi_system_table = fw_arg1; + set_bit(EFI_BOOT, &efi.flags); + } else { + clear_bit(EFI_BOOT, &efi.flags); + } } static int __init init_cpu_fullname(void) diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S index 01bac62a6442..8f89f39fd31b 100644 --- a/arch/loongarch/kernel/head.S +++ b/arch/loongarch/kernel/head.S @@ -67,6 +67,8 @@ SYM_CODE_START(kernel_entry) # kernel entry point st.d a0, t0, 0 # firmware arguments la t0, fw_arg1 st.d a1, t0, 0 + la t0, fw_arg2 + st.d a2, t0, 0 /* KSave3 used for percpu base, initialized as 0 */ csrwr zero, PERCPU_BASE_KS diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c index e8714b1d94c8..7fabf2306e80 100644 --- a/arch/loongarch/kernel/setup.c +++ b/arch/loongarch/kernel/setup.c @@ -51,7 +51,7 @@ struct screen_info screen_info __section(".data"); -unsigned long fw_arg0, fw_arg1; +unsigned long fw_arg0, fw_arg1, fw_arg2; DEFINE_PER_CPU(unsigned long, kernelsp); struct cpuinfo_loongarch cpu_data[NR_CPUS] __read_mostly; @@ -187,7 +187,6 @@ early_param("mem", early_parse_mem); void __init platform_init(void) { - efi_init(); #ifdef CONFIG_ACPI_TABLE_UPGRADE acpi_table_upgrade(); #endif @@ -347,6 +346,7 @@ void __init setup_arch(char **cmdline_p) *cmdline_p = boot_command_line; init_environ(); + efi_init(); memblock_init(); parse_early_param(); diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile index 0dbc6d93f2e6..d8d6657e6277 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -29,7 +29,7 @@ cflags-$(CONFIG_RISCV) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ cflags-$(CONFIG_LOONGARCH) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \ -fpie -cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt +cflags-$(CONFIG_EFI_PARAMS_FROM_FDT) += -I$(srctree)/scripts/dtc/libfdt KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \ -include $(srctree)/include/linux/hidden.h \ @@ -60,14 +60,17 @@ lib-y := efi-stub-helper.o gop.o secureboot.o tpm.o \ alignedmem.o relocate.o vsprintf.o \ systable.o -# include the stub's generic dependencies from lib/ when building for ARM/arm64 -efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c +# include the stub's libfdt dependencies from lib/ when needed +libfdt-deps := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c \ + fdt_empty_tree.c fdt_sw.c + +lib-$(CONFIG_EFI_PARAMS_FROM_FDT) += fdt.o \ + $(patsubst %.c,lib-%.o,$(libfdt-deps)) $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE $(call if_changed_rule,cc_o_c) -lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o intrinsics.o \ - $(patsubst %.c,lib-%.o,$(efi-deps-y)) +lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o string.o intrinsics.o lib-$(CONFIG_ARM) += arm32-stub.o lib-$(CONFIG_ARM64) += arm64-stub.o diff --git a/drivers/firmware/efi/libstub/loongarch-stub.c b/drivers/firmware/efi/libstub/loongarch-stub.c index b7ef8d2df59e..7c684d10f936 100644 --- a/drivers/firmware/efi/libstub/loongarch-stub.c +++ b/drivers/firmware/efi/libstub/loongarch-stub.c @@ -9,7 +9,8 @@ #include <asm/addrspace.h> #include "efistub.h" -typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long fdt); +typedef void __noreturn (*kernel_entry_t)(bool efi, unsigned long systab, + unsigned long cmdline); extern int kernel_asize; extern int kernel_fsize; @@ -42,19 +43,60 @@ efi_status_t handle_kernel_image(unsigned long *image_addr, return status; } -void __noreturn efi_enter_kernel(unsigned long entrypoint, unsigned long fdt, unsigned long fdt_size) +struct exit_boot_struct { + efi_memory_desc_t *runtime_map; + int runtime_entry_count; +}; + +static efi_status_t exit_boot_func(struct efi_boot_memmap *map, void *priv) +{ + struct exit_boot_struct *p = priv; + + /* + * Update the memory map with virtual addresses. The function will also + * populate @runtime_map with copies of just the EFI_MEMORY_RUNTIME + * entries so that we can pass it straight to SetVirtualAddressMap() + */ + efi_get_virtmap(map->map, map->map_size, map->desc_size, + p->runtime_map, &p->runtime_entry_count); + + return EFI_SUCCESS; +} + +efi_status_t efi_boot_kernel(void *handle, efi_loaded_image_t *image, + unsigned long image_addr, char *cmdline_ptr) { kernel_entry_t real_kernel_entry; + struct exit_boot_struct priv; + unsigned long desc_size; + efi_status_t status; + u32 desc_ver; + + status = efi_alloc_virtmap(&priv.runtime_map, &desc_size, &desc_ver); + if (status != EFI_SUCCESS) { + efi_err("Unable to retrieve UEFI memory map.\n"); + return status; + } + + efi_info("Exiting boot services\n"); + + efi_novamap = false; + status = efi_exit_boot_services(handle, &priv, exit_boot_func); + if (status != EFI_SUCCESS) + return status; + + /* Install the new virtual address map */ + efi_rt_call(set_virtual_address_map, + priv.runtime_entry_count * desc_size, desc_size, + desc_ver, priv.runtime_map); /* Config Direct Mapping */ csr_write64(CSR_DMW0_INIT, LOONGARCH_CSR_DMWIN0); csr_write64(CSR_DMW1_INIT, LOONGARCH_CSR_DMWIN1); real_kernel_entry = (kernel_entry_t) - ((unsigned long)&kernel_entry - entrypoint + VMLINUX_LOAD_ADDRESS); + ((unsigned long)&kernel_entry - image_addr + VMLINUX_LOAD_ADDRESS); - if (!efi_novamap) - real_kernel_entry(true, fdt); - else - real_kernel_entry(false, fdt); + real_kernel_entry(true, (unsigned long)efi_system_table, + (unsigned long)cmdline_ptr); }
LoongArch does not use FDT or DT natively [yet], and the only reason it currently uses it is so that it can reuse the existing EFI stub code. Overloading the DT with data passed between the EFI stub and the core kernel has been a source of problems: there is the overlap between information provided by EFI which DT can also provide (initrd base/size, command line, memory descriptions), requiring us to reason about which is which and what to prioritize. It has also resulted in ABI leaks, i.e., internal ABI being promoted to external ABI inadvertently because the bootloader can set the EFI stub's DT properties as well (e.g., "kaslr-seed"). This has become especially problematic with boot environments that want to pretend that EFI boot is being done (to access ACPI and SMBIOS tables, for instance) but have no ability to execute the EFI stub, and so the environment that the EFI stub creates is emulated [poorly, in some cases]. Another downside of treating DT like this is that the DT binary that the kernel receives is different from the one created by the firmware, which is undesirable in the context of secure and measured boot. Given that LoongArch support in Linux is brand new, we can avoid these pitfalls, and treat the DT strictly as a hardware description, and use a separate handover method between the EFI stub and the kernel. Now that initrd loading and passing the EFI memory map have been refactored into pure EFI routines that use EFI configuration tables, the only thing we need to pass directly is the kernel command line (even if we could pass this via a config table as well, it is used extremely early, so passing it directly is preferred in this case.) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> --- arch/loongarch/Kconfig | 3 -- arch/loongarch/include/asm/bootinfo.h | 2 +- arch/loongarch/kernel/efi.c | 30 ++++++++++- arch/loongarch/kernel/env.c | 22 ++++---- arch/loongarch/kernel/head.S | 2 + arch/loongarch/kernel/setup.c | 4 +- drivers/firmware/efi/libstub/Makefile | 13 +++-- drivers/firmware/efi/libstub/loongarch-stub.c | 56 +++++++++++++++++--- 8 files changed, 100 insertions(+), 32 deletions(-)