Message ID | 1486635389-26974-1-git-send-email-ard.biesheuvel@linaro.org |
---|---|
State | New |
Headers | show |
On 9 February 2017 at 10:16, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On arm64, we have made some changes over the past year to the way the > kernel itself is allocated and to how it deals with the initrd and FDT. > This patch brings the allocation logic in the EFI stub in line with that, > which is necessary because the introduction of KASLR has created the > possibility for the initrd to be allocated in a place where the kernel > may not be able to map it. (This is currently a theoretical scenario, > since it only affects systems where the size of RAM exceeds the size of > the linear mapping.) > Actually, this affects systems where the RAM /footprint/ exceeds the size of the linear region, and it isn't that theoretical given that some Freescale systems were already affected by this in some way. But we switched to 48-bit VAs by default, and in that configuration, this issue is unlikely to occur at present. > So adhere to the arm64 boot protocol, and make sure that the initrd is > fully inside a 1GB aligned 32 GB window that covers the kernel as well. > > The FDT may be anywhere in memory on arm64 now that we map it via the > fixmap, so we can lift the address restriction there completely. > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > arch/arm/include/asm/efi.h | 14 +++++++++++++- > arch/arm64/include/asm/efi.h | 19 ++++++++++++++++++- > drivers/firmware/efi/libstub/arm-stub.c | 7 ++++--- > 3 files changed, 35 insertions(+), 5 deletions(-) > > diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h > index 0b06f5341b45..62620451f60b 100644 > --- a/arch/arm/include/asm/efi.h > +++ b/arch/arm/include/asm/efi.h > @@ -84,6 +84,18 @@ static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt) > */ > #define ZIMAGE_OFFSET_LIMIT SZ_128M > #define MIN_ZIMAGE_OFFSET MAX_UNCOMP_KERNEL_SIZE > -#define MAX_FDT_OFFSET ZIMAGE_OFFSET_LIMIT > + > +/* on ARM, the FDT should be located in the first 128 MB of RAM */ > +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) > +{ > + return dram_base + ZIMAGE_OFFSET_LIMIT; > +} > + > +/* on ARM, the initrd should be loaded in a lowmem region */ > +static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base, > + unsigned long image_addr) > +{ > + return dram_base + SZ_512M; > +} > > #endif /* _ASM_ARM_EFI_H */ > diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h > index 0b6b1633017f..6a6c8a0d1424 100644 > --- a/arch/arm64/include/asm/efi.h > +++ b/arch/arm64/include/asm/efi.h > @@ -46,7 +46,24 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md); > * 2MiB so we know it won't cross a 2MiB boundary. > */ > #define EFI_FDT_ALIGN SZ_2M /* used by allocate_new_fdt_and_exit_boot() */ > -#define MAX_FDT_OFFSET SZ_512M > + > +/* on arm64, the FDT may be located anywhere in system RAM */ > +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) > +{ > + return ULONG_MAX; > +} > + > +/* > + * On arm64, the initrd must be completely inside a 1 GB aligned 32 GB window > + * that covers Image as well. Since we allocate from the top down, set a max > + * address that is virtually guaranteed to produce a suitable allocation even > + * when the physical address of Image is randomized. > + */ > +static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base, > + unsigned long image_addr) > +{ > + return ALIGN(image_addr, SZ_1G) + 31UL * SZ_1G; > +} > > #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__) > #define __efi_call_early(f, ...) f(__VA_ARGS__) > diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c > index b4f7d78f9e8b..557281fe375f 100644 > --- a/drivers/firmware/efi/libstub/arm-stub.c > +++ b/drivers/firmware/efi/libstub/arm-stub.c > @@ -333,8 +333,9 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table, > if (!fdt_addr) > pr_efi(sys_table, "Generating empty DTB\n"); > > - status = handle_cmdline_files(sys_table, image, cmdline_ptr, > - "initrd=", dram_base + SZ_512M, > + status = handle_cmdline_files(sys_table, image, cmdline_ptr, "initrd=", > + efi_get_max_initrd_addr(dram_base, > + *image_addr), > (unsigned long *)&initrd_addr, > (unsigned long *)&initrd_size); > if (status != EFI_SUCCESS) > @@ -344,7 +345,7 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table, > > new_fdt_addr = fdt_addr; > status = allocate_new_fdt_and_exit_boot(sys_table, handle, > - &new_fdt_addr, dram_base + MAX_FDT_OFFSET, > + &new_fdt_addr, efi_get_max_fdt_addr(dram_base), > initrd_addr, initrd_size, cmdline_ptr, > fdt_addr, fdt_size); > > -- > 2.7.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: > On arm64, we have made some changes over the past year to the way the > kernel itself is allocated and to how it deals with the initrd and FDT. > This patch brings the allocation logic in the EFI stub in line with that, > which is necessary because the introduction of KASLR has created the > possibility for the initrd to be allocated in a place where the kernel > may not be able to map it. (This is currently a theoretical scenario, > since it only affects systems where the size of RAM exceeds the size of > the linear mapping.) > > So adhere to the arm64 boot protocol, and make sure that the initrd is > fully inside a 1GB aligned 32 GB window that covers the kernel as well. > > The FDT may be anywhere in memory on arm64 now that we map it via the > fixmap, so we can lift the address restriction there completely. > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- I'll give this a test on our platform that was running into the current limitation - probably this weekend. I reviewed the code and its ok, but I do have one question. Do we need to handle the case where initrd ends up below the kernel? Lets assume KALSR puts the kernel somewhere up high in DDR, after the 32GB mark in DDR. Now lets assume the unlikely scenario that the initrd won't fit anywhere after 32GB, but will fit before 32GB. Per my understanding of efi_high_alloc, it will put the initrd before the 32GB mark, which will be outside of the window where the kernel is. Seems like there are 3 options - 1. We consider the scenario unlikely to the point that we don't care, and don't do anything 2. We check to see if initrd is allocated to be within the window, and if not print an error message 3. Change efi_high_alloc to take a min addess as well as a max, so that it will fail if the initrd can't fit in the window (which will result in an error message printed) Thoughts? > arch/arm/include/asm/efi.h | 14 +++++++++++++- > arch/arm64/include/asm/efi.h | 19 ++++++++++++++++++- > drivers/firmware/efi/libstub/arm-stub.c | 7 ++++--- > 3 files changed, 35 insertions(+), 5 deletions(-) > > diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h > index 0b06f5341b45..62620451f60b 100644 > --- a/arch/arm/include/asm/efi.h > +++ b/arch/arm/include/asm/efi.h > @@ -84,6 +84,18 @@ static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt) > */ > #define ZIMAGE_OFFSET_LIMIT SZ_128M > #define MIN_ZIMAGE_OFFSET MAX_UNCOMP_KERNEL_SIZE > -#define MAX_FDT_OFFSET ZIMAGE_OFFSET_LIMIT > + > +/* on ARM, the FDT should be located in the first 128 MB of RAM */ > +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) > +{ > + return dram_base + ZIMAGE_OFFSET_LIMIT; > +} > + > +/* on ARM, the initrd should be loaded in a lowmem region */ > +static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base, > + unsigned long image_addr) > +{ > + return dram_base + SZ_512M; > +} > > #endif /* _ASM_ARM_EFI_H */ > diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h > index 0b6b1633017f..6a6c8a0d1424 100644 > --- a/arch/arm64/include/asm/efi.h > +++ b/arch/arm64/include/asm/efi.h > @@ -46,7 +46,24 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md); > * 2MiB so we know it won't cross a 2MiB boundary. > */ > #define EFI_FDT_ALIGN SZ_2M /* used by allocate_new_fdt_and_exit_boot() */ > -#define MAX_FDT_OFFSET SZ_512M > + > +/* on arm64, the FDT may be located anywhere in system RAM */ > +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) > +{ > + return ULONG_MAX; > +} > + > +/* > + * On arm64, the initrd must be completely inside a 1 GB aligned 32 GB window > + * that covers Image as well. Since we allocate from the top down, set a max > + * address that is virtually guaranteed to produce a suitable allocation even > + * when the physical address of Image is randomized. > + */ > +static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base, > + unsigned long image_addr) > +{ > + return ALIGN(image_addr, SZ_1G) + 31UL * SZ_1G; > +} > > #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__) > #define __efi_call_early(f, ...) f(__VA_ARGS__) > diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c > index b4f7d78f9e8b..557281fe375f 100644 > --- a/drivers/firmware/efi/libstub/arm-stub.c > +++ b/drivers/firmware/efi/libstub/arm-stub.c > @@ -333,8 +333,9 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table, > if (!fdt_addr) > pr_efi(sys_table, "Generating empty DTB\n"); > > - status = handle_cmdline_files(sys_table, image, cmdline_ptr, > - "initrd=", dram_base + SZ_512M, > + status = handle_cmdline_files(sys_table, image, cmdline_ptr, "initrd=", > + efi_get_max_initrd_addr(dram_base, > + *image_addr), > (unsigned long *)&initrd_addr, > (unsigned long *)&initrd_size); > if (status != EFI_SUCCESS) > @@ -344,7 +345,7 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table, > > new_fdt_addr = fdt_addr; > status = allocate_new_fdt_and_exit_boot(sys_table, handle, > - &new_fdt_addr, dram_base + MAX_FDT_OFFSET, > + &new_fdt_addr, efi_get_max_fdt_addr(dram_base), > initrd_addr, initrd_size, cmdline_ptr, > fdt_addr, fdt_size); > > -- Jeffrey Hugo Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9 February 2017 at 17:06, Jeffrey Hugo <jhugo@codeaurora.org> wrote: > On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: >> >> On arm64, we have made some changes over the past year to the way the >> kernel itself is allocated and to how it deals with the initrd and FDT. >> This patch brings the allocation logic in the EFI stub in line with that, >> which is necessary because the introduction of KASLR has created the >> possibility for the initrd to be allocated in a place where the kernel >> may not be able to map it. (This is currently a theoretical scenario, >> since it only affects systems where the size of RAM exceeds the size of >> the linear mapping.) >> >> So adhere to the arm64 boot protocol, and make sure that the initrd is >> fully inside a 1GB aligned 32 GB window that covers the kernel as well. >> >> The FDT may be anywhere in memory on arm64 now that we map it via the >> fixmap, so we can lift the address restriction there completely. >> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >> --- > > > I'll give this a test on our platform that was running into the current > limitation - probably this weekend. > > I reviewed the code and its ok, but I do have one question. Do we need to > handle the case where initrd ends up below the kernel? > > Lets assume KALSR puts the kernel somewhere up high in DDR, after the 32GB > mark in DDR. Now lets assume the unlikely scenario that the initrd won't > fit anywhere after 32GB, but will fit before 32GB. Per my understanding of > efi_high_alloc, it will put the initrd before the 32GB mark, which will be > outside of the window where the kernel is. > The 32 GB does not have to be 32 GB aligned, only 1 GB aligned. So as long as the follow expression holds, we should be fine align(max(kernel_end, initrd_end), 1g) - align_down (min (kernel_start, initrd_start), 1g) <= 32g > Seems like there are 3 options - > 1. We consider the scenario unlikely to the point that we don't care, and > don't do anything > 2. We check to see if initrd is allocated to be within the window, and if > not print an error message > 3. Change efi_high_alloc to take a min addess as well as a max, so that it > will fail if the initrd can't fit in the window (which will result in an > error message printed) > > Thoughts? > > >> arch/arm/include/asm/efi.h | 14 +++++++++++++- >> arch/arm64/include/asm/efi.h | 19 ++++++++++++++++++- >> drivers/firmware/efi/libstub/arm-stub.c | 7 ++++--- >> 3 files changed, 35 insertions(+), 5 deletions(-) >> >> diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h >> index 0b06f5341b45..62620451f60b 100644 >> --- a/arch/arm/include/asm/efi.h >> +++ b/arch/arm/include/asm/efi.h >> @@ -84,6 +84,18 @@ static inline void efifb_setup_from_dmi(struct >> screen_info *si, const char *opt) >> */ >> #define ZIMAGE_OFFSET_LIMIT SZ_128M >> #define MIN_ZIMAGE_OFFSET MAX_UNCOMP_KERNEL_SIZE >> -#define MAX_FDT_OFFSET ZIMAGE_OFFSET_LIMIT >> + >> +/* on ARM, the FDT should be located in the first 128 MB of RAM */ >> +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) >> +{ >> + return dram_base + ZIMAGE_OFFSET_LIMIT; >> +} >> + >> +/* on ARM, the initrd should be loaded in a lowmem region */ >> +static inline unsigned long efi_get_max_initrd_addr(unsigned long >> dram_base, >> + unsigned long >> image_addr) >> +{ >> + return dram_base + SZ_512M; >> +} >> >> #endif /* _ASM_ARM_EFI_H */ >> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h >> index 0b6b1633017f..6a6c8a0d1424 100644 >> --- a/arch/arm64/include/asm/efi.h >> +++ b/arch/arm64/include/asm/efi.h >> @@ -46,7 +46,24 @@ int efi_set_mapping_permissions(struct mm_struct *mm, >> efi_memory_desc_t *md); >> * 2MiB so we know it won't cross a 2MiB boundary. >> */ >> #define EFI_FDT_ALIGN SZ_2M /* used by >> allocate_new_fdt_and_exit_boot() */ >> -#define MAX_FDT_OFFSET SZ_512M >> + >> +/* on arm64, the FDT may be located anywhere in system RAM */ >> +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) >> +{ >> + return ULONG_MAX; >> +} >> + >> +/* >> + * On arm64, the initrd must be completely inside a 1 GB aligned 32 GB >> window >> + * that covers Image as well. Since we allocate from the top down, set a >> max >> + * address that is virtually guaranteed to produce a suitable allocation >> even >> + * when the physical address of Image is randomized. >> + */ >> +static inline unsigned long efi_get_max_initrd_addr(unsigned long >> dram_base, >> + unsigned long >> image_addr) >> +{ >> + return ALIGN(image_addr, SZ_1G) + 31UL * SZ_1G; >> +} >> >> #define efi_call_early(f, ...) >> sys_table_arg->boottime->f(__VA_ARGS__) >> #define __efi_call_early(f, ...) f(__VA_ARGS__) >> diff --git a/drivers/firmware/efi/libstub/arm-stub.c >> b/drivers/firmware/efi/libstub/arm-stub.c >> index b4f7d78f9e8b..557281fe375f 100644 >> --- a/drivers/firmware/efi/libstub/arm-stub.c >> +++ b/drivers/firmware/efi/libstub/arm-stub.c >> @@ -333,8 +333,9 @@ unsigned long efi_entry(void *handle, >> efi_system_table_t *sys_table, >> if (!fdt_addr) >> pr_efi(sys_table, "Generating empty DTB\n"); >> >> - status = handle_cmdline_files(sys_table, image, cmdline_ptr, >> - "initrd=", dram_base + SZ_512M, >> + status = handle_cmdline_files(sys_table, image, cmdline_ptr, >> "initrd=", >> + efi_get_max_initrd_addr(dram_base, >> + >> *image_addr), >> (unsigned long *)&initrd_addr, >> (unsigned long *)&initrd_size); >> if (status != EFI_SUCCESS) >> @@ -344,7 +345,7 @@ unsigned long efi_entry(void *handle, >> efi_system_table_t *sys_table, >> >> new_fdt_addr = fdt_addr; >> status = allocate_new_fdt_and_exit_boot(sys_table, handle, >> - &new_fdt_addr, dram_base + MAX_FDT_OFFSET, >> + &new_fdt_addr, >> efi_get_max_fdt_addr(dram_base), >> initrd_addr, initrd_size, cmdline_ptr, >> fdt_addr, fdt_size); >> >> > > > -- > Jeffrey Hugo > Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, > Inc. > Qualcomm Technologies, Inc. is a member of the > Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2/9/2017 10:16 AM, Ard Biesheuvel wrote: > On 9 February 2017 at 17:06, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >> On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: >>> >>> On arm64, we have made some changes over the past year to the way the >>> kernel itself is allocated and to how it deals with the initrd and FDT. >>> This patch brings the allocation logic in the EFI stub in line with that, >>> which is necessary because the introduction of KASLR has created the >>> possibility for the initrd to be allocated in a place where the kernel >>> may not be able to map it. (This is currently a theoretical scenario, >>> since it only affects systems where the size of RAM exceeds the size of >>> the linear mapping.) >>> >>> So adhere to the arm64 boot protocol, and make sure that the initrd is >>> fully inside a 1GB aligned 32 GB window that covers the kernel as well. >>> >>> The FDT may be anywhere in memory on arm64 now that we map it via the >>> fixmap, so we can lift the address restriction there completely. >>> >>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>> --- >> >> >> I'll give this a test on our platform that was running into the current >> limitation - probably this weekend. >> >> I reviewed the code and its ok, but I do have one question. Do we need to >> handle the case where initrd ends up below the kernel? >> >> Lets assume KALSR puts the kernel somewhere up high in DDR, after the 32GB >> mark in DDR. Now lets assume the unlikely scenario that the initrd won't >> fit anywhere after 32GB, but will fit before 32GB. Per my understanding of >> efi_high_alloc, it will put the initrd before the 32GB mark, which will be >> outside of the window where the kernel is. >> > > The 32 GB does not have to be 32 GB aligned, only 1 GB aligned. So as > long as the follow expression holds, we should be fine > > > align(max(kernel_end, initrd_end), 1g) - align_down (min > (kernel_start, initrd_start), 1g) <= 32g Yes, and I argue there is a possibility (we'll call it extremely remote) where that may not hold. My question is, do we care about that possibility, and if so, do we do anything about it? > > >> Seems like there are 3 options - >> 1. We consider the scenario unlikely to the point that we don't care, and >> don't do anything >> 2. We check to see if initrd is allocated to be within the window, and if >> not print an error message >> 3. Change efi_high_alloc to take a min addess as well as a max, so that it >> will fail if the initrd can't fit in the window (which will result in an >> error message printed) >> >> Thoughts? >> >> >>> arch/arm/include/asm/efi.h | 14 +++++++++++++- >>> arch/arm64/include/asm/efi.h | 19 ++++++++++++++++++- >>> drivers/firmware/efi/libstub/arm-stub.c | 7 ++++--- >>> 3 files changed, 35 insertions(+), 5 deletions(-) >>> >>> diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h >>> index 0b06f5341b45..62620451f60b 100644 >>> --- a/arch/arm/include/asm/efi.h >>> +++ b/arch/arm/include/asm/efi.h >>> @@ -84,6 +84,18 @@ static inline void efifb_setup_from_dmi(struct >>> screen_info *si, const char *opt) >>> */ >>> #define ZIMAGE_OFFSET_LIMIT SZ_128M >>> #define MIN_ZIMAGE_OFFSET MAX_UNCOMP_KERNEL_SIZE >>> -#define MAX_FDT_OFFSET ZIMAGE_OFFSET_LIMIT >>> + >>> +/* on ARM, the FDT should be located in the first 128 MB of RAM */ >>> +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) >>> +{ >>> + return dram_base + ZIMAGE_OFFSET_LIMIT; >>> +} >>> + >>> +/* on ARM, the initrd should be loaded in a lowmem region */ >>> +static inline unsigned long efi_get_max_initrd_addr(unsigned long >>> dram_base, >>> + unsigned long >>> image_addr) >>> +{ >>> + return dram_base + SZ_512M; >>> +} >>> >>> #endif /* _ASM_ARM_EFI_H */ >>> diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h >>> index 0b6b1633017f..6a6c8a0d1424 100644 >>> --- a/arch/arm64/include/asm/efi.h >>> +++ b/arch/arm64/include/asm/efi.h >>> @@ -46,7 +46,24 @@ int efi_set_mapping_permissions(struct mm_struct *mm, >>> efi_memory_desc_t *md); >>> * 2MiB so we know it won't cross a 2MiB boundary. >>> */ >>> #define EFI_FDT_ALIGN SZ_2M /* used by >>> allocate_new_fdt_and_exit_boot() */ >>> -#define MAX_FDT_OFFSET SZ_512M >>> + >>> +/* on arm64, the FDT may be located anywhere in system RAM */ >>> +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) >>> +{ >>> + return ULONG_MAX; >>> +} >>> + >>> +/* >>> + * On arm64, the initrd must be completely inside a 1 GB aligned 32 GB >>> window >>> + * that covers Image as well. Since we allocate from the top down, set a >>> max >>> + * address that is virtually guaranteed to produce a suitable allocation >>> even >>> + * when the physical address of Image is randomized. >>> + */ >>> +static inline unsigned long efi_get_max_initrd_addr(unsigned long >>> dram_base, >>> + unsigned long >>> image_addr) >>> +{ >>> + return ALIGN(image_addr, SZ_1G) + 31UL * SZ_1G; >>> +} >>> >>> #define efi_call_early(f, ...) >>> sys_table_arg->boottime->f(__VA_ARGS__) >>> #define __efi_call_early(f, ...) f(__VA_ARGS__) >>> diff --git a/drivers/firmware/efi/libstub/arm-stub.c >>> b/drivers/firmware/efi/libstub/arm-stub.c >>> index b4f7d78f9e8b..557281fe375f 100644 >>> --- a/drivers/firmware/efi/libstub/arm-stub.c >>> +++ b/drivers/firmware/efi/libstub/arm-stub.c >>> @@ -333,8 +333,9 @@ unsigned long efi_entry(void *handle, >>> efi_system_table_t *sys_table, >>> if (!fdt_addr) >>> pr_efi(sys_table, "Generating empty DTB\n"); >>> >>> - status = handle_cmdline_files(sys_table, image, cmdline_ptr, >>> - "initrd=", dram_base + SZ_512M, >>> + status = handle_cmdline_files(sys_table, image, cmdline_ptr, >>> "initrd=", >>> + efi_get_max_initrd_addr(dram_base, >>> + >>> *image_addr), >>> (unsigned long *)&initrd_addr, >>> (unsigned long *)&initrd_size); >>> if (status != EFI_SUCCESS) >>> @@ -344,7 +345,7 @@ unsigned long efi_entry(void *handle, >>> efi_system_table_t *sys_table, >>> >>> new_fdt_addr = fdt_addr; >>> status = allocate_new_fdt_and_exit_boot(sys_table, handle, >>> - &new_fdt_addr, dram_base + MAX_FDT_OFFSET, >>> + &new_fdt_addr, >>> efi_get_max_fdt_addr(dram_base), >>> initrd_addr, initrd_size, cmdline_ptr, >>> fdt_addr, fdt_size); >>> >>> >> >> >> -- >> Jeffrey Hugo >> Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, >> Inc. >> Qualcomm Technologies, Inc. is a member of the >> Code Aurora Forum, a Linux Foundation Collaborative Project. > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- Jeffrey Hugo Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9 February 2017 at 17:41, Jeffrey Hugo <jhugo@codeaurora.org> wrote: > On 2/9/2017 10:16 AM, Ard Biesheuvel wrote: >> >> On 9 February 2017 at 17:06, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>> >>> On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: >>>> >>>> >>>> On arm64, we have made some changes over the past year to the way the >>>> kernel itself is allocated and to how it deals with the initrd and FDT. >>>> This patch brings the allocation logic in the EFI stub in line with >>>> that, >>>> which is necessary because the introduction of KASLR has created the >>>> possibility for the initrd to be allocated in a place where the kernel >>>> may not be able to map it. (This is currently a theoretical scenario, >>>> since it only affects systems where the size of RAM exceeds the size of >>>> the linear mapping.) >>>> >>>> So adhere to the arm64 boot protocol, and make sure that the initrd is >>>> fully inside a 1GB aligned 32 GB window that covers the kernel as well. >>>> >>>> The FDT may be anywhere in memory on arm64 now that we map it via the >>>> fixmap, so we can lift the address restriction there completely. >>>> >>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>> --- >>> >>> >>> >>> I'll give this a test on our platform that was running into the current >>> limitation - probably this weekend. >>> >>> I reviewed the code and its ok, but I do have one question. Do we need >>> to >>> handle the case where initrd ends up below the kernel? >>> >>> Lets assume KALSR puts the kernel somewhere up high in DDR, after the >>> 32GB >>> mark in DDR. Now lets assume the unlikely scenario that the initrd won't >>> fit anywhere after 32GB, but will fit before 32GB. Per my understanding >>> of >>> efi_high_alloc, it will put the initrd before the 32GB mark, which will >>> be >>> outside of the window where the kernel is. >>> >> >> The 32 GB does not have to be 32 GB aligned, only 1 GB aligned. So as >> long as the follow expression holds, we should be fine >> >> >> align(max(kernel_end, initrd_end), 1g) - align_down (min >> (kernel_start, initrd_start), 1g) <= 32g > > > Yes, and I argue there is a possibility (we'll call it extremely remote) > where that may not hold. My question is, do we care about that possibility, > and if so, do we do anything about it? > We allocate top down, so we start at align_down(base_of_image, 1g) + 32g, and go down until we hit a region that first our initrd. We will disregard the region that the kernel occupies, but below that, we will just proceed until we find a slot. This effectively means we have a 63 GB window, with the kernel in the middle, where we can load the initrd and adhere to the boot protocol. I don't see how we could end up in the situation where we load the kernel somewhere, and both the 31 GB before *and* after are completely occupied. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2/9/2017 10:45 AM, Ard Biesheuvel wrote: > On 9 February 2017 at 17:41, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >> On 2/9/2017 10:16 AM, Ard Biesheuvel wrote: >>> >>> On 9 February 2017 at 17:06, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>>> >>>> On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: >>>>> >>>>> >>>>> On arm64, we have made some changes over the past year to the way the >>>>> kernel itself is allocated and to how it deals with the initrd and FDT. >>>>> This patch brings the allocation logic in the EFI stub in line with >>>>> that, >>>>> which is necessary because the introduction of KASLR has created the >>>>> possibility for the initrd to be allocated in a place where the kernel >>>>> may not be able to map it. (This is currently a theoretical scenario, >>>>> since it only affects systems where the size of RAM exceeds the size of >>>>> the linear mapping.) >>>>> >>>>> So adhere to the arm64 boot protocol, and make sure that the initrd is >>>>> fully inside a 1GB aligned 32 GB window that covers the kernel as well. >>>>> >>>>> The FDT may be anywhere in memory on arm64 now that we map it via the >>>>> fixmap, so we can lift the address restriction there completely. >>>>> >>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>>> --- >>>> >>>> >>>> >>>> I'll give this a test on our platform that was running into the current >>>> limitation - probably this weekend. >>>> >>>> I reviewed the code and its ok, but I do have one question. Do we need >>>> to >>>> handle the case where initrd ends up below the kernel? >>>> >>>> Lets assume KALSR puts the kernel somewhere up high in DDR, after the >>>> 32GB >>>> mark in DDR. Now lets assume the unlikely scenario that the initrd won't >>>> fit anywhere after 32GB, but will fit before 32GB. Per my understanding >>>> of >>>> efi_high_alloc, it will put the initrd before the 32GB mark, which will >>>> be >>>> outside of the window where the kernel is. >>>> >>> >>> The 32 GB does not have to be 32 GB aligned, only 1 GB aligned. So as >>> long as the follow expression holds, we should be fine >>> >>> >>> align(max(kernel_end, initrd_end), 1g) - align_down (min >>> (kernel_start, initrd_start), 1g) <= 32g >> >> >> Yes, and I argue there is a possibility (we'll call it extremely remote) >> where that may not hold. My question is, do we care about that possibility, >> and if so, do we do anything about it? >> > > We allocate top down, so we start at align_down(base_of_image, 1g) + > 32g, and go down until we hit a region that first our initrd. We will > disregard the region that the kernel occupies, but below that, we will > just proceed until we find a slot. This effectively means we have a 63 > GB window, with the kernel in the middle, where we can load the initrd > and adhere to the boot protocol. I don't see how we could end up in > the situation where we load the kernel somewhere, and both the 31 GB > before *and* after are completely occupied. > No we don't. We do not allocate top down. Please look at efi_high_alloc. Efi_high_alloc iterates though the memory map, low to high. It looks to see if a slot can hold the allocation, and the slot does not exceed the specified max. If so, efi_high_alloc retains a reference to the slot. Then efi_high_alloc continues iterating though the map, until the end. efi_high_alloc only stores a reference to the most recently valid slot, which would be the highest slot in the map. My system can have 256GB (or more) of RAM. It is possible, however remote, that the initrd and kernel can be more than 64GB away from each other. Lets assume KASLR puts the kernel at 250GB. Lets assume, for whatever reason, we can't fit the initrd above 150GB (there was just enough room to jam kernel there somwhow, but firmware is consuming the rest, maybe it put rootfs there via NFIT). efi_high_alloc will put the initrd at some point just below 150GB, because it iterates low to high, and 150GB will be below the max of 250GB where the kernel is. This will result in the initrd and kernel being ~100GB away in this example, which violates the requirements stated in Booting.txt I see the situation is possible, but I admit it is remote. If you want to ignore it, fine. I would be happy with that so long as the assumption is documented so that if it is ever somehow violated in the real world, we know what broke. -- Jeffrey Hugo Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9 February 2017 at 18:01, Jeffrey Hugo <jhugo@codeaurora.org> wrote: > On 2/9/2017 10:45 AM, Ard Biesheuvel wrote: >> >> On 9 February 2017 at 17:41, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>> >>> On 2/9/2017 10:16 AM, Ard Biesheuvel wrote: >>>> >>>> >>>> On 9 February 2017 at 17:06, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>>>> >>>>> >>>>> On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: >>>>>> >>>>>> >>>>>> >>>>>> On arm64, we have made some changes over the past year to the way the >>>>>> kernel itself is allocated and to how it deals with the initrd and >>>>>> FDT. >>>>>> This patch brings the allocation logic in the EFI stub in line with >>>>>> that, >>>>>> which is necessary because the introduction of KASLR has created the >>>>>> possibility for the initrd to be allocated in a place where the kernel >>>>>> may not be able to map it. (This is currently a theoretical scenario, >>>>>> since it only affects systems where the size of RAM exceeds the size >>>>>> of >>>>>> the linear mapping.) >>>>>> >>>>>> So adhere to the arm64 boot protocol, and make sure that the initrd is >>>>>> fully inside a 1GB aligned 32 GB window that covers the kernel as >>>>>> well. >>>>>> >>>>>> The FDT may be anywhere in memory on arm64 now that we map it via the >>>>>> fixmap, so we can lift the address restriction there completely. >>>>>> >>>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>>>> --- >>>>> >>>>> >>>>> >>>>> >>>>> I'll give this a test on our platform that was running into the current >>>>> limitation - probably this weekend. >>>>> >>>>> I reviewed the code and its ok, but I do have one question. Do we need >>>>> to >>>>> handle the case where initrd ends up below the kernel? >>>>> >>>>> Lets assume KALSR puts the kernel somewhere up high in DDR, after the >>>>> 32GB >>>>> mark in DDR. Now lets assume the unlikely scenario that the initrd >>>>> won't >>>>> fit anywhere after 32GB, but will fit before 32GB. Per my >>>>> understanding >>>>> of >>>>> efi_high_alloc, it will put the initrd before the 32GB mark, which will >>>>> be >>>>> outside of the window where the kernel is. >>>>> >>>> >>>> The 32 GB does not have to be 32 GB aligned, only 1 GB aligned. So as >>>> long as the follow expression holds, we should be fine >>>> >>>> >>>> align(max(kernel_end, initrd_end), 1g) - align_down (min >>>> (kernel_start, initrd_start), 1g) <= 32g >>> >>> >>> >>> Yes, and I argue there is a possibility (we'll call it extremely remote) >>> where that may not hold. My question is, do we care about that >>> possibility, >>> and if so, do we do anything about it? >>> >> >> We allocate top down, so we start at align_down(base_of_image, 1g) + >> 32g, and go down until we hit a region that first our initrd. We will >> disregard the region that the kernel occupies, but below that, we will >> just proceed until we find a slot. This effectively means we have a 63 >> GB window, with the kernel in the middle, where we can load the initrd >> and adhere to the boot protocol. I don't see how we could end up in >> the situation where we load the kernel somewhere, and both the 31 GB >> before *and* after are completely occupied. >> > > No we don't. We do not allocate top down. Please look at efi_high_alloc. > > Efi_high_alloc iterates though the memory map, low to high. It looks to see > if a slot can hold the allocation, and the slot does not exceed the > specified max. If so, efi_high_alloc retains a reference to the slot. Then > efi_high_alloc continues iterating though the map, until the end. > efi_high_alloc only stores a reference to the most recently valid slot, > which would be the highest slot in the map. > It is documented as /* * Allocate at the highest possible address that is not above 'max'. */ and what you describe is pretty much that, no? > My system can have 256GB (or more) of RAM. It is possible, however remote, > that the initrd and kernel can be more than 64GB away from each other. > > Lets assume KASLR puts the kernel at 250GB. Lets assume, for whatever > reason, we can't fit the initrd above 150GB (there was just enough room to > jam kernel there somwhow, but firmware is consuming the rest, maybe it put > rootfs there via NFIT). So before even booting the kernel, you already have 100 GB of memory occupied? As I replied before, you are correct that in this case, you will not be able to put the initrd within 32 GB of the kernel. But do note that this 32 GB figure is derived from the linear region size of a 16k pages kernel with 2 levels of translation, which is a niche configuration by itself. On a system that has 256 GB of RAM, it is highly unlikely that you will be using a kernel that can only map 32 GB of it. The reason for choosing the 32 GB figure is that it relieves the boot loader from having to go and figure out what kind of kernel is going to be executed. Page size can be read from the Image header but the VA size cannot. So 32 GB was a reasonable number imo. > efi_high_alloc will put the initrd at some point > just below 150GB, because it iterates low to high, No, because everything above that is occupied. If efi_high_alloc() does not do what it says on the tin, we should fix that. > and 150GB will be below > the max of 250GB where the kernel is. This will result in the initrd and > kernel being ~100GB away in this example, which violates the requirements > stated in Booting.txt > > I see the situation is possible, but I admit it is remote. If you want to > ignore it, fine. I would be happy with that so long as the assumption is > documented so that if it is ever somehow violated in the real world, we know > what broke. > -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9 February 2017 at 18:18, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 9 February 2017 at 18:01, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >> On 2/9/2017 10:45 AM, Ard Biesheuvel wrote: >>> >>> On 9 February 2017 at 17:41, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>>> >>>> On 2/9/2017 10:16 AM, Ard Biesheuvel wrote: >>>>> >>>>> >>>>> On 9 February 2017 at 17:06, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>>>>> >>>>>> >>>>>> On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On arm64, we have made some changes over the past year to the way the >>>>>>> kernel itself is allocated and to how it deals with the initrd and >>>>>>> FDT. >>>>>>> This patch brings the allocation logic in the EFI stub in line with >>>>>>> that, >>>>>>> which is necessary because the introduction of KASLR has created the >>>>>>> possibility for the initrd to be allocated in a place where the kernel >>>>>>> may not be able to map it. (This is currently a theoretical scenario, >>>>>>> since it only affects systems where the size of RAM exceeds the size >>>>>>> of >>>>>>> the linear mapping.) >>>>>>> >>>>>>> So adhere to the arm64 boot protocol, and make sure that the initrd is >>>>>>> fully inside a 1GB aligned 32 GB window that covers the kernel as >>>>>>> well. >>>>>>> >>>>>>> The FDT may be anywhere in memory on arm64 now that we map it via the >>>>>>> fixmap, so we can lift the address restriction there completely. >>>>>>> >>>>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>>>>> --- >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I'll give this a test on our platform that was running into the current >>>>>> limitation - probably this weekend. >>>>>> >>>>>> I reviewed the code and its ok, but I do have one question. Do we need >>>>>> to >>>>>> handle the case where initrd ends up below the kernel? >>>>>> >>>>>> Lets assume KALSR puts the kernel somewhere up high in DDR, after the >>>>>> 32GB >>>>>> mark in DDR. Now lets assume the unlikely scenario that the initrd >>>>>> won't >>>>>> fit anywhere after 32GB, but will fit before 32GB. Per my >>>>>> understanding >>>>>> of >>>>>> efi_high_alloc, it will put the initrd before the 32GB mark, which will >>>>>> be >>>>>> outside of the window where the kernel is. >>>>>> >>>>> >>>>> The 32 GB does not have to be 32 GB aligned, only 1 GB aligned. So as >>>>> long as the follow expression holds, we should be fine >>>>> >>>>> >>>>> align(max(kernel_end, initrd_end), 1g) - align_down (min >>>>> (kernel_start, initrd_start), 1g) <= 32g >>>> >>>> >>>> >>>> Yes, and I argue there is a possibility (we'll call it extremely remote) >>>> where that may not hold. My question is, do we care about that >>>> possibility, >>>> and if so, do we do anything about it? >>>> >>> >>> We allocate top down, so we start at align_down(base_of_image, 1g) + >>> 32g, and go down until we hit a region that first our initrd. We will >>> disregard the region that the kernel occupies, but below that, we will >>> just proceed until we find a slot. This effectively means we have a 63 >>> GB window, with the kernel in the middle, where we can load the initrd >>> and adhere to the boot protocol. I don't see how we could end up in >>> the situation where we load the kernel somewhere, and both the 31 GB >>> before *and* after are completely occupied. >>> >> >> No we don't. We do not allocate top down. Please look at efi_high_alloc. >> >> Efi_high_alloc iterates though the memory map, low to high. BTW the memory map isn't necessarily sorted per the UEFI spec, so it iterates in what is essentially random order, not low to high. >> It looks to see >> if a slot can hold the allocation, and the slot does not exceed the >> specified max. If so, efi_high_alloc retains a reference to the slot. Then >> efi_high_alloc continues iterating though the map, until the end. >> efi_high_alloc only stores a reference to the most recently valid slot, >> which would be the highest slot in the map. >> > > It is documented as > > /* > * Allocate at the highest possible address that is not above 'max'. > */ > > and what you describe is pretty much that, no? > >> My system can have 256GB (or more) of RAM. It is possible, however remote, >> that the initrd and kernel can be more than 64GB away from each other. >> >> Lets assume KASLR puts the kernel at 250GB. Lets assume, for whatever >> reason, we can't fit the initrd above 150GB (there was just enough room to >> jam kernel there somwhow, but firmware is consuming the rest, maybe it put >> rootfs there via NFIT). > > So before even booting the kernel, you already have 100 GB of memory > occupied? As I replied before, you are correct that in this case, you > will not be able to put the initrd within 32 GB of the kernel. But do > note that this 32 GB figure is derived from the linear region size of > a 16k pages kernel with 2 levels of translation, which is a niche > configuration by itself. On a system that has 256 GB of RAM, it is > highly unlikely that you will be using a kernel that can only map 32 > GB of it. > > The reason for choosing the 32 GB figure is that it relieves the boot > loader from having to go and figure out what kind of kernel is going > to be executed. Page size can be read from the Image header but the VA > size cannot. So 32 GB was a reasonable number imo. > >> efi_high_alloc will put the initrd at some point >> just below 150GB, because it iterates low to high, > > No, because everything above that is occupied. If efi_high_alloc() > does not do what it says on the tin, we should fix that. > >> and 150GB will be below >> the max of 250GB where the kernel is. This will result in the initrd and >> kernel being ~100GB away in this example, which violates the requirements >> stated in Booting.txt >> >> I see the situation is possible, but I admit it is remote. If you want to >> ignore it, fine. I would be happy with that so long as the assumption is >> documented so that if it is ever somehow violated in the real world, we know >> what broke. >> -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2/9/2017 11:26 AM, Ard Biesheuvel wrote: > On 9 February 2017 at 18:18, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: >> On 9 February 2017 at 18:01, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>> On 2/9/2017 10:45 AM, Ard Biesheuvel wrote: >>>> >>>> On 9 February 2017 at 17:41, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>>>> >>>>> On 2/9/2017 10:16 AM, Ard Biesheuvel wrote: >>>>>> >>>>>> >>>>>> On 9 February 2017 at 17:06, Jeffrey Hugo <jhugo@codeaurora.org> wrote: >>>>>>> >>>>>>> >>>>>>> On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On arm64, we have made some changes over the past year to the way the >>>>>>>> kernel itself is allocated and to how it deals with the initrd and >>>>>>>> FDT. >>>>>>>> This patch brings the allocation logic in the EFI stub in line with >>>>>>>> that, >>>>>>>> which is necessary because the introduction of KASLR has created the >>>>>>>> possibility for the initrd to be allocated in a place where the kernel >>>>>>>> may not be able to map it. (This is currently a theoretical scenario, >>>>>>>> since it only affects systems where the size of RAM exceeds the size >>>>>>>> of >>>>>>>> the linear mapping.) >>>>>>>> >>>>>>>> So adhere to the arm64 boot protocol, and make sure that the initrd is >>>>>>>> fully inside a 1GB aligned 32 GB window that covers the kernel as >>>>>>>> well. >>>>>>>> >>>>>>>> The FDT may be anywhere in memory on arm64 now that we map it via the >>>>>>>> fixmap, so we can lift the address restriction there completely. >>>>>>>> >>>>>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>>>>>> --- >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> I'll give this a test on our platform that was running into the current >>>>>>> limitation - probably this weekend. >>>>>>> >>>>>>> I reviewed the code and its ok, but I do have one question. Do we need >>>>>>> to >>>>>>> handle the case where initrd ends up below the kernel? >>>>>>> >>>>>>> Lets assume KALSR puts the kernel somewhere up high in DDR, after the >>>>>>> 32GB >>>>>>> mark in DDR. Now lets assume the unlikely scenario that the initrd >>>>>>> won't >>>>>>> fit anywhere after 32GB, but will fit before 32GB. Per my >>>>>>> understanding >>>>>>> of >>>>>>> efi_high_alloc, it will put the initrd before the 32GB mark, which will >>>>>>> be >>>>>>> outside of the window where the kernel is. >>>>>>> >>>>>> >>>>>> The 32 GB does not have to be 32 GB aligned, only 1 GB aligned. So as >>>>>> long as the follow expression holds, we should be fine >>>>>> >>>>>> >>>>>> align(max(kernel_end, initrd_end), 1g) - align_down (min >>>>>> (kernel_start, initrd_start), 1g) <= 32g >>>>> >>>>> >>>>> >>>>> Yes, and I argue there is a possibility (we'll call it extremely remote) >>>>> where that may not hold. My question is, do we care about that >>>>> possibility, >>>>> and if so, do we do anything about it? >>>>> >>>> >>>> We allocate top down, so we start at align_down(base_of_image, 1g) + >>>> 32g, and go down until we hit a region that first our initrd. We will >>>> disregard the region that the kernel occupies, but below that, we will >>>> just proceed until we find a slot. This effectively means we have a 63 >>>> GB window, with the kernel in the middle, where we can load the initrd >>>> and adhere to the boot protocol. I don't see how we could end up in >>>> the situation where we load the kernel somewhere, and both the 31 GB >>>> before *and* after are completely occupied. >>>> >>> >>> No we don't. We do not allocate top down. Please look at efi_high_alloc. >>> >>> Efi_high_alloc iterates though the memory map, low to high. > > BTW the memory map isn't necessarily sorted per the UEFI spec, so it > iterates in what is essentially random order, not low to high. True, I'm used to EDK2, which from what I've seen, keeps it ordered. However that's somewhat immaterial to my point that its possible for initrd to be far enough from kernel to break booting.txt > >>> It looks to see >>> if a slot can hold the allocation, and the slot does not exceed the >>> specified max. If so, efi_high_alloc retains a reference to the slot. Then >>> efi_high_alloc continues iterating though the map, until the end. >>> efi_high_alloc only stores a reference to the most recently valid slot, >>> which would be the highest slot in the map. >>> >> >> It is documented as >> >> /* >> * Allocate at the highest possible address that is not above 'max'. >> */ >> >> and what you describe is pretty much that, no? >> >>> My system can have 256GB (or more) of RAM. It is possible, however remote, >>> that the initrd and kernel can be more than 64GB away from each other. >>> >>> Lets assume KASLR puts the kernel at 250GB. Lets assume, for whatever >>> reason, we can't fit the initrd above 150GB (there was just enough room to >>> jam kernel there somwhow, but firmware is consuming the rest, maybe it put >>> rootfs there via NFIT). >> >> So before even booting the kernel, you already have 100 GB of memory >> occupied? That is possible, yes. Likely? Probably not. Would our system fail if initrd and kernel are father than the prescribed restriction? No, since the system can address all of RAM, we'd probably be fine. >> As I replied before, you are correct that in this case, you >> will not be able to put the initrd within 32 GB of the kernel. But do >> note that this 32 GB figure is derived from the linear region size of >> a 16k pages kernel with 2 levels of translation, which is a niche >> configuration by itself. On a system that has 256 GB of RAM, it is >> highly unlikely that you will be using a kernel that can only map 32 >> GB of it. >> >> The reason for choosing the 32 GB figure is that it relieves the boot >> loader from having to go and figure out what kind of kernel is going >> to be executed. Page size can be read from the Image header but the VA >> size cannot. So 32 GB was a reasonable number imo. Ok, so the restriction is completely arbitrary and has no real purpose. Ie nothing in the kernel will break, so long as you assume the system is not configured with more RAM than can be addressed, which doesn't feel reasonable to do. I realize I'm being nitpicky, from my perspective, any issues related to efistub are particularly difficult to debug, so if this scenario we've been going around about ever popped up, it wouldn't even give you a print that happened when you back trace the output trying to figure out why the boot failed. However, it really looks like even if the scenario occurred, there is zero realistic expectation anything would break, and its just a violation of some document that makes assumptions and should be treated more as guidance to try to follow, rather than hard rules. I guess I'm satisfied, and don't see any need to continue the discussion. Thanks for entertaining me. >> >>> efi_high_alloc will put the initrd at some point >>> just below 150GB, because it iterates low to high, >> >> No, because everything above that is occupied. If efi_high_alloc() >> does not do what it says on the tin, we should fix that. I will agree, efi_high_alloc() does what it says on the tin (my interpretation of what you were saying what not what you intended, sorry about that), but relying on that is not sufficient to implicitly assume that we are holding to the restrictions in booting.txt in all scenarios. >> >>> and 150GB will be below >>> the max of 250GB where the kernel is. This will result in the initrd and >>> kernel being ~100GB away in this example, which violates the requirements >>> stated in Booting.txt >>> >>> I see the situation is possible, but I admit it is remote. If you want to >>> ignore it, fine. I would be happy with that so long as the assumption is >>> documented so that if it is ever somehow violated in the real world, we know >>> what broke. >>> > -- > To unsubscribe from this list: send the line "unsubscribe linux-efi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Jeffrey Hugo Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9 February 2017 at 19:04, Jeffrey Hugo <jhugo@codeaurora.org> wrote: > On 2/9/2017 11:26 AM, Ard Biesheuvel wrote: >> >> BTW the memory map isn't necessarily sorted per the UEFI spec, so it >> iterates in what is essentially random order, not low to high. > > > True, I'm used to EDK2, which from what I've seen, keeps it ordered. However > that's somewhat immaterial to my point that its possible for initrd to be > far enough from kernel to break booting.txt > >> >>>> It looks to see >>>> if a slot can hold the allocation, and the slot does not exceed the >>>> specified max. If so, efi_high_alloc retains a reference to the slot. >>>> Then >>>> efi_high_alloc continues iterating though the map, until the end. >>>> efi_high_alloc only stores a reference to the most recently valid slot, >>>> which would be the highest slot in the map. >>>> >>> >>> It is documented as >>> >>> /* >>> * Allocate at the highest possible address that is not above 'max'. >>> */ >>> >>> and what you describe is pretty much that, no? >>> >>>> My system can have 256GB (or more) of RAM. It is possible, however >>>> remote, >>>> that the initrd and kernel can be more than 64GB away from each other. >>>> >>>> Lets assume KASLR puts the kernel at 250GB. Lets assume, for whatever >>>> reason, we can't fit the initrd above 150GB (there was just enough room >>>> to >>>> jam kernel there somwhow, but firmware is consuming the rest, maybe it >>>> put >>>> rootfs there via NFIT). >>> >>> >>> So before even booting the kernel, you already have 100 GB of memory >>> occupied? > > > That is possible, yes. Likely? Probably not. Would our system fail if > initrd and kernel are father than the prescribed restriction? No, since the > system can address all of RAM, we'd probably be fine. > >>> As I replied before, you are correct that in this case, you >>> will not be able to put the initrd within 32 GB of the kernel. But do >>> note that this 32 GB figure is derived from the linear region size of >>> a 16k pages kernel with 2 levels of translation, which is a niche >>> configuration by itself. On a system that has 256 GB of RAM, it is >>> highly unlikely that you will be using a kernel that can only map 32 >>> GB of it. >>> >>> The reason for choosing the 32 GB figure is that it relieves the boot >>> loader from having to go and figure out what kind of kernel is going >>> to be executed. Page size can be read from the Image header but the VA >>> size cannot. So 32 GB was a reasonable number imo. > > > Ok, so the restriction is completely arbitrary and has no real purpose. Ie > nothing in the kernel will break, so long as you assume the system is not > configured with more RAM than can be addressed, which doesn't feel > reasonable to do. > > I realize I'm being nitpicky, from my perspective, any issues related to > efistub are particularly difficult to debug, so if this scenario we've been > going around about ever popped up, it wouldn't even give you a print that > happened when you back trace the output trying to figure out why the boot > failed. > > However, it really looks like even if the scenario occurred, there is zero > realistic expectation anything would break, and its just a violation of some > document that makes assumptions and should be treated more as guidance to > try to follow, rather than hard rules. > Actually, there is no reason for the stub to adhere 100% to the boot protocol, and we already violate it deliberately by randomizing the physical offset with a 64 KB granularity under KASLR, whereas the boot protocol stipulates that it should be a 2 MB aligned base + TEXT_OFFSET. So in this case, I think it is reasonable to take VA_BITS into account rather than use a hardcoded 32 GB. That will eliminate the problem completely when you use a 48-bit VA kernel, and will make it highly unlikely to ever occur on more common 39 and 42 bit VA configurations. -- Ard. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2/9/2017 3:16 AM, Ard Biesheuvel wrote: > On arm64, we have made some changes over the past year to the way the > kernel itself is allocated and to how it deals with the initrd and FDT. > This patch brings the allocation logic in the EFI stub in line with that, > which is necessary because the introduction of KASLR has created the > possibility for the initrd to be allocated in a place where the kernel > may not be able to map it. (This is currently a theoretical scenario, > since it only affects systems where the size of RAM exceeds the size of > the linear mapping.) > > So adhere to the arm64 boot protocol, and make sure that the initrd is > fully inside a 1GB aligned 32 GB window that covers the kernel as well. > > The FDT may be anywhere in memory on arm64 now that we map it via the > fixmap, so we can lift the address restriction there completely. > > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > arch/arm/include/asm/efi.h | 14 +++++++++++++- > arch/arm64/include/asm/efi.h | 19 ++++++++++++++++++- > drivers/firmware/efi/libstub/arm-stub.c | 7 ++++--- > 3 files changed, 35 insertions(+), 5 deletions(-) > > diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h > index 0b06f5341b45..62620451f60b 100644 > --- a/arch/arm/include/asm/efi.h > +++ b/arch/arm/include/asm/efi.h > @@ -84,6 +84,18 @@ static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt) > */ > #define ZIMAGE_OFFSET_LIMIT SZ_128M > #define MIN_ZIMAGE_OFFSET MAX_UNCOMP_KERNEL_SIZE > -#define MAX_FDT_OFFSET ZIMAGE_OFFSET_LIMIT > + > +/* on ARM, the FDT should be located in the first 128 MB of RAM */ > +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) > +{ > + return dram_base + ZIMAGE_OFFSET_LIMIT; > +} > + > +/* on ARM, the initrd should be loaded in a lowmem region */ > +static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base, > + unsigned long image_addr) > +{ > + return dram_base + SZ_512M; > +} > > #endif /* _ASM_ARM_EFI_H */ > diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h > index 0b6b1633017f..6a6c8a0d1424 100644 > --- a/arch/arm64/include/asm/efi.h > +++ b/arch/arm64/include/asm/efi.h > @@ -46,7 +46,24 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md); > * 2MiB so we know it won't cross a 2MiB boundary. > */ > #define EFI_FDT_ALIGN SZ_2M /* used by allocate_new_fdt_and_exit_boot() */ > -#define MAX_FDT_OFFSET SZ_512M > + > +/* on arm64, the FDT may be located anywhere in system RAM */ > +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) > +{ > + return ULONG_MAX; > +} > + > +/* > + * On arm64, the initrd must be completely inside a 1 GB aligned 32 GB window > + * that covers Image as well. Since we allocate from the top down, set a max > + * address that is virtually guaranteed to produce a suitable allocation even > + * when the physical address of Image is randomized. > + */ > +static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base, > + unsigned long image_addr) > +{ > + return ALIGN(image_addr, SZ_1G) + 31UL * SZ_1G; > +} > > #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__) > #define __efi_call_early(f, ...) f(__VA_ARGS__) > diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c > index b4f7d78f9e8b..557281fe375f 100644 > --- a/drivers/firmware/efi/libstub/arm-stub.c > +++ b/drivers/firmware/efi/libstub/arm-stub.c > @@ -333,8 +333,9 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table, > if (!fdt_addr) > pr_efi(sys_table, "Generating empty DTB\n"); > > - status = handle_cmdline_files(sys_table, image, cmdline_ptr, > - "initrd=", dram_base + SZ_512M, > + status = handle_cmdline_files(sys_table, image, cmdline_ptr, "initrd=", > + efi_get_max_initrd_addr(dram_base, > + *image_addr), > (unsigned long *)&initrd_addr, > (unsigned long *)&initrd_size); > if (status != EFI_SUCCESS) > @@ -344,7 +345,7 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table, > > new_fdt_addr = fdt_addr; > status = allocate_new_fdt_and_exit_boot(sys_table, handle, > - &new_fdt_addr, dram_base + MAX_FDT_OFFSET, > + &new_fdt_addr, efi_get_max_fdt_addr(dram_base), > initrd_addr, initrd_size, cmdline_ptr, > fdt_addr, fdt_size); > This was tested successfully on our QDT2400 system on which we found this failure. Tested-by: Richard Ruigrok <rruigrok@codeaurora.org> -- Qualcomm Datacenter Technologies as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h index 0b06f5341b45..62620451f60b 100644 --- a/arch/arm/include/asm/efi.h +++ b/arch/arm/include/asm/efi.h @@ -84,6 +84,18 @@ static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt) */ #define ZIMAGE_OFFSET_LIMIT SZ_128M #define MIN_ZIMAGE_OFFSET MAX_UNCOMP_KERNEL_SIZE -#define MAX_FDT_OFFSET ZIMAGE_OFFSET_LIMIT + +/* on ARM, the FDT should be located in the first 128 MB of RAM */ +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) +{ + return dram_base + ZIMAGE_OFFSET_LIMIT; +} + +/* on ARM, the initrd should be loaded in a lowmem region */ +static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base, + unsigned long image_addr) +{ + return dram_base + SZ_512M; +} #endif /* _ASM_ARM_EFI_H */ diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h index 0b6b1633017f..6a6c8a0d1424 100644 --- a/arch/arm64/include/asm/efi.h +++ b/arch/arm64/include/asm/efi.h @@ -46,7 +46,24 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md); * 2MiB so we know it won't cross a 2MiB boundary. */ #define EFI_FDT_ALIGN SZ_2M /* used by allocate_new_fdt_and_exit_boot() */ -#define MAX_FDT_OFFSET SZ_512M + +/* on arm64, the FDT may be located anywhere in system RAM */ +static inline unsigned long efi_get_max_fdt_addr(unsigned long dram_base) +{ + return ULONG_MAX; +} + +/* + * On arm64, the initrd must be completely inside a 1 GB aligned 32 GB window + * that covers Image as well. Since we allocate from the top down, set a max + * address that is virtually guaranteed to produce a suitable allocation even + * when the physical address of Image is randomized. + */ +static inline unsigned long efi_get_max_initrd_addr(unsigned long dram_base, + unsigned long image_addr) +{ + return ALIGN(image_addr, SZ_1G) + 31UL * SZ_1G; +} #define efi_call_early(f, ...) sys_table_arg->boottime->f(__VA_ARGS__) #define __efi_call_early(f, ...) f(__VA_ARGS__) diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c index b4f7d78f9e8b..557281fe375f 100644 --- a/drivers/firmware/efi/libstub/arm-stub.c +++ b/drivers/firmware/efi/libstub/arm-stub.c @@ -333,8 +333,9 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table, if (!fdt_addr) pr_efi(sys_table, "Generating empty DTB\n"); - status = handle_cmdline_files(sys_table, image, cmdline_ptr, - "initrd=", dram_base + SZ_512M, + status = handle_cmdline_files(sys_table, image, cmdline_ptr, "initrd=", + efi_get_max_initrd_addr(dram_base, + *image_addr), (unsigned long *)&initrd_addr, (unsigned long *)&initrd_size); if (status != EFI_SUCCESS) @@ -344,7 +345,7 @@ unsigned long efi_entry(void *handle, efi_system_table_t *sys_table, new_fdt_addr = fdt_addr; status = allocate_new_fdt_and_exit_boot(sys_table, handle, - &new_fdt_addr, dram_base + MAX_FDT_OFFSET, + &new_fdt_addr, efi_get_max_fdt_addr(dram_base), initrd_addr, initrd_size, cmdline_ptr, fdt_addr, fdt_size);
On arm64, we have made some changes over the past year to the way the kernel itself is allocated and to how it deals with the initrd and FDT. This patch brings the allocation logic in the EFI stub in line with that, which is necessary because the introduction of KASLR has created the possibility for the initrd to be allocated in a place where the kernel may not be able to map it. (This is currently a theoretical scenario, since it only affects systems where the size of RAM exceeds the size of the linear mapping.) So adhere to the arm64 boot protocol, and make sure that the initrd is fully inside a 1GB aligned 32 GB window that covers the kernel as well. The FDT may be anywhere in memory on arm64 now that we map it via the fixmap, so we can lift the address restriction there completely. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm/include/asm/efi.h | 14 +++++++++++++- arch/arm64/include/asm/efi.h | 19 ++++++++++++++++++- drivers/firmware/efi/libstub/arm-stub.c | 7 ++++--- 3 files changed, 35 insertions(+), 5 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html