Message ID | 20250108215957.3437660-1-usamaarif642@gmail.com |
---|---|
State | New |
Headers | show |
On Wed, 8 Jan 2025 at 23:00, Usama Arif <usamaarif642@gmail.com> wrote: > > The commit in [1] introduced a check to see if EFI memory attributes > table was corrupted. It assumed that efi.memmap.nr_map remains > constant, but it changes during late boot. > Hence, the check is valid during cold boot, but not in the subsequent > kexec boot. > > This is best explained with an exampled. At cold boot, for a test > machine: > efi.memmap.nr_map=91, > memory_attributes_table->num_entries=48, > desc_size = 48 > Hence, the check introduced in [1] where 3x the size of the > entire EFI memory map is a reasonable upper bound for the size of this > table is valid. > > In late boot __efi_enter_virtual_mode calls 2 functions that updates > efi.memmap.nr_map: > - efi_map_regions which reduces the `count` of map entries > (for e.g. if should_map_region returns false) and which is reflected > in efi.memmap by __efi_memmap_init. > At this point efi.memmap.nr_map becomes 46 in the test machine. > - efi_free_boot_services which also reduces the number of memory regions > available (for e.g. if md->type or md->attribute is not the right value). > At this point efi.memmap.nr_map becomes 9 in the test machine. > Hence when you kexec into a new kernel and pass efi.memmap, the > paramaters that are compared are: > efi.memmap.nr_map=9, > memory_attributes_table->num_entries=48, > desc_size = 48 > where the check in [1] is no longer valid with such a low efi.memmap.nr_map > as it was reduced due to efi_map_regions and efi_free_boot_services. > > A more appropriate check is to see if the description size reported by > efi and memory attributes table is the same. > > [1] https://lore.kernel.org/all/20241031175822.2952471-2-ardb+git@google.com/ > > Fixes: 8fbe4c49c0cc ("efi/memattr: Ignore table if the size is clearly bogus") > Reported-by: Breno Leitao <leitao@debian.org> > Signed-off-by: Usama Arif <usamaarif642@gmail.com> > --- > drivers/firmware/efi/memattr.c | 16 ++++++---------- > 1 file changed, 6 insertions(+), 10 deletions(-) > The more I think about this, the more I feel that kexec on x86 should simply discard this table, and run with the firmware code RWX (which is not the end of the world). The main reason is that the EFI memory map and the EFI memory attributes table are supposed to be a matched pair, where each RTcode entry in the former is broken down into multiple code and data segments in the latter. The amount of mangling that the x86 arch code does of the EFI memory map makes it intractable to ensure that they remain in sync, and so it is better not to bother. > diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c > index c38b1a335590..d3bc161361fb 100644 > --- a/drivers/firmware/efi/memattr.c > +++ b/drivers/firmware/efi/memattr.c > @@ -40,21 +40,17 @@ int __init efi_memattr_init(void) > goto unmap; > } > > - > /* > - * Sanity check: the Memory Attributes Table contains up to 3 entries > - * for each entry of type EfiRuntimeServicesCode in the EFI memory map. > - * So if the size of the table exceeds 3x the size of the entire EFI > - * memory map, there is clearly something wrong, and the table should > - * just be ignored altogether. > + * Sanity check: the Memory Attributes Table desc_size and > + * efi.memmap.desc_size should match. > */ > - size = tbl->num_entries * tbl->desc_size; > - if (size > 3 * efi.memmap.nr_map * efi.memmap.desc_size) { > - pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u)\n", > - tbl->version, tbl->desc_size, tbl->num_entries); > + if (efi.memmap.desc_size != tbl->desc_size) { > + pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, table desc_size == %u, efi.memmap.desc_size == %lu, table num_entries == %u)\n", > + tbl->version, tbl->desc_size, efi.memmap.desc_size, tbl->num_entries); > goto unmap; > } > > + size = tbl->num_entries * tbl->desc_size; > tbl_size = sizeof(*tbl) + size; > memblock_reserve(efi_mem_attr_table, tbl_size); > set_bit(EFI_MEM_ATTR, &efi.flags); > -- > 2.43.5 >
On Thu, 9 Jan 2025 at 17:36, Usama Arif <usamaarif642@gmail.com> wrote: > > > > On 09/01/2025 15:45, Ard Biesheuvel wrote: > > On Wed, 8 Jan 2025 at 23:00, Usama Arif <usamaarif642@gmail.com> wrote: > >> > >> The commit in [1] introduced a check to see if EFI memory attributes > >> table was corrupted. It assumed that efi.memmap.nr_map remains > >> constant, but it changes during late boot. > >> Hence, the check is valid during cold boot, but not in the subsequent > >> kexec boot. > >> > >> This is best explained with an exampled. At cold boot, for a test > >> machine: > >> efi.memmap.nr_map=91, > >> memory_attributes_table->num_entries=48, > >> desc_size = 48 > >> Hence, the check introduced in [1] where 3x the size of the > >> entire EFI memory map is a reasonable upper bound for the size of this > >> table is valid. > >> > >> In late boot __efi_enter_virtual_mode calls 2 functions that updates > >> efi.memmap.nr_map: > >> - efi_map_regions which reduces the `count` of map entries > >> (for e.g. if should_map_region returns false) and which is reflected > >> in efi.memmap by __efi_memmap_init. > >> At this point efi.memmap.nr_map becomes 46 in the test machine. > >> - efi_free_boot_services which also reduces the number of memory regions > >> available (for e.g. if md->type or md->attribute is not the right value). > >> At this point efi.memmap.nr_map becomes 9 in the test machine. > >> Hence when you kexec into a new kernel and pass efi.memmap, the > >> paramaters that are compared are: > >> efi.memmap.nr_map=9, > >> memory_attributes_table->num_entries=48, > >> desc_size = 48 > >> where the check in [1] is no longer valid with such a low efi.memmap.nr_map > >> as it was reduced due to efi_map_regions and efi_free_boot_services. > >> > >> A more appropriate check is to see if the description size reported by > >> efi and memory attributes table is the same. > >> > >> [1] https://lore.kernel.org/all/20241031175822.2952471-2-ardb+git@google.com/ > >> > >> Fixes: 8fbe4c49c0cc ("efi/memattr: Ignore table if the size is clearly bogus") > >> Reported-by: Breno Leitao <leitao@debian.org> > >> Signed-off-by: Usama Arif <usamaarif642@gmail.com> > >> --- > >> drivers/firmware/efi/memattr.c | 16 ++++++---------- > >> 1 file changed, 6 insertions(+), 10 deletions(-) > >> > > > > The more I think about this, the more I feel that kexec on x86 should > > simply discard this table, and run with the firmware code RWX (which > > is not the end of the world). > > > By discard this table, do you mean kexec not use e820_table_firmware? No, I mean kexec ignores the memory attributes table. > Also a very basic question, what do you mean by run with the firmware RWX? > The memory attributes table is an overlay for the EFI memory map that describes which runtime code regions may be mapped with restricted permissions. Without this table, everything will be mapped writable as well as executable, but only in the EFI page tables, which are only active when an EFI call is in progress. > Sorry for the very basic questions above! > No worries.
On 10/01/2025 07:21, Ard Biesheuvel wrote: > On Thu, 9 Jan 2025 at 17:36, Usama Arif <usamaarif642@gmail.com> wrote: >> >> >> >> On 09/01/2025 15:45, Ard Biesheuvel wrote: >>> On Wed, 8 Jan 2025 at 23:00, Usama Arif <usamaarif642@gmail.com> wrote: >>>> >>>> The commit in [1] introduced a check to see if EFI memory attributes >>>> table was corrupted. It assumed that efi.memmap.nr_map remains >>>> constant, but it changes during late boot. >>>> Hence, the check is valid during cold boot, but not in the subsequent >>>> kexec boot. >>>> >>>> This is best explained with an exampled. At cold boot, for a test >>>> machine: >>>> efi.memmap.nr_map=91, >>>> memory_attributes_table->num_entries=48, >>>> desc_size = 48 >>>> Hence, the check introduced in [1] where 3x the size of the >>>> entire EFI memory map is a reasonable upper bound for the size of this >>>> table is valid. >>>> >>>> In late boot __efi_enter_virtual_mode calls 2 functions that updates >>>> efi.memmap.nr_map: >>>> - efi_map_regions which reduces the `count` of map entries >>>> (for e.g. if should_map_region returns false) and which is reflected >>>> in efi.memmap by __efi_memmap_init. >>>> At this point efi.memmap.nr_map becomes 46 in the test machine. >>>> - efi_free_boot_services which also reduces the number of memory regions >>>> available (for e.g. if md->type or md->attribute is not the right value). >>>> At this point efi.memmap.nr_map becomes 9 in the test machine. >>>> Hence when you kexec into a new kernel and pass efi.memmap, the >>>> paramaters that are compared are: >>>> efi.memmap.nr_map=9, >>>> memory_attributes_table->num_entries=48, >>>> desc_size = 48 >>>> where the check in [1] is no longer valid with such a low efi.memmap.nr_map >>>> as it was reduced due to efi_map_regions and efi_free_boot_services. >>>> >>>> A more appropriate check is to see if the description size reported by >>>> efi and memory attributes table is the same. >>>> >>>> [1] https://lore.kernel.org/all/20241031175822.2952471-2-ardb+git@google.com/ >>>> >>>> Fixes: 8fbe4c49c0cc ("efi/memattr: Ignore table if the size is clearly bogus") >>>> Reported-by: Breno Leitao <leitao@debian.org> >>>> Signed-off-by: Usama Arif <usamaarif642@gmail.com> >>>> --- >>>> drivers/firmware/efi/memattr.c | 16 ++++++---------- >>>> 1 file changed, 6 insertions(+), 10 deletions(-) >>>> >>> >>> The more I think about this, the more I feel that kexec on x86 should >>> simply discard this table, and run with the firmware code RWX (which >>> is not the end of the world). >> >> >> By discard this table, do you mean kexec not use e820_table_firmware? > > No, I mean kexec ignores the memory attributes table. > >> Also a very basic question, what do you mean by run with the firmware RWX? >> > > The memory attributes table is an overlay for the EFI memory map that > describes which runtime code regions may be mapped with restricted > permissions. Without this table, everything will be mapped writable as > well as executable, but only in the EFI page tables, which are only > active when an EFI call is in progress. > Thanks for explaining! So basically get rid of memattr.c :) Do you mean get rid of it only for kexec, or not do it for any boot (including cold boot)? I do like this idea! I couldn't find this in the git history, but do you know if this was added in the linux kernel just because EFI spec added support for it, or if there was a specific security problem? Thanks!
On Fri, 10 Jan 2025 at 11:53, Usama Arif <usamaarif642@gmail.com> wrote: > > > > On 10/01/2025 07:21, Ard Biesheuvel wrote: > > On Thu, 9 Jan 2025 at 17:36, Usama Arif <usamaarif642@gmail.com> wrote: > >> > >> > >> > >> On 09/01/2025 15:45, Ard Biesheuvel wrote: > >>> On Wed, 8 Jan 2025 at 23:00, Usama Arif <usamaarif642@gmail.com> wrote: > >>>> > >>>> The commit in [1] introduced a check to see if EFI memory attributes > >>>> table was corrupted. It assumed that efi.memmap.nr_map remains > >>>> constant, but it changes during late boot. > >>>> Hence, the check is valid during cold boot, but not in the subsequent > >>>> kexec boot. > >>>> > >>>> This is best explained with an exampled. At cold boot, for a test > >>>> machine: > >>>> efi.memmap.nr_map=91, > >>>> memory_attributes_table->num_entries=48, > >>>> desc_size = 48 > >>>> Hence, the check introduced in [1] where 3x the size of the > >>>> entire EFI memory map is a reasonable upper bound for the size of this > >>>> table is valid. > >>>> > >>>> In late boot __efi_enter_virtual_mode calls 2 functions that updates > >>>> efi.memmap.nr_map: > >>>> - efi_map_regions which reduces the `count` of map entries > >>>> (for e.g. if should_map_region returns false) and which is reflected > >>>> in efi.memmap by __efi_memmap_init. > >>>> At this point efi.memmap.nr_map becomes 46 in the test machine. > >>>> - efi_free_boot_services which also reduces the number of memory regions > >>>> available (for e.g. if md->type or md->attribute is not the right value). > >>>> At this point efi.memmap.nr_map becomes 9 in the test machine. > >>>> Hence when you kexec into a new kernel and pass efi.memmap, the > >>>> paramaters that are compared are: > >>>> efi.memmap.nr_map=9, > >>>> memory_attributes_table->num_entries=48, > >>>> desc_size = 48 > >>>> where the check in [1] is no longer valid with such a low efi.memmap.nr_map > >>>> as it was reduced due to efi_map_regions and efi_free_boot_services. > >>>> > >>>> A more appropriate check is to see if the description size reported by > >>>> efi and memory attributes table is the same. > >>>> > >>>> [1] https://lore.kernel.org/all/20241031175822.2952471-2-ardb+git@google.com/ > >>>> > >>>> Fixes: 8fbe4c49c0cc ("efi/memattr: Ignore table if the size is clearly bogus") > >>>> Reported-by: Breno Leitao <leitao@debian.org> > >>>> Signed-off-by: Usama Arif <usamaarif642@gmail.com> > >>>> --- > >>>> drivers/firmware/efi/memattr.c | 16 ++++++---------- > >>>> 1 file changed, 6 insertions(+), 10 deletions(-) > >>>> > >>> > >>> The more I think about this, the more I feel that kexec on x86 should > >>> simply discard this table, and run with the firmware code RWX (which > >>> is not the end of the world). > >> > >> > >> By discard this table, do you mean kexec not use e820_table_firmware? > > > > No, I mean kexec ignores the memory attributes table. > > > >> Also a very basic question, what do you mean by run with the firmware RWX? > >> > > > > The memory attributes table is an overlay for the EFI memory map that > > describes which runtime code regions may be mapped with restricted > > permissions. Without this table, everything will be mapped writable as > > well as executable, but only in the EFI page tables, which are only > > active when an EFI call is in progress. > > > > Thanks for explaining! > > So basically get rid of memattr.c :) > > Do you mean get rid of it only for kexec, or not do it for any > boot (including cold boot)? Only for kexec, and only on x86 > I do like this idea! I couldn't find this in the git history, > but do you know if this was added in the linux kernel just > because EFI spec added support for it, or if there was a > specific security problem? > Mapping memory RWX is generally a bad idea, so we should avoid it if possible. But EFI runtime memory regions are only mapped during a EFI runtime call, and these don't happen often at all, so the benefit is only marginal. (In the early days of EFI, it was more common for the OS to map these regions permanently, but we stopped doing that a long time ago)
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h index d33ccbc4a2c6..a1a956f2d963 100644 --- a/drivers/firmware/efi/libstub/efistub.h +++ b/drivers/firmware/efi/libstub/efistub.h @@ -1143,6 +1143,7 @@ efi_enable_reset_attack_mitigation(void) { } #endif void efi_retrieve_eventlog(void); +void efi_mem_attr_init(void); struct screen_info *alloc_screen_info(void); struct screen_info *__alloc_screen_info(void); diff --git a/drivers/firmware/efi/libstub/mem.c b/drivers/firmware/efi/libstub/mem.c index 4f1fa302234d..c5b60aea342e 100644 --- a/drivers/firmware/efi/libstub/mem.c +++ b/drivers/firmware/efi/libstub/mem.c @@ -128,3 +128,35 @@ void efi_free(unsigned long size, unsigned long addr) nr_pages = round_up(size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE; efi_bs_call(free_pages, addr, nr_pages); } + +void efi_mem_attr_init(void) +{ + efi_guid_t linux_mem_attr_guid = EFI_MEMORY_ATTRIBUTES_TABLE_GUID; + efi_memory_attributes_table_t *tbl = NULL; + efi_status_t status; + unsigned long size; + + tbl = get_efi_config_table(linux_mem_attr_guid); + efi_err("KKK tbl %lx tbl_>version=%d, tbl->num_entries %d tbl->desc_size %d\n", tbl, tbl->version, tbl->num_entries, tbl->desc_size); + + size = tbl->num_entries * tbl->desc_size; + status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY, + sizeof(*tbl) + size, (void **)&tbl); + + if (status != EFI_SUCCESS) { + efi_err("Unable to allocate memory for event log\n"); + return; + } + + status = efi_bs_call(install_configuration_table, + &linux_mem_attr_guid, tbl); + + if (status != EFI_SUCCESS) + efi_err("Unable to install configuration table to update memory type\n"); + efi_bs_call(free_pool, tbl); + + /* verify if its the same table */ + tbl = get_efi_config_table(linux_mem_attr_guid); + efi_err("KKK2 tbl %lx tbl_>version=%d, tbl->num_entries %d tbl->desc_size %d\n", tbl, tbl->version, tbl->num_entries, tbl->desc_size); + +} diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index f8e465da344d..c0c3d278451d 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -1036,6 +1036,8 @@ void __noreturn efi_stub_entry(efi_handle_t handle, efi_retrieve_eventlog(); + efi_mem_attr_init(); + setup_graphics(boot_params); setup_efi_pci(boot_params);