diff mbox series

efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption

Message ID 20240912155159.1951792-2-ardb+git@google.com
State New
Headers show
Series efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption | expand

Commit Message

Ard Biesheuvel Sept. 12, 2024, 3:52 p.m. UTC
From: Ard Biesheuvel <ardb@kernel.org>

The TPM event log table is a Linux specific construct, where the data
produced by the GetEventLog() boot service is cached in memory, and
passed on to the OS using a EFI configuration table.

The use of EFI_LOADER_DATA here results in the region being left
unreserved in the E820 memory map constructed by the EFI stub, and this
is the memory description that is passed on to the incoming kernel by
kexec, which is therefore unaware that the region should be reserved.

Even though the utility of the TPM2 event log after a kexec is
questionable, any corruption might send the parsing code off into the
weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
instead, which is always treated as reserved by the E820 conversion
logic.

Cc: <stable@vger.kernel.org>
Reported-by: Breno Leitao <leitao@debian.org>
Tested-by: Usama Arif <usamaarif642@gmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 drivers/firmware/efi/libstub/tpm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ilias Apalodimas Sept. 13, 2024, 6:27 a.m. UTC | #1
On Thu, 12 Sept 2024 at 18:52, Ard Biesheuvel <ardb+git@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The TPM event log table is a Linux specific construct, where the data
> produced by the GetEventLog() boot service is cached in memory, and
> passed on to the OS using a EFI configuration table.

an EFI*

>
> The use of EFI_LOADER_DATA here results in the region being left
> unreserved in the E820 memory map constructed by the EFI stub, and this
> is the memory description that is passed on to the incoming kernel by
> kexec, which is therefore unaware that the region should be reserved.
>
> Even though the utility of the TPM2 event log after a kexec is
> questionable, any corruption might send the parsing code off into the
> weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
> instead, which is always treated as reserved by the E820 conversion
> logic.
>
> Cc: <stable@vger.kernel.org>
> Reported-by: Breno Leitao <leitao@debian.org>
> Tested-by: Usama Arif <usamaarif642@gmail.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  drivers/firmware/efi/libstub/tpm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/efi/libstub/tpm.c
> index df3182f2e63a..1fd6823248ab 100644
> --- a/drivers/firmware/efi/libstub/tpm.c
> +++ b/drivers/firmware/efi/libstub/tpm.c
> @@ -96,7 +96,7 @@ static void efi_retrieve_tcg2_eventlog(int version, efi_physical_addr_t log_loca
>         }
>
>         /* Allocate space for the logs and copy them. */
> -       status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
> +       status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
>                              sizeof(*log_tbl) + log_size, (void **)&log_tbl);
>
>         if (status != EFI_SUCCESS) {
> --
> 2.46.0.662.g92d0881bb0-goog
>
>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Breno Leitao Sept. 13, 2024, 10 a.m. UTC | #2
Hello Ard,

Thanks for the fix!

On Thu, Sep 12, 2024 at 05:52:00PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> The TPM event log table is a Linux specific construct, where the data
> produced by the GetEventLog() boot service is cached in memory, and
> passed on to the OS using a EFI configuration table.
> 
> The use of EFI_LOADER_DATA here results in the region being left
> unreserved in the E820 memory map constructed by the EFI stub, and this
> is the memory description that is passed on to the incoming kernel by
> kexec, which is therefore unaware that the region should be reserved.
> 
> Even though the utility of the TPM2 event log after a kexec is
> questionable, any corruption might send the parsing code off into the
> weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
> instead, which is always treated as reserved by the E820 conversion
> logic.
> 
> Cc: <stable@vger.kernel.org>
> Reported-by: Breno Leitao <leitao@debian.org>
> Tested-by: Usama Arif <usamaarif642@gmail.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Tested-by: Breno Leitao <leitao@debian.org>

For stable, having the fixes will help stable folks to decide where to
pick. Based on investigation, this was introduced by 33b6d03469b22, so,
we might want to have the following line:

Fixes: 33b6d03469b22 ("efi: call get_event_log before ExitBootServices")

Thanks again!
--breno
Jiri Slaby Oct. 24, 2024, 4:20 p.m. UTC | #3
On 12. 09. 24, 17:52, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> The TPM event log table is a Linux specific construct, where the data
> produced by the GetEventLog() boot service is cached in memory, and
> passed on to the OS using a EFI configuration table.
> 
> The use of EFI_LOADER_DATA here results in the region being left
> unreserved in the E820 memory map constructed by the EFI stub, and this
> is the memory description that is passed on to the incoming kernel by
> kexec, which is therefore unaware that the region should be reserved.
> 
> Even though the utility of the TPM2 event log after a kexec is
> questionable, any corruption might send the parsing code off into the
> weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
> instead, which is always treated as reserved by the E820 conversion
> logic.
> 
> Cc: <stable@vger.kernel.org>
> Reported-by: Breno Leitao <leitao@debian.org>
> Tested-by: Usama Arif <usamaarif642@gmail.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>   drivers/firmware/efi/libstub/tpm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/efi/libstub/tpm.c
> index df3182f2e63a..1fd6823248ab 100644
> --- a/drivers/firmware/efi/libstub/tpm.c
> +++ b/drivers/firmware/efi/libstub/tpm.c
> @@ -96,7 +96,7 @@ static void efi_retrieve_tcg2_eventlog(int version, efi_physical_addr_t log_loca
>   	}
>   
>   	/* Allocate space for the logs and copy them. */
> -	status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
> +	status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
>   			     sizeof(*log_tbl) + log_size, (void **)&log_tbl);

Hi,

this, for some reason, corrupts system configuration table. On good 
boots, memattr points to 0x77535018, on bad boots (this commit applied), 
it points to 0x77526018.

And the good content at 0x77526018:
tab=0x77526018 size=16+45*48=0x0000000000000880

bad content at 0x77535018:
tab=0x77535018 size=16+2*1705353216=0x00000000cb4b4010

This happens only on cold boots. Subsequent boots (having the commit or 
not) are all fine.

Any ideas?

DMI: Dell Inc. Latitude 7290/09386V, BIOS 1.39.0 07/04/2024

This was reported downstream at:
https://bugzilla.suse.com/show_bug.cgi?id=1231465

thanks,
Jiri Slaby Oct. 25, 2024, 5:07 a.m. UTC | #4
On 24. 10. 24, 18:20, Jiri Slaby wrote:
> On 12. 09. 24, 17:52, Ard Biesheuvel wrote:
>> From: Ard Biesheuvel <ardb@kernel.org>
>>
>> The TPM event log table is a Linux specific construct, where the data
>> produced by the GetEventLog() boot service is cached in memory, and
>> passed on to the OS using a EFI configuration table.
>>
>> The use of EFI_LOADER_DATA here results in the region being left
>> unreserved in the E820 memory map constructed by the EFI stub, and this
>> is the memory description that is passed on to the incoming kernel by
>> kexec, which is therefore unaware that the region should be reserved.
>>
>> Even though the utility of the TPM2 event log after a kexec is
>> questionable, any corruption might send the parsing code off into the
>> weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
>> instead, which is always treated as reserved by the E820 conversion
>> logic.
>>
>> Cc: <stable@vger.kernel.org>
>> Reported-by: Breno Leitao <leitao@debian.org>
>> Tested-by: Usama Arif <usamaarif642@gmail.com>
>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>> ---
>>   drivers/firmware/efi/libstub/tpm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/ 
>> efi/libstub/tpm.c
>> index df3182f2e63a..1fd6823248ab 100644
>> --- a/drivers/firmware/efi/libstub/tpm.c
>> +++ b/drivers/firmware/efi/libstub/tpm.c
>> @@ -96,7 +96,7 @@ static void efi_retrieve_tcg2_eventlog(int version, 
>> efi_physical_addr_t log_loca
>>       }
>>       /* Allocate space for the logs and copy them. */
>> -    status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
>> +    status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
>>                    sizeof(*log_tbl) + log_size, (void **)&log_tbl);
> 
> Hi,
> 
> this, for some reason, corrupts system configuration table. On good 
> boots, memattr points to 0x77535018, on bad boots (this commit applied), 
> it points to 0x77526018.
> 
> And the good content at 0x77526018:
> tab=0x77526018 size=16+45*48=0x0000000000000880
> 
> bad content at 0x77535018:
> tab=0x77535018 size=16+2*1705353216=0x00000000cb4b4010
> 
> This happens only on cold boots. Subsequent boots (having the commit or 
> not) are all fine.
> 
> Any ideas?

====
EFI_ACPI_RECLAIM_MEMORY

This memory is to be preserved by the UEFI OS loader and OS until ACPI
is enabled. Once ACPI is enabled, the memory in this range is available 
for general use.
====

BTW doesn't the above mean it is released by the time TPM actually reads it?

Isn't the proper fix to actually memblock_reserve() that TPM portion. 
The same as memattr in efi_memattr_init()?

> DMI: Dell Inc. Latitude 7290/09386V, BIOS 1.39.0 07/04/2024
> 
> This was reported downstream at:
> https://bugzilla.suse.com/show_bug.cgi?id=1231465
> 
> thanks,
Jiri Slaby Oct. 25, 2024, 5:09 a.m. UTC | #5
On 25. 10. 24, 7:07, Jiri Slaby wrote:
> On 24. 10. 24, 18:20, Jiri Slaby wrote:
>> On 12. 09. 24, 17:52, Ard Biesheuvel wrote:
>>> From: Ard Biesheuvel <ardb@kernel.org>
>>>
>>> The TPM event log table is a Linux specific construct, where the data
>>> produced by the GetEventLog() boot service is cached in memory, and
>>> passed on to the OS using a EFI configuration table.
>>>
>>> The use of EFI_LOADER_DATA here results in the region being left
>>> unreserved in the E820 memory map constructed by the EFI stub, and this
>>> is the memory description that is passed on to the incoming kernel by
>>> kexec, which is therefore unaware that the region should be reserved.
>>>
>>> Even though the utility of the TPM2 event log after a kexec is
>>> questionable, any corruption might send the parsing code off into the
>>> weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
>>> instead, which is always treated as reserved by the E820 conversion
>>> logic.
>>>
>>> Cc: <stable@vger.kernel.org>
>>> Reported-by: Breno Leitao <leitao@debian.org>
>>> Tested-by: Usama Arif <usamaarif642@gmail.com>
>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>>> ---
>>>   drivers/firmware/efi/libstub/tpm.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/ 
>>> efi/libstub/tpm.c
>>> index df3182f2e63a..1fd6823248ab 100644
>>> --- a/drivers/firmware/efi/libstub/tpm.c
>>> +++ b/drivers/firmware/efi/libstub/tpm.c
>>> @@ -96,7 +96,7 @@ static void efi_retrieve_tcg2_eventlog(int version, 
>>> efi_physical_addr_t log_loca
>>>       }
>>>       /* Allocate space for the logs and copy them. */
>>> -    status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
>>> +    status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
>>>                    sizeof(*log_tbl) + log_size, (void **)&log_tbl);
>>
>> Hi,
>>
>> this, for some reason, corrupts system configuration table. On good 
>> boots, memattr points to 0x77535018, on bad boots (this commit 
>> applied), it points to 0x77526018.
>>
>> And the good content at 0x77526018:
>> tab=0x77526018 size=16+45*48=0x0000000000000880
>>
>> bad content at 0x77535018:
>> tab=0x77535018 size=16+2*1705353216=0x00000000cb4b4010
>>
>> This happens only on cold boots. Subsequent boots (having the commit 
>> or not) are all fine.
>>
>> Any ideas?
> 
> ====
> EFI_ACPI_RECLAIM_MEMORY
> 
> This memory is to be preserved by the UEFI OS loader and OS until ACPI
> is enabled. Once ACPI is enabled, the memory in this range is available 
> for general use.
> ====
> 
> BTW doesn't the above mean it is released by the time TPM actually reads 
> it?
> 
> Isn't the proper fix to actually memblock_reserve() that TPM portion. 
> The same as memattr in efi_memattr_init()?

And this is actually done in efi_tpm_eventlog_init().

>> DMI: Dell Inc. Latitude 7290/09386V, BIOS 1.39.0 07/04/2024
>>
>> This was reported downstream at:
>> https://bugzilla.suse.com/show_bug.cgi?id=1231465
>>
>> thanks,
Ard Biesheuvel Oct. 25, 2024, 7:30 a.m. UTC | #6
On Fri, 25 Oct 2024 at 07:09, Jiri Slaby <jirislaby@kernel.org> wrote:
>
> On 25. 10. 24, 7:07, Jiri Slaby wrote:
> > On 24. 10. 24, 18:20, Jiri Slaby wrote:
> >> On 12. 09. 24, 17:52, Ard Biesheuvel wrote:
> >>> From: Ard Biesheuvel <ardb@kernel.org>
> >>>
> >>> The TPM event log table is a Linux specific construct, where the data
> >>> produced by the GetEventLog() boot service is cached in memory, and
> >>> passed on to the OS using a EFI configuration table.
> >>>
> >>> The use of EFI_LOADER_DATA here results in the region being left
> >>> unreserved in the E820 memory map constructed by the EFI stub, and this
> >>> is the memory description that is passed on to the incoming kernel by
> >>> kexec, which is therefore unaware that the region should be reserved.
> >>>
> >>> Even though the utility of the TPM2 event log after a kexec is
> >>> questionable, any corruption might send the parsing code off into the
> >>> weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
> >>> instead, which is always treated as reserved by the E820 conversion
> >>> logic.
> >>>
> >>> Cc: <stable@vger.kernel.org>
> >>> Reported-by: Breno Leitao <leitao@debian.org>
> >>> Tested-by: Usama Arif <usamaarif642@gmail.com>
> >>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> >>> ---
> >>>   drivers/firmware/efi/libstub/tpm.c | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/
> >>> efi/libstub/tpm.c
> >>> index df3182f2e63a..1fd6823248ab 100644
> >>> --- a/drivers/firmware/efi/libstub/tpm.c
> >>> +++ b/drivers/firmware/efi/libstub/tpm.c
> >>> @@ -96,7 +96,7 @@ static void efi_retrieve_tcg2_eventlog(int version,
> >>> efi_physical_addr_t log_loca
> >>>       }
> >>>       /* Allocate space for the logs and copy them. */
> >>> -    status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
> >>> +    status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
> >>>                    sizeof(*log_tbl) + log_size, (void **)&log_tbl);
> >>
> >> Hi,
> >>
> >> this, for some reason, corrupts system configuration table. On good
> >> boots, memattr points to 0x77535018, on bad boots (this commit
> >> applied), it points to 0x77526018.
> >>
> >> And the good content at 0x77526018:
> >> tab=0x77526018 size=16+45*48=0x0000000000000880
> >>
> >> bad content at 0x77535018:
> >> tab=0x77535018 size=16+2*1705353216=0x00000000cb4b4010
> >>
> >> This happens only on cold boots. Subsequent boots (having the commit
> >> or not) are all fine.
> >>
> >> Any ideas?
> >
> > ====
> > EFI_ACPI_RECLAIM_MEMORY
> >
> > This memory is to be preserved by the UEFI OS loader and OS until ACPI
> > is enabled. Once ACPI is enabled, the memory in this range is available
> > for general use.
> > ====
> >
> > BTW doesn't the above mean it is released by the time TPM actually reads
> > it?
> >
> > Isn't the proper fix to actually memblock_reserve() that TPM portion.
> > The same as memattr in efi_memattr_init()?
>
> And this is actually done in efi_tpm_eventlog_init().
>

EFI_ACPI_RECLAIM_MEMORY may be reclaimed by the OS, but we never
actually do that in Linux.

To me, it seems like the use of EFI_ACPI_RECLAIM_MEMORY in this case
simply tickles a bug in the firmware that causes it to corrupt the
memory attributes table. The fact that cold boot behaves differently
is a strong indicator here.

I didn't see the results of the memory attribute table dumps on the
bugzilla thread, but dumping this table from EFI is not very useful
because it will get regenerated/updated at ExitBootServices() time.
Unfortunately, that also takes away the console so capturing the state
of that table before the EFI stub boots the kernel is not an easy
thing to do.

Is the memattr table completely corrupted? It also has a version
field, and only versions 1 and 2 are defined so we might use that to
detect corruption.
Usama Arif Oct. 25, 2024, 1:27 p.m. UTC | #7
On 25/10/2024 06:09, Jiri Slaby wrote:
> On 25. 10. 24, 7:07, Jiri Slaby wrote:
>> On 24. 10. 24, 18:20, Jiri Slaby wrote:
>>> On 12. 09. 24, 17:52, Ard Biesheuvel wrote:
>>>> From: Ard Biesheuvel <ardb@kernel.org>
>>>>
>>>> The TPM event log table is a Linux specific construct, where the data
>>>> produced by the GetEventLog() boot service is cached in memory, and
>>>> passed on to the OS using a EFI configuration table.
>>>>
>>>> The use of EFI_LOADER_DATA here results in the region being left
>>>> unreserved in the E820 memory map constructed by the EFI stub, and this
>>>> is the memory description that is passed on to the incoming kernel by
>>>> kexec, which is therefore unaware that the region should be reserved.
>>>>
>>>> Even though the utility of the TPM2 event log after a kexec is
>>>> questionable, any corruption might send the parsing code off into the
>>>> weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY
>>>> instead, which is always treated as reserved by the E820 conversion
>>>> logic.
>>>>
>>>> Cc: <stable@vger.kernel.org>
>>>> Reported-by: Breno Leitao <leitao@debian.org>
>>>> Tested-by: Usama Arif <usamaarif642@gmail.com>
>>>> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>>>> ---
>>>>   drivers/firmware/efi/libstub/tpm.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/ efi/libstub/tpm.c
>>>> index df3182f2e63a..1fd6823248ab 100644
>>>> --- a/drivers/firmware/efi/libstub/tpm.c
>>>> +++ b/drivers/firmware/efi/libstub/tpm.c
>>>> @@ -96,7 +96,7 @@ static void efi_retrieve_tcg2_eventlog(int version, efi_physical_addr_t log_loca
>>>>       }
>>>>       /* Allocate space for the logs and copy them. */
>>>> -    status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
>>>> +    status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
>>>>                    sizeof(*log_tbl) + log_size, (void **)&log_tbl);
>>>
>>> Hi,
>>>
>>> this, for some reason, corrupts system configuration table. On good boots, memattr points to 0x77535018, on bad boots (this commit applied), it points to 0x77526018.
>>>
>>> And the good content at 0x77526018:
>>> tab=0x77526018 size=16+45*48=0x0000000000000880
>>>
>>> bad content at 0x77535018:
>>> tab=0x77535018 size=16+2*1705353216=0x00000000cb4b4010
>>>
>>> This happens only on cold boots. Subsequent boots (having the commit or not) are all fine.
>>>
>>> Any ideas?
>>
>> ====
>> EFI_ACPI_RECLAIM_MEMORY
>>
>> This memory is to be preserved by the UEFI OS loader and OS until ACPI
>> is enabled. Once ACPI is enabled, the memory in this range is available for general use.
>> ====
>>
>> BTW doesn't the above mean it is released by the time TPM actually reads it?
>>
>> Isn't the proper fix to actually memblock_reserve() that TPM portion. The same as memattr in efi_memattr_init()?
> 
> And this is actually done in efi_tpm_eventlog_init().
> 
>>> DMI: Dell Inc. Latitude 7290/09386V, BIOS 1.39.0 07/04/2024
>>>
>>> This was reported downstream at:
>>> https://bugzilla.suse.com/show_bug.cgi?id=1231465
>>>
>>> thanks,
> 

Could you share the e820 map, reserve setup_data and TPMEventLog address with and without the patch?
All of these should be just be in the dmesg.

Thanks,
Usama
Jiri Slaby Oct. 30, 2024, 5:25 a.m. UTC | #8
On 25. 10. 24, 15:27, Usama Arif wrote:
> Could you share the e820 map, reserve setup_data and TPMEventLog address with and without the patch?
> All of these should be just be in the dmesg.

It's shared in the aforementioned bug [1] already.

6.11.2 dmesg (bad run):
https://bugzilla.suse.com/attachment.cgi?id=877874

6.12-rc2 dmesg (good run):
https://bugzilla.suse.com/attachment.cgi?id=877887

FWIW from https://bugzilla.suse.com/attachment.cgi?id=878051:
good TPMEventLog=0x682aa018
bad  TPMEventLog=0x65a6b018

[1] https://bugzilla.suse.com/show_bug.cgi?id=1231465

wdiff of e820:
wdiff -n bad good |colordiff
BIOS-e820: [mem 0x0000000000000000-0x0000000000057fff] usable
BIOS-e820: [mem 0x0000000000058000-0x0000000000058fff] reserved
BIOS-e820: [mem 0x0000000000059000-0x000000000009efff] usable
BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
BIOS-e820: [mem [-0x0000000000100000-0x0000000065a6efff]-] 
{+0x0000000000100000-0x00000000682abfff]+} usable
BIOS-e820: [mem [-0x0000000065a6f000-0x0000000065a7dfff]-] 
{+0x00000000682ac000-0x00000000682bafff]+} ACPI data
BIOS-e820: [mem [-0x0000000065a7e000-0x000000006a5acfff]-] 
{+0x00000000682bb000-0x000000006a5acfff]+} usable
BIOS-e820: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
BIOS-e820: [mem 0x000000006a5ae000-0x000000006a5aefff] reserved
BIOS-e820: [mem 0x000000006a5af000-0x0000000079e83fff] usable
BIOS-e820: [mem 0x0000000079e84000-0x000000007a246fff] reserved
BIOS-e820: [mem 0x000000007a247000-0x000000007a28efff] ACPI data
BIOS-e820: [mem 0x000000007a28f000-0x000000007abf0fff] ACPI NVS
BIOS-e820: [mem 0x000000007abf1000-0x000000007b5fefff] reserved
BIOS-e820: [mem 0x000000007b5ff000-0x000000007b5fffff] usable
BIOS-e820: [mem 0x000000007b600000-0x000000007f7fffff] reserved
BIOS-e820: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
BIOS-e820: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x0000000100000000-0x000000087e7fffff] usable
NX (Execute Disable) protection: active
APIC: Static calls initialized
e820: update [mem [-0x65a5e018-0x65a6e457]-] {+0x6829b018-0x682ab457]+} 
usable ==> usable
extended physical RAM map:
reserve setup_data: [mem 0x0000000000000000-0x0000000000057fff] usable
reserve setup_data: [mem 0x0000000000058000-0x0000000000058fff] reserved
reserve setup_data: [mem 0x0000000000059000-0x000000000009efff] usable
reserve setup_data: [mem 0x000000000009f000-0x00000000000fffff] reserved
reserve setup_data: [mem [-0x0000000000100000-0x0000000065a5e017]-] 
{+0x0000000000100000-0x000000006829b017]+} usable
reserve setup_data: [mem [-0x0000000065a5e018-0x0000000065a6e457]-] 
{+0x000000006829b018-0x00000000682ab457]+} usable
reserve setup_data: [mem [-0x0000000065a6e458-0x0000000065a6efff]-] 
{+0x00000000682ab458-0x00000000682abfff]+} usable
reserve setup_data: [mem [-0x0000000065a6f000-0x0000000065a7dfff]-] 
{+0x00000000682ac000-0x00000000682bafff]+} ACPI data
reserve setup_data: [mem [-0x0000000065a7e000-0x000000006a5acfff]-] 
{+0x00000000682bb000-0x000000006a5acfff]+} usable
reserve setup_data: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
reserve setup_data: [mem 0x000000006a5ae000-0x000000006a5aefff] reserved
reserve setup_data: [mem 0x000000006a5af000-0x0000000079e83fff] usable
reserve setup_data: [mem 0x0000000079e84000-0x000000007a246fff] reserved
reserve setup_data: [mem 0x000000007a247000-0x000000007a28efff] ACPI data
reserve setup_data: [mem 0x000000007a28f000-0x000000007abf0fff] ACPI NVS
reserve setup_data: [mem 0x000000007abf1000-0x000000007b5fefff] reserved
reserve setup_data: [mem 0x000000007b5ff000-0x000000007b5fffff] usable
reserve setup_data: [mem 0x000000007b600000-0x000000007f7fffff] reserved
reserve setup_data: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
reserve setup_data: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
reserve setup_data: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
reserve setup_data: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
reserve setup_data: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
reserve setup_data: [mem 0x0000000100000000-0x000000087e7fffff] usable
efi: EFI v2.6 by American Megatrends
efi: ACPI=0x7a255000 ACPI 2.0=0x7a255000 SMBIOS=0x7b140000 SMBIOS 
3.0=0x7b13f000 TPMFinalLog=0x7a892000 ESRT=0x7b0deb18 
[-MEMATTR=0x77535018-] {+MEMATTR=0x77526018+} MOKvar=0x7b13e000 
RNG=0x7a254018 [-TPMEventLog=0x65a6f018-] {+TPMEventLog=0x682ac018+}


thanks,
Usama Arif Oct. 30, 2024, 5:13 p.m. UTC | #9
On 30/10/2024 05:25, Jiri Slaby wrote:
> On 25. 10. 24, 15:27, Usama Arif wrote:
>> Could you share the e820 map, reserve setup_data and TPMEventLog address with and without the patch?
>> All of these should be just be in the dmesg.
> 
> It's shared in the aforementioned bug [1] already.
> 
> 6.11.2 dmesg (bad run):
> https://bugzilla.suse.com/attachment.cgi?id=877874
> 
> 6.12-rc2 dmesg (good run):
> https://bugzilla.suse.com/attachment.cgi?id=877887
> 
> FWIW from https://bugzilla.suse.com/attachment.cgi?id=878051:
> good TPMEventLog=0x682aa018
> bad  TPMEventLog=0x65a6b018
> 
> [1] https://bugzilla.suse.com/show_bug.cgi?id=1231465
> 
> wdiff of e820:
> wdiff -n bad good |colordiff
> BIOS-e820: [mem 0x0000000000000000-0x0000000000057fff] usable
> BIOS-e820: [mem 0x0000000000058000-0x0000000000058fff] reserved
> BIOS-e820: [mem 0x0000000000059000-0x000000000009efff] usable
> BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
> BIOS-e820: [mem [-0x0000000000100000-0x0000000065a6efff]-] {+0x0000000000100000-0x00000000682abfff]+} usable
> BIOS-e820: [mem [-0x0000000065a6f000-0x0000000065a7dfff]-] {+0x00000000682ac000-0x00000000682bafff]+} ACPI data
> BIOS-e820: [mem [-0x0000000065a7e000-0x000000006a5acfff]-] {+0x00000000682bb000-0x000000006a5acfff]+} usable
> BIOS-e820: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
> BIOS-e820: [mem 0x000000006a5ae000-0x000000006a5aefff] reserved
> BIOS-e820: [mem 0x000000006a5af000-0x0000000079e83fff] usable
> BIOS-e820: [mem 0x0000000079e84000-0x000000007a246fff] reserved
> BIOS-e820: [mem 0x000000007a247000-0x000000007a28efff] ACPI data
> BIOS-e820: [mem 0x000000007a28f000-0x000000007abf0fff] ACPI NVS
> BIOS-e820: [mem 0x000000007abf1000-0x000000007b5fefff] reserved
> BIOS-e820: [mem 0x000000007b5ff000-0x000000007b5fffff] usable
> BIOS-e820: [mem 0x000000007b600000-0x000000007f7fffff] reserved
> BIOS-e820: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
> BIOS-e820: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
> BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
> BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
> BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
> BIOS-e820: [mem 0x0000000100000000-0x000000087e7fffff] usable
> NX (Execute Disable) protection: active
> APIC: Static calls initialized
> e820: update [mem [-0x65a5e018-0x65a6e457]-] {+0x6829b018-0x682ab457]+} usable ==> usable
> extended physical RAM map:
> reserve setup_data: [mem 0x0000000000000000-0x0000000000057fff] usable
> reserve setup_data: [mem 0x0000000000058000-0x0000000000058fff] reserved
> reserve setup_data: [mem 0x0000000000059000-0x000000000009efff] usable
> reserve setup_data: [mem 0x000000000009f000-0x00000000000fffff] reserved
> reserve setup_data: [mem [-0x0000000000100000-0x0000000065a5e017]-] {+0x0000000000100000-0x000000006829b017]+} usable
> reserve setup_data: [mem [-0x0000000065a5e018-0x0000000065a6e457]-] {+0x000000006829b018-0x00000000682ab457]+} usable
> reserve setup_data: [mem [-0x0000000065a6e458-0x0000000065a6efff]-] {+0x00000000682ab458-0x00000000682abfff]+} usable
> reserve setup_data: [mem [-0x0000000065a6f000-0x0000000065a7dfff]-] {+0x00000000682ac000-0x00000000682bafff]+} ACPI data
> reserve setup_data: [mem [-0x0000000065a7e000-0x000000006a5acfff]-] {+0x00000000682bb000-0x000000006a5acfff]+} usable
> reserve setup_data: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
> reserve setup_data: [mem 0x000000006a5ae000-0x000000006a5aefff] reserved
> reserve setup_data: [mem 0x000000006a5af000-0x0000000079e83fff] usable
> reserve setup_data: [mem 0x0000000079e84000-0x000000007a246fff] reserved
> reserve setup_data: [mem 0x000000007a247000-0x000000007a28efff] ACPI data
> reserve setup_data: [mem 0x000000007a28f000-0x000000007abf0fff] ACPI NVS
> reserve setup_data: [mem 0x000000007abf1000-0x000000007b5fefff] reserved
> reserve setup_data: [mem 0x000000007b5ff000-0x000000007b5fffff] usable
> reserve setup_data: [mem 0x000000007b600000-0x000000007f7fffff] reserved
> reserve setup_data: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
> reserve setup_data: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
> reserve setup_data: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
> reserve setup_data: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
> reserve setup_data: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
> reserve setup_data: [mem 0x0000000100000000-0x000000087e7fffff] usable
> efi: EFI v2.6 by American Megatrends
> efi: ACPI=0x7a255000 ACPI 2.0=0x7a255000 SMBIOS=0x7b140000 SMBIOS 3.0=0x7b13f000 TPMFinalLog=0x7a892000 ESRT=0x7b0deb18 [-MEMATTR=0x77535018-] {+MEMATTR=0x77526018+} MOKvar=0x7b13e000 RNG=0x7a254018 [-TPMEventLog=0x65a6f018-] {+TPMEventLog=0x682ac018+}
> 
> 
> thanks,

Thanks for sharing this.

This looks a bit weird for me.

The issue this patch was trying to fix was TPMEventLog being overwritten during kexec.
We are using efi libstub.
Without this patch we would see
BIOS-e820: [mem 0x0000000000100000-0x0000000064763fff] usable 
TPMEventLog=0x5ed47018
i.e. TPMEventLog was usable memory and therefore was prone to corruption during kexec.

With this patch 
BIOS-e820: [mem 0x00000000a8c01000-0x00000000a8cebfff] ACPI data
TPMEventLog=0xa8ca8018 
i.e.  TPMEventLog is reserved as ACPI data, hence cant be corrupted during kexec.


In your case, from the logs you shared, good run without the patch:
[    0.000000] [      T0] BIOS-e820: [mem 0x0000000065a6f000-0x0000000065a7dfff] ACPI data
[    0.000000] [      T0] BIOS-e820: [mem 0x0000000065a7e000-0x000000006a5acfff] usable
[    0.000000] [      T0] BIOS-e820: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
TPMEventLog=0x65a6f018 
bad run with the patch:
[    0.000000] [      T0] BIOS-e820: [mem 0x00000000682ac000-0x00000000682bafff] ACPI data
[    0.000000] [      T0] BIOS-e820: [mem 0x00000000682bb000-0x000000006a5acfff] usable
[    0.000000] [      T0] BIOS-e820: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
TPMEventLog=0x682ac018
Both with and without the fix, the TPMEventLog is part of ACPI data.

It means your firmware has already marked that area as ACPI data. Are you using efi/libstub?

Thanks,
Usama
Usama Arif Oct. 30, 2024, 6:24 p.m. UTC | #10
On 30/10/2024 18:02, Gregory Price wrote:
> On Wed, Oct 30, 2024 at 05:13:14PM +0000, Usama Arif wrote:
>>
>>
>> On 30/10/2024 05:25, Jiri Slaby wrote:
>>> On 25. 10. 24, 15:27, Usama Arif wrote:
>>>> Could you share the e820 map, reserve setup_data and TPMEventLog address with and without the patch?
>>>> All of these should be just be in the dmesg.
>>>
>>> It's shared in the aforementioned bug [1] already.
>>>
>>> 6.11.2 dmesg (bad run):
>>> https://bugzilla.suse.com/attachment.cgi?id=877874
>>>
>>> 6.12-rc2 dmesg (good run):
>>> https://bugzilla.suse.com/attachment.cgi?id=877887
>>>
>>> FWIW from https://bugzilla.suse.com/attachment.cgi?id=878051:
>>> good TPMEventLog=0x682aa018
>>> bad  TPMEventLog=0x65a6b018
>>>
>>> [1] https://bugzilla.suse.com/show_bug.cgi?id=1231465
>>>
> ... snip ...
>>> efi: EFI v2.6 by American Megatrends
>>> efi: ACPI=0x7a255000 ACPI 2.0=0x7a255000 SMBIOS=0x7b140000 SMBIOS 3.0=0x7b13f000 TPMFinalLog=0x7a892000 ESRT=0x7b0deb18 [-MEMATTR=0x77535018-] {+MEMATTR=0x77526018+} MOKvar=0x7b13e000 RNG=0x7a254018 [-TPMEventLog=0x65a6f018-] {+TPMEventLog=0x682ac018+}
>>>
>>>
>>> thanks,
>>
>> Thanks for sharing this.
>>
>> This looks a bit weird for me.
>>
>> The issue this patch was trying to fix was TPMEventLog being overwritten during kexec.
>> We are using efi libstub.
>> Without this patch we would see
>> BIOS-e820: [mem 0x0000000000100000-0x0000000064763fff] usable 
>> TPMEventLog=0x5ed47018
>> i.e. TPMEventLog was usable memory and therefore was prone to corruption during kexec.
>>
>> With this patch 
>> BIOS-e820: [mem 0x00000000a8c01000-0x00000000a8cebfff] ACPI data
>> TPMEventLog=0xa8ca8018 
>> i.e.  TPMEventLog is reserved as ACPI data, hence cant be corrupted during kexec.
>>
>>
>> In your case, from the logs you shared, good run without the patch:
>> [    0.000000] [      T0] BIOS-e820: [mem 0x0000000065a6f000-0x0000000065a7dfff] ACPI data
>> [    0.000000] [      T0] BIOS-e820: [mem 0x0000000065a7e000-0x000000006a5acfff] usable
>> [    0.000000] [      T0] BIOS-e820: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
>> TPMEventLog=0x65a6f018 
>> bad run with the patch:
>> [    0.000000] [      T0] BIOS-e820: [mem 0x00000000682ac000-0x00000000682bafff] ACPI data
>> [    0.000000] [      T0] BIOS-e820: [mem 0x00000000682bb000-0x000000006a5acfff] usable
>> [    0.000000] [      T0] BIOS-e820: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
>> TPMEventLog=0x682ac018
>> Both with and without the fix, the TPMEventLog is part of ACPI data.
>>
> 
> Just wondering - why would the TPM log move a total of ~40MB between COLD boots.
> 
> I would expect this location to be relatively fixed (give or take a small amount of
> memory) - especially since it's so early in boot.
> 

This is a good point. I think if we suspect this patch, it would be good to compare
dmesg of kernel cold boot with this patch vs the commit just before this patch, rather
than comparing with 6.12.

Thanks

>> It means your firmware has already marked that area as ACPI data. Are you using efi/libstub?
>>
>> Thanks,
>> Usama
>>
>>
>>
>>
Gregory Price Oct. 30, 2024, 6:26 p.m. UTC | #11
On Wed, Oct 30, 2024 at 05:13:14PM +0000, Usama Arif wrote:
> 
> 
> On 30/10/2024 05:25, Jiri Slaby wrote:
> > On 25. 10. 24, 15:27, Usama Arif wrote:
> >> Could you share the e820 map, reserve setup_data and TPMEventLog address with and without the patch?
> >> All of these should be just be in the dmesg.
> > efi: EFI v2.6 by American Megatrends

Tossing in another observation - the AMI EFI we've been working with has been

EFI v2.8 by American Megatrends
or 
EFI v2.9 by American Megatrends

We have not seen this particular behavior (cold boot corruption issues) on top
of these version.  Might be worth investigating this issue.

you may also want to investigate this patch set:

https://lore.kernel.org/all/20240913231954.20081-1-gourry@gourry.net/

which I believe would have caught your "eat all memory" sign extention issue.

This is queued up for v6.13 i think - but possibly 1/4 deserves a stable mark.

~Gregory
Ard Biesheuvel Oct. 30, 2024, 7:43 p.m. UTC | #12
On Wed, 30 Oct 2024 at 19:26, Gregory Price <gourry@gourry.net> wrote:
>
> On Wed, Oct 30, 2024 at 05:13:14PM +0000, Usama Arif wrote:
> >
> >
> > On 30/10/2024 05:25, Jiri Slaby wrote:
> > > On 25. 10. 24, 15:27, Usama Arif wrote:
> > >> Could you share the e820 map, reserve setup_data and TPMEventLog address with and without the patch?
> > >> All of these should be just be in the dmesg.
> > > efi: EFI v2.6 by American Megatrends
>
> Tossing in another observation - the AMI EFI we've been working with has been
>
> EFI v2.8 by American Megatrends
> or
> EFI v2.9 by American Megatrends
>
> We have not seen this particular behavior (cold boot corruption issues) on top
> of these version.  Might be worth investigating this issue.
>
> you may also want to investigate this patch set:
>
> https://lore.kernel.org/all/20240913231954.20081-1-gourry@gourry.net/
>
> which I believe would have caught your "eat all memory" sign extention issue.
>
> This is queued up for v6.13 i think - but possibly 1/4 deserves a stable mark.
>

To me, it does not seem obvious at all that the TPM code is the
culprit here. The firmware produces a corrupted memory attributes
table now that the EFI stub uses ACPI reclaim memory for the TPM event
log, but to me, it smells like a firmware issue, not an issue in the
EFI stub. (Pool allocations can trigger page allocations, affecting
the layout of the EFI memory map).

So let's keep an open mind here, and not stare ourselves blind on the
TPM event log code.
Gregory Price Oct. 30, 2024, 8:30 p.m. UTC | #13
On Wed, Oct 30, 2024 at 08:43:01PM +0100, Ard Biesheuvel wrote:
> On Wed, 30 Oct 2024 at 19:26, Gregory Price <gourry@gourry.net> wrote:
> >
> > On Wed, Oct 30, 2024 at 05:13:14PM +0000, Usama Arif wrote:
> > >
> > >
> > > On 30/10/2024 05:25, Jiri Slaby wrote:
> > > > On 25. 10. 24, 15:27, Usama Arif wrote:
> > > >> Could you share the e820 map, reserve setup_data and TPMEventLog address with and without the patch?
> > > >> All of these should be just be in the dmesg.
> > > > efi: EFI v2.6 by American Megatrends
> >
> > Tossing in another observation - the AMI EFI we've been working with has been
> >
> > EFI v2.8 by American Megatrends
> > or
> > EFI v2.9 by American Megatrends
> >
> > We have not seen this particular behavior (cold boot corruption issues) on top
> > of these version.  Might be worth investigating this issue.
> >
> > you may also want to investigate this patch set:
> >
> > https://lore.kernel.org/all/20240913231954.20081-1-gourry@gourry.net/
> >
> > which I believe would have caught your "eat all memory" sign extention issue.
> >
> > This is queued up for v6.13 i think - but possibly 1/4 deserves a stable mark.
> >
> 
> To me, it does not seem obvious at all that the TPM code is the
> culprit here. The firmware produces a corrupted memory attributes
> table now that the EFI stub uses ACPI reclaim memory for the TPM event
> log, but to me, it smells like a firmware issue, not an issue in the
> EFI stub. (Pool allocations can trigger page allocations, affecting
> the layout of the EFI memory map).
> 
> So let's keep an open mind here, and not stare ourselves blind on the
> TPM event log code.

Agreed, just working backward a bit - this definitely feels like something
is wrong in the state of Denm... firmware.

~Gregory
Jiri Slaby Oct. 31, 2024, 7:55 a.m. UTC | #14
On 25. 10. 24, 9:30, Ard Biesheuvel wrote:
> To me, it seems like the use of EFI_ACPI_RECLAIM_MEMORY in this case
> simply tickles a bug in the firmware that causes it to corrupt the
> memory attributes table. The fact that cold boot behaves differently
> is a strong indicator here.
> 
> I didn't see the results of the memory attribute table dumps on the
> bugzilla thread, but dumping this table from EFI is not very useful
> because it will get regenerated/updated at ExitBootServices() time.
> Unfortunately, that also takes away the console so capturing the state
> of that table before the EFI stub boots the kernel is not an easy
> thing to do.
> 
> Is the memattr table completely corrupted? It also has a version
> field, and only versions 1 and 2 are defined so we might use that to
> detect corruption.

So from a today test:
https://bugzilla.suse.com/attachment.cgi?id=878296

 > efi: memattr: efi_memattr_init: tab=0x7752f018 ver=1 
size=16+2*1705287680=0x00000000cb494010

version is NOT corrupted :).

thanks,
Jiri Slaby Oct. 31, 2024, 8:19 a.m. UTC | #15
On 30. 10. 24, 19:26, Gregory Price wrote:
> you may also want to investigate this patch set:
> 
> https://lore.kernel.org/all/20240913231954.20081-1-gourry@gourry.net/
> 
> which I believe would have caught your "eat all memory" sign extention issue.

But it's memattr what is corrupted, not tpmeventlog.

The reporter is using this for some time and it indeed works:
--- a/drivers/firmware/efi/memattr.c
+++ b/drivers/firmware/efi/memattr.c
@@ -12,7 +12,7 @@

  #include <asm/early_ioremap.h>

-static int __initdata tbl_size;
+static phys_addr_t __initdata tbl_size;
  unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR;

  /*
Jiri Slaby Oct. 31, 2024, 8:38 a.m. UTC | #16
On 30. 10. 24, 19:24, Usama Arif wrote:
> This is a good point. I think if we suspect this patch, it would be good to compare
> dmesg of kernel cold boot with this patch vs the commit just before this patch, rather
> than comparing with 6.12.

Here it goes then:

dmesg for 2e6871a632a99d9 (bad -- patch applied):
https://bugzilla.suse.com/attachment.cgi?id=878297

dmesg for bc2846c7ab029f0 (good -- before the patch, i.e. 2e6871a632a99d9^):
https://bugzilla.suse.com/attachment.cgi?id=878298

(6.11-stable SHAs ^^^)

$ wdiff -n good bad |colordiff
BIOS-e820: [mem 0x0000000000000000-0x0000000000057fff] usable
BIOS-e820: [mem 0x0000000000058000-0x0000000000058fff] reserved
BIOS-e820: [mem 0x0000000000059000-0x000000000009efff] usable
BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
BIOS-e820: [mem [-0x0000000000100000-0x000000006a5acfff]-] 
{+0x0000000000100000-0x0000000066a17fff] usable+}
{+BIOS-e820: [mem 0x0000000066a18000-0x0000000066a27fff] ACPI data+}
{+BIOS-e820: [mem 0x0000000066a28000-0x000000006a5acfff]+} usable
BIOS-e820: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
BIOS-e820: [mem 0x000000006a5ae000-0x000000006a5aefff] reserved
BIOS-e820: [mem 0x000000006a5af000-0x0000000079e83fff] usable
BIOS-e820: [mem 0x0000000079e84000-0x000000007a246fff] reserved
BIOS-e820: [mem 0x000000007a247000-0x000000007a28efff] ACPI data
BIOS-e820: [mem 0x000000007a28f000-0x000000007abf0fff] ACPI NVS
BIOS-e820: [mem 0x000000007abf1000-0x000000007b5fefff] reserved
BIOS-e820: [mem 0x000000007b5ff000-0x000000007b5fffff] usable
BIOS-e820: [mem 0x000000007b600000-0x000000007f7fffff] reserved
BIOS-e820: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
BIOS-e820: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x0000000100000000-0x000000087e7fffff] usable
NX (Execute Disable) protection: active
APIC: Static calls initialized
e820: update [mem 0x66a07018-0x66a17457] usable ==> usable
extended physical RAM map:
reserve setup_data: [mem 0x0000000000000000-0x0000000000057fff] usable
reserve setup_data: [mem 0x0000000000058000-0x0000000000058fff] reserved
reserve setup_data: [mem 0x0000000000059000-0x000000000009efff] usable
reserve setup_data: [mem 0x000000000009f000-0x00000000000fffff] reserved
reserve setup_data: [mem 0x0000000000100000-0x0000000066a07017] usable
reserve setup_data: [mem 0x0000000066a07018-0x0000000066a17457] usable
reserve setup_data: [mem [-0x0000000066a17458-0x000000006a5acfff]-] 
{+0x0000000066a17458-0x0000000066a17fff] usable+}
{+reserve setup_data: [mem 0x0000000066a18000-0x0000000066a27fff] ACPI 
data+}
{+reserve setup_data: [mem 0x0000000066a28000-0x000000006a5acfff]+} usable
reserve setup_data: [mem 0x000000006a5ad000-0x000000006a5adfff] ACPI NVS
reserve setup_data: [mem 0x000000006a5ae000-0x000000006a5aefff] reserved
reserve setup_data: [mem 0x000000006a5af000-0x0000000079e83fff] usable
reserve setup_data: [mem 0x0000000079e84000-0x000000007a246fff] reserved
reserve setup_data: [mem 0x000000007a247000-0x000000007a28efff] ACPI data
reserve setup_data: [mem 0x000000007a28f000-0x000000007abf0fff] ACPI NVS
reserve setup_data: [mem 0x000000007abf1000-0x000000007b5fefff] reserved
reserve setup_data: [mem 0x000000007b5ff000-0x000000007b5fffff] usable
reserve setup_data: [mem 0x000000007b600000-0x000000007f7fffff] reserved
reserve setup_data: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
reserve setup_data: [mem 0x00000000fe000000-0x00000000fe010fff] reserved
reserve setup_data: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
reserve setup_data: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
reserve setup_data: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
reserve setup_data: [mem 0x0000000100000000-0x000000087e7fffff] usable
efi: EFI v2.6 by American Megatrends
efi: ACPI=0x7a255000 ACPI 2.0=0x7a255000 SMBIOS=0x7b140000 SMBIOS 
3.0=0x7b13f000 TPMFinalLog=0x7a892000 ESRT=0x7b0deb18 
[-MEMATTR=0x77536298-] {+MEMATTR=0x7752f018+} MOKvar=0x7b13e000 
RNG=0x7a254018 TPMEventLog=0x66a18018

thanks,
Ard Biesheuvel Oct. 31, 2024, 9:04 a.m. UTC | #17
On Thu, 31 Oct 2024 at 08:55, Jiri Slaby <jirislaby@kernel.org> wrote:
>
> On 25. 10. 24, 9:30, Ard Biesheuvel wrote:
> > To me, it seems like the use of EFI_ACPI_RECLAIM_MEMORY in this case
> > simply tickles a bug in the firmware that causes it to corrupt the
> > memory attributes table. The fact that cold boot behaves differently
> > is a strong indicator here.
> >
> > I didn't see the results of the memory attribute table dumps on the
> > bugzilla thread, but dumping this table from EFI is not very useful
> > because it will get regenerated/updated at ExitBootServices() time.
> > Unfortunately, that also takes away the console so capturing the state
> > of that table before the EFI stub boots the kernel is not an easy
> > thing to do.
> >
> > Is the memattr table completely corrupted? It also has a version
> > field, and only versions 1 and 2 are defined so we might use that to
> > detect corruption.
>
> So from a today test:
> https://bugzilla.suse.com/attachment.cgi?id=878296
>
>  > efi: memattr: efi_memattr_init: tab=0x7752f018 ver=1
> size=16+2*1705287680=0x00000000cb494010
>
> version is NOT corrupted :).
>

OK, so the struct looks like this

typedef struct {
        u32 version;
        u32 num_entries;
        u32 desc_size;
        u32 flags;
        efi_memory_desc_t entry[];
} efi_memory_attributes_table_t;

and in the correct case, num_entries == 45 and desc_size == 48.

It is quite easy to sanity check this structure: desc_size should be
equal to the desc_size in the memory map, and num_entries can never
exceed 2x the number of entries in the EFI memory map.

I'll go and implement something that performs the check right after
ExitBootServices(), and just drops the table if it is bogus (it isn't
that important anyway)
diff mbox series

Patch

diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/efi/libstub/tpm.c
index df3182f2e63a..1fd6823248ab 100644
--- a/drivers/firmware/efi/libstub/tpm.c
+++ b/drivers/firmware/efi/libstub/tpm.c
@@ -96,7 +96,7 @@  static void efi_retrieve_tcg2_eventlog(int version, efi_physical_addr_t log_loca
 	}
 
 	/* Allocate space for the logs and copy them. */
-	status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
+	status = efi_bs_call(allocate_pool, EFI_ACPI_RECLAIM_MEMORY,
 			     sizeof(*log_tbl) + log_size, (void **)&log_tbl);
 
 	if (status != EFI_SUCCESS) {