
[RFC,0/3] efi: Implement generic zboot support

Message ID 20230416120729.2470762-1-ardb@kernel.org

Message

Ard Biesheuvel April 16, 2023, 12:07 p.m. UTC
This series is a proof-of-concept that implements support for the EFI
zboot decompressor for x86. It replaces the ordinary decompressor, and
instead, performs the decompression, KASLR randomization and the 4/5
level paging switch while running in the execution context of EFI.

This simplifies things substantially, and makes it straight-forward to
abide by stricter future requirements related to the use of writable and
executable memory under EFI, which will come into effect on x86 systems
that are certified as being 'more secure', and ship with an even shinier
Windows sticker.

This is an alternative approach to the work being proposed by Evgeniy [0]
that makes rather radical changes to the existing decompressor, which
has accumulated too many features already, e.g., related to confidential
compute.

EFI zboot images can be booted in two ways:
- by EFI firmware, which loads and starts it as an ordinary EFI
  application, just like the existing EFI stub (with which it shares
  most of its code);
- by a non-EFI loader that parses the image header for the compression
  metadata, and decompresses the image into memory and boots it.
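For illustration, a minimal sketch of the header parsing a non-EFI loader would do in the second case. The field offsets (the "MZ" and "zimg" tags, the little-endian payload offset and size, and a 32-byte compression type string) are an assumption about the zboot header layout, and the names zboot_info and zboot_parse are hypothetical; check the layout against the kernel tree before relying on it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header size: everything up to the end of the compression string. */
#define ZBOOT_HDR_SIZE 56

/* Hypothetical result type for this sketch. */
struct zboot_info {
	uint32_t payload_offset;   /* offset of the compressed payload */
	uint32_t payload_size;     /* size of the compressed payload */
	char     compression[33];  /* NUL-terminated type, e.g. "gzip" */
};

/* Read a 32-bit little-endian value regardless of host endianness. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills *out if img looks like a zboot image, -1 otherwise. */
int zboot_parse(const uint8_t *img, size_t len, struct zboot_info *out)
{
	if (len < ZBOOT_HDR_SIZE)
		return -1;

	/* Assumed layout: "MZ" at offset 0, "zimg" at offset 4. */
	if (memcmp(img, "MZ", 2) != 0 || memcmp(img + 4, "zimg", 4) != 0)
		return -1;

	out->payload_offset = get_le32(img + 8);
	out->payload_size = get_le32(img + 12);
	memcpy(out->compression, img + 24, 32);
	out->compression[32] = '\0';

	/* Reject payloads that do not fit inside the file. */
	if (out->payload_offset > len ||
	    out->payload_size > len - out->payload_offset)
		return -1;

	return 0;
}
```

A loader would then hand the payload to the decompressor named by the compression string and boot the result.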

Realistically, the second option is unlikely to ever be used on x86,
given that it already has its existing bzImage, but the first option is
a good choice for distros that target EFI boot only (and some distros
switched to this format already for arm64). The fact that EFI zboot is
implemented in the same way on arm64, RISC-V, LoongArch and [shortly]
ARM helps with maintenance, not only of the kernel itself, but also the
tooling around it relating to kexec, code signing, deployment, etc.

Series can be pulled from [1], which contains some prerequisite patches
that are only tangentially related.

[0] https://lore.kernel.org/all/cover.1678785672.git.baskov@ispras.ru/
[1] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=efi-x86-zboot

Cc: Evgeniy Baskov <baskov@ispras.ru>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Peter Jones <pjones@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>

Ard Biesheuvel (3):
  efi/libstub: x86: Split off pieces shared with zboot
  efi/zboot: x86: Implement EFI zboot support
  efi/zboot: x86: Clear NX restrictions on populated code regions

 arch/x86/Makefile                              |  18 +-
 arch/x86/include/asm/efi.h                     |  10 +
 arch/x86/kernel/head_64.S                      |  15 +
 arch/x86/zboot/Makefile                        |  29 +
 drivers/firmware/efi/Kconfig                   |   2 +-
 drivers/firmware/efi/libstub/Makefile          |  15 +-
 drivers/firmware/efi/libstub/Makefile.zboot    |   2 +-
 drivers/firmware/efi/libstub/efi-stub-helper.c |   3 +
 drivers/firmware/efi/libstub/x86-stub.c        | 592 +------------------
 drivers/firmware/efi/libstub/x86-zboot.c       | 322 ++++++++++
 drivers/firmware/efi/libstub/x86.c             | 612 ++++++++++++++++++++
 drivers/firmware/efi/libstub/zboot.c           |   3 +-
 drivers/firmware/efi/libstub/zboot.lds         |   5 +
 13 files changed, 1031 insertions(+), 597 deletions(-)
 create mode 100644 arch/x86/zboot/Makefile
 create mode 100644 drivers/firmware/efi/libstub/x86-zboot.c
 create mode 100644 drivers/firmware/efi/libstub/x86.c

Comments

Evgeniy Baskov April 18, 2023, 2:10 p.m. UTC | #1
On 2023-04-16 15:07, Ard Biesheuvel wrote:
> This series is a proof-of-concept that implements support for the EFI
> zboot decompressor for x86. It replaces the ordinary decompressor, and
> instead, performs the decompression, KASLR randomization and the 4/5
> level paging switch while running in the execution context of EFI.
> 
> This simplifies things substantially, and makes it straight-forward to
> abide by stricter future requirements related to the use of writable and
> executable memory under EFI, which will come into effect on x86 systems
> that are certified as being 'more secure', and ship with an even shinier
> Windows sticker.
> 
> This is an alternative approach to the work being proposed by Evgeny [0]
> that makes rather radical changes to the existing decompressor, which
> has accumulated too many features already, e.g., related to confidential
> compute etc.
> 
> EFI zboot images can be booted in two ways:
> - by EFI firmware, which loads and starts it as an ordinary EFI
>   application, just like the existing EFI stub (with which it shares
>   most of its code);
> - by a non-EFI loader that parses the image header for the compression
>   metadata, and decompresses the image into memory and boots it.
> 
> Realistically, the second option is unlikely to ever be used on x86,
> given that it already has its existing bzImage, but the first option is
> a good choice for distros that target EFI boot only (and some distros
> switched to this format already for arm64). The fact that EFI zboot is
> implemented in the same way on arm64, RISC-V, LoongArch and [shortly]
> ARM helps with maintenance, not only of the kernel itself, but also the
> tooling around it relating to kexec, code signing, deployment, etc.
> 
> Series can be pulled from [1], which contains some prerequisite patches
> that are only tangentially related.

I considered a similar approach when I was writing my series. It looks
great if you set backward compatibility aside, especially because of the
increased code sharing and the use of a code path without all the legacy
stuff. But I think the zboot approach has two downsides:

* Most distros won't use it because of backward compatibility, so they
  won't benefit from the improvements.
* Distros that could potentially use it have to be either DIY-ish like
  Gentoo, or provide both kernels during installation. So it complicates
  either the installation process or the installer logic.

It might work for UEFI-only distros, but I think those won't be a majority
on x86 for a while yet, because it's very likely that a lot of people
will complain if a distro provides a zboot-only kernel (consider that it
was the same story with the handover protocol). Backward compatibility is
evil.

So I think that, at least for now, it would still be better to change the
existing extraction code and stay compatible, despite that being harder
and less clean...

(zboot also lacks support for some kernel options, like the ones used for
tweaking the memory map, mixed mode support, and probably the handling of
CONFIG_MEMORY_HOTREMOVE, but this is an RFC, so this comment is largely
irrelevant for now.)

Thanks,
Evgeniy Baskov

Dave Young April 19, 2023, 2:56 a.m. UTC | #2
Hi,

[Resending this reply since I mistakenly sent an HTML version; apologies
to those who received it twice.]
On 04/16/23 at 02:07pm, Ard Biesheuvel wrote:
> This series is a proof-of-concept that implements support for the EFI
> zboot decompressor for x86. It replaces the ordinary decompressor, and
> instead, performs the decompression, KASLR randomization and the 4/5
> level paging switch while running in the execution context of EFI.
> 
> This simplifies things substantially, and makes it straight-forward to
> abide by stricter future requirements related to the use of writable and
> executable memory under EFI, which will come into effect on x86 systems
> that are certified as being 'more secure', and ship with an even shinier
> Windows sticker.
> 
> This is an alternative approach to the work being proposed by Evgeny [0]
> that makes rather radical changes to the existing decompressor, which
> has accumulated too many features already, e.g., related to confidential
> compute etc.
> 
> EFI zboot images can be booted in two ways:
> - by EFI firmware, which loads and starts it as an ordinary EFI
>   application, just like the existing EFI stub (with which it shares
>   most of its code);
> - by a non-EFI loader that parses the image header for the compression
>   metadata, and decompresses the image into memory and boots it.
> 
> Realistically, the second option is unlikely to ever be used on x86,
> given that it already has its existing bzImage, but the first option is
> a good choice for distros that target EFI boot only (and some distros
> switched to this format already for arm64). The fact that EFI zboot is
> implemented in the same way on arm64, RISC-V, LoongArch and [shortly]
> ARM helps with maintenance, not only of the kernel itself, but also the
> tooling around it relating to kexec, code signing, deployment, etc.

Hi Ard, since kexec support is not ready yet, if there is no plan to add
it soon, how about changing the Kconfig so that zboot can only be enabled
when kexec is not enabled?

Gerd Hoffmann April 19, 2023, 5:54 a.m. UTC | #3
On Sun, Apr 16, 2023 at 02:07:26PM +0200, Ard Biesheuvel wrote:
> This series is a proof-of-concept that implements support for the EFI
> zboot decompressor for x86. It replaces the ordinary decompressor, and
> instead, performs the decompression, KASLR randomization and the 4/5
> level paging switch while running in the execution context of EFI.
> 
> This simplifies things substantially, and makes it straight-forward to
> abide by stricter future requirements related to the use of writable and
> executable memory under EFI, which will come into effect on x86 systems
> that are certified as being 'more secure', and ship with an even shinier
> Windows sticker.
> 
> This is an alternative approach to the work being proposed by Evgeny [0]
> that makes rather radical changes to the existing decompressor, which
> has accumulated too many features already, e.g., related to confidential
> compute etc.
> 
> EFI zboot images can be booted in two ways:
> - by EFI firmware, which loads and starts it as an ordinary EFI
>   application, just like the existing EFI stub (with which it shares
>   most of its code);
> - by a non-EFI loader that parses the image header for the compression
>   metadata, and decompresses the image into memory and boots it.

I like the idea to have all EFI archs handle compressed kernels the same
way.

But given that going EFI-only on x86 isn't a realistic option for
distros today this isn't really an alternative for Evgeny's patch
series, we have to fix the existing bzImage decompressor too.

> Realistically, the second option is unlikely to ever be used on x86,

What would be needed to do so?  Teach kexec-tools and grub2 to parse and
load zboot kernels, I guess?

take care,
  Gerd
Ard Biesheuvel April 19, 2023, 2:44 p.m. UTC | #4
On Wed, 19 Apr 2023 at 07:54, Gerd Hoffmann <kraxel@redhat.com> wrote:
>
> On Sun, Apr 16, 2023 at 02:07:26PM +0200, Ard Biesheuvel wrote:
> > This series is a proof-of-concept that implements support for the EFI
> > zboot decompressor for x86. It replaces the ordinary decompressor, and
> > instead, performs the decompression, KASLR randomization and the 4/5
> > level paging switch while running in the execution context of EFI.
> >
> > This simplifies things substantially, and makes it straight-forward to
> > abide by stricter future requirements related to the use of writable and
> > executable memory under EFI, which will come into effect on x86 systems
> > that are certified as being 'more secure', and ship with an even shinier
> > Windows sticker.
> >
> > This is an alternative approach to the work being proposed by Evgeny [0]
> > that makes rather radical changes to the existing decompressor, which
> > has accumulated too many features already, e.g., related to confidential
> > compute etc.
> >
> > EFI zboot images can be booted in two ways:
> > - by EFI firmware, which loads and starts it as an ordinary EFI
> >   application, just like the existing EFI stub (with which it shares
> >   most of its code);
> > - by a non-EFI loader that parses the image header for the compression
> >   metadata, and decompresses the image into memory and boots it.
>
> I like the idea to have all EFI archs handle compressed kernels the same
> way.
>
> But given that going EFI-only on x86 isn't a realistic option for
> distros today this isn't really an alternative for Evgeny's patch
> series, we have to fix the existing bzImage decompressor too.
>

I tend to agree, although some clarification would be helpful
regarding what is being fixed and why. I *think* I know, but I have
not been involved as deeply as some of the distro folks in making
these requirements explicit.

> Realistically, the second option is unlikely to ever be used on x86,
>
> What would be needed to do so?  Teach kexec-tools and grub2 parse and
> load zboot kernels I guess?
>

I already implemented this for mach-virt here, so we can load Fedora
kernels without firmware:

https://gitlab.com/qemu-project/qemu/-/commit/ff11422804cd03494cc98691eecd3909ea09ab6f

On arm64, this is probably more straight-forward, as the bare metal
image is already intended to be booted directly like that. However,
the x86 uncompressed image requires surprisingly little from all the
boot_params/setup_header cruft to actually boot, so perhaps there it
is easy too.

There is an unresolved issue related to kexec_load_file(), where only
the compressed image is signed, but the uncompressed image is what
ultimately gets booted, which either needs the decompression to occur
in the kernel, or a secondary signature that the kernel can verify
after the decompression happens in user space.

Dave and I have generated several ideas here, but there hasn't been
any progress towards a solution that seems palatable for upstream.
Gerd Hoffmann April 20, 2023, 6:07 a.m. UTC | #5
Hi,

> > Realistically, the second option is unlikely to ever be used on x86,
> >
> > What would be needed to do so?  Teach kexec-tools and grub2 parse and
> > load zboot kernels I guess?
> 
> I already implemented this for mach-virt here, so we can load Fedora
> kernels without firmware:
> 
> https://gitlab.com/qemu-project/qemu/-/commit/ff11422804cd03494cc98691eecd3909ea09ab6f
> 
> On arm64, this is probably more straight-forward, as the bare metal
> image is already intended to be booted directly like that. However,
> the x86 uncompressed image requires surprisingly little from all the
> boot_params/setup_header cruft to actually boot, so perhaps there it
> is easy too.

For existing boot loaders like grub I'd expect it to be easy
to code up: all the setup header code exists already, and grub also
has support for uncompressing stuff, so it should really only need
zboot header parsing and some plumbing to get things going (i.e.
have grub boot EFI zboot kernels in BIOS mode).

Disclaimer: didn't actually check grub source code.

I suspect the bigger problem wrt. grub is that getting patches merged
upstream is extremely slow and every distro carries a huge stack of
patches ...

take care,
  Gerd
Ard Biesheuvel April 20, 2023, 7:54 a.m. UTC | #6
On Thu, 20 Apr 2023 at 08:07, Gerd Hoffmann <kraxel@redhat.com> wrote:
>
>   Hi,
>
> > > Realistically, the second option is unlikely to ever be used on x86,
> > >
> > > What would be needed to do so?  Teach kexec-tools and grub2 parse and
> > > load zboot kernels I guess?
> >
> > I already implemented this for mach-virt here, so we can load Fedora
> > kernels without firmware:
> >
> > https://gitlab.com/qemu-project/qemu/-/commit/ff11422804cd03494cc98691eecd3909ea09ab6f
> >
> > On arm64, this is probably more straight-forward, as the bare metal
> > image is already intended to be booted directly like that. However,
> > the x86 uncompressed image requires surprisingly little from all the
> > boot_params/setup_header cruft to actually boot, so perhaps there it
> > is easy too.
>
> For existing boot loaders like grub I'd expect it being easy
> to code up, all the setup header code exists already, grub also
> has support for uncompressing stuff, so it should really be only
> zboot header parsing and some plumbing to get things going (i.e.
> have grub boot efi zboot kernels in bios mode).
>
> Disclaimer: didn't actually check grub source code.
>

I have :-(

> I suspect the bigger problem wrt. grub is that getting patches merged
> upstream is extremely slow and every distro carries a huge stack of
> patches ...
>

Yeah, Daniel has been asking me about LoadFile2 initrd loading support
for x86, so I think getting things merged is not going to be a problem
(although it will still take some time) - I can just implement it and
send it out at the same time.

But hacking/building/running GRUB is a rather painful experience, so I
have been kicking this can down the road.
Mario Limonciello April 20, 2023, 12:29 p.m. UTC | #7
On 4/20/23 02:54, Ard Biesheuvel wrote:
> On Thu, 20 Apr 2023 at 08:07, Gerd Hoffmann <kraxel@redhat.com> wrote:
>>    Hi,
>>
>>>> Realistically, the second option is unlikely to ever be used on x86,
>>>>
>>>> What would be needed to do so?  Teach kexec-tools and grub2 parse and
>>>> load zboot kernels I guess?
>>> I already implemented this for mach-virt here, so we can load Fedora
>>> kernels without firmware:
>>>
>>> https://gitlab.com/qemu-project/qemu/-/commit/ff11422804cd03494cc98691eecd3909ea09ab6f
>>>
>>> On arm64, this is probably more straight-forward, as the bare metal
>>> image is already intended to be booted directly like that. However,
>>> the x86 uncompressed image requires surprisingly little from all the
>>> boot_params/setup_header cruft to actually boot, so perhaps there it
>>> is easy too.
>> For existing boot loaders like grub I'd expect it being easy
>> to code up, all the setup header code exists already, grub also
>> has support for uncompressing stuff, so it should really be only
>> zboot header parsing and some plumbing to get things going (i.e.
>> have grub boot efi zboot kernels in bios mode).
>>
>> Disclaimer: didn't actually check grub source code.
>>
> I have :-(
>
>> I suspect the bigger problem wrt. grub is that getting patches merged
>> upstream is extremely slow and every distro carries a huge stack of
>> patches ...
>>
> Yeah, Daniel has been asking me about LoadFile2 initrd loading support
> for x86, so I think getting things merged is not going to be a problem
> (although it will still take some time) - I can just implement it and
> send it out at the same time.

If zboot ends up being the way to go, there would be quite a bit
of pressure to land the GRUB changes in distros because of the NX
requirements being pushed into the EFI ecosystem.

They wouldn't be able to work on systems that enforce NX, and the
number of such systems is anticipated to balloon.

>
> But hacking/building/running GRUB is a rather painful experience, so I
> have been kicking this can down the road.
Andy Lutomirski April 21, 2023, 1:29 p.m. UTC | #8
On Sun, Apr 16, 2023, at 5:07 AM, Ard Biesheuvel wrote:
> This series is a proof-of-concept that implements support for the EFI
> zboot decompressor for x86. It replaces the ordinary decompressor, and
> instead, performs the decompression, KASLR randomization and the 4/5
> level paging switch while running in the execution context of EFI.

I like the concept.  A couple high-level questions, since I haven’t dug into the code:

Could zboot and bzImage be built into the same kernel image?  That would get this into distros, and eventually someone could modify the legacy path to switch to long mode and invoke zboot (because zboot surely doesn’t need actual UEFI — just a sensible environment like what UEFI provides.)

Does zboot set up BSS correctly?  I once went down a rabbit hole trying to get the old decompressor to jump into the kernel with BSS already usable and zeroed, and the result was an incredible mess — IIRC the decompressor does some in-place shenanigans that looked incompatible with handling BSS without a rewrite. And so we clear BSS in C after jumping to the kernel, which is gross.

—Andy

Ard Biesheuvel April 21, 2023, 1:41 p.m. UTC | #9
On Fri, 21 Apr 2023 at 15:30, Andy Lutomirski <luto@kernel.org> wrote:
>
>
>
> On Sun, Apr 16, 2023, at 5:07 AM, Ard Biesheuvel wrote:
> > This series is a proof-of-concept that implements support for the EFI
> > zboot decompressor for x86. It replaces the ordinary decompressor, and
> > instead, performs the decompression, KASLR randomization and the 4/5
> > level paging switch while running in the execution context of EFI.
>
> I like the concept.  A couple high-level questions, since I haven’t dug into the code:
>
> Could zboot and bzImage be built into the same kernel image?  That would get this into distros, and eventually someone could modify the legacy path to switch to long mode and invoke zboot (because zboot surely doesn’t need actual UEFI — just a sensible environment like what UEFI provides.)
>

That's an interesting question, and to some extent, that is actually
what Evgeniy's patch does: execute more of what the decompressor does
from inside the EFI runtime context.

The main win with zboot imho is that we get rid of all the funky
heuristics that look for usable memory for trampolines and
decompression buffers, and instead just use the EFI APIs for
allocating pages and remapping them executable as needed (which is
the important piece here). I'd have to think about whether there is
any middle ground between this approach and Evgeniy's - I'll have to
get back to you on that.

> Does zboot set up BSS correctly?  I once went down a rabbit hole trying to get the old decompressor to jump into the kernel with BSS already usable and zeroed, and the result was an incredible mess — IIRC the decompressor does some in-place shenanigans that looked incompatible with handling BSS without a rewrite. And so we clear BSS in C after jumping to the kernel, which is gross.
>

Zboot pads the image to include BSS, so that the zboot metadata covers
the actual memory footprint of the image rather than just the image
size, and it will get zeroed out as a result of the decompression too,
which is a nice bonus. I did this mainly to try and make it idiot
proof for other (non-EFI) consumers of the zboot header and compressed
payload, but it means that the zboot EFI loader doesn't have to bother
either.
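
As a rough illustration of the padding described above: if the image is extended with zeros to cover BSS before it is compressed, then the decompressed size equals the memory footprint and a consumer gets a zeroed BSS for free. This is only a sketch under that assumption; pad_image_with_bss is a hypothetical helper, not a function from the series.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a zero-padded copy of the image that covers the file contents
 * plus BSS, and report the total footprint. The caller frees the buffer.
 * Returns NULL on overflow, on an empty result, or if allocation fails.
 */
uint8_t *pad_image_with_bss(const uint8_t *image, size_t image_size,
			    size_t bss_size, size_t *footprint)
{
	uint8_t *buf;

	if (bss_size > SIZE_MAX - image_size)
		return NULL;
	if (image_size + bss_size == 0)
		return NULL;

	/* calloc() zeroes the buffer, so the BSS tail starts out cleared. */
	buf = calloc(1, image_size + bss_size);
	if (!buf)
		return NULL;

	memcpy(buf, image, image_size);
	*footprint = image_size + bss_size;
	return buf;
}
```

Compressing the padded buffer instead of the raw image costs little, since a run of zeros compresses well.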
Andy Lutomirski May 3, 2023, 5:55 p.m. UTC | #10
On Fri, Apr 21, 2023, at 6:41 AM, Ard Biesheuvel wrote:
> On Fri, 21 Apr 2023 at 15:30, Andy Lutomirski <luto@kernel.org> wrote:
>>
>>
>>
>> On Sun, Apr 16, 2023, at 5:07 AM, Ard Biesheuvel wrote:
>> > This series is a proof-of-concept that implements support for the EFI
>> > zboot decompressor for x86. It replaces the ordinary decompressor, and
>> > instead, performs the decompression, KASLR randomization and the 4/5
>> > level paging switch while running in the execution context of EFI.
>>
>> I like the concept.  A couple high-level questions, since I haven’t dug into the code:
>>
>> Could zboot and bzImage be built into the same kernel image?  That would get this into distros, and eventually someone could modify the legacy path to switch to long mode and invoke zboot (because zboot surely doesn’t need actual UEFI — just a sensible environment like what UEFI provides.)
>>
>
> That's an interesting question, and to some extent, that is actually
> what Evgeny's patch does: execute more of what the decompressor does
> from inside the EFI runtime context.
>
> The main win with zboot imho is that we get rid of all the funky
> heuristics that look for usable memory for trampolines and
> decompression buffers in funky ways, and instead, just use the EFI
> APIs for allocating pages and remapping them executable as needed
> (which is the important piece here) I'd have to think about whether
> there is any middle ground between this approach and Evgeny's - I'll
> have to get back to you on that.
>

Hmm.  I dug the tiniest bit into the history.  The x86/boot/compressed stuff has an allocator!  It's this:

        free_mem_ptr     = heap;        /* Heap */
        free_mem_end_ptr = heap + BOOT_HEAP_SIZE;

plus a trivial and horrible malloc() implementation in include/linux/decompress/mm.h.  There's one caller in x86/boot/compressed.
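
For readers who have not looked at it, the scheme amounts to a bump
allocator over a fixed block. The following is a simplified sketch in that
spirit, not the kernel's code: HEAP_SIZE, heap_init, bump_malloc and
bump_free are hypothetical names, and the 8-byte alignment is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 4096              /* hypothetical; stands in for BOOT_HEAP_SIZE */

static unsigned char heap[HEAP_SIZE];   /* fixed, statically allocated block */
static uintptr_t free_mem_ptr;          /* next free byte */
static uintptr_t free_mem_end_ptr;      /* one past the end of the block */
static int malloc_count;                /* number of live allocations */

void heap_init(void)
{
    free_mem_ptr = (uintptr_t)heap;
    free_mem_end_ptr = (uintptr_t)heap + HEAP_SIZE;
    malloc_count = 0;
}

void *bump_malloc(size_t size)
{
    uintptr_t p;

    /* Round the cursor up to an 8-byte boundary. */
    free_mem_ptr = (free_mem_ptr + 7) & ~(uintptr_t)7;
    if (free_mem_ptr > free_mem_end_ptr ||
        size > free_mem_end_ptr - free_mem_ptr)
        return NULL;                /* out of space */

    p = free_mem_ptr;
    free_mem_ptr += size;
    malloc_count++;
    return (void *)p;
}

void bump_free(void *ptr)
{
    (void)ptr;                      /* individual blocks are never reused */
    assert(malloc_count > 0);

    /* Memory is only reclaimed once every allocation has been freed. */
    if (--malloc_count == 0)
        free_mem_ptr = (uintptr_t)heap;
}
```

Nothing is reclaimed until the last outstanding allocation is freed, which
is tolerable for a single short-lived caller but not for general use.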

And, once upon a time, the idea of allocating enough memory to store the kernel from the decompressor would have been a problem.  I'm willing to claim that we should not even try to support x86 systems that have that little memory (at least not once they've gotten long mode or at least flat 32-bit protected mode working).  We should not try to allocate below 1MB (my laptop will cry), but there's no need for that.

So maybe the middle ground is to build a modern, simple malloc(), and back it by EFI when EFI is there and by just finding some free memory when EFI is not there?

This would be risky -- someone might have a horrible machine that has trouble with a simple allocator.
Ard Biesheuvel May 3, 2023, 6:13 p.m. UTC | #11
On Wed, 3 May 2023 at 19:55, Andy Lutomirski <luto@kernel.org> wrote:
>
> On Fri, Apr 21, 2023, at 6:41 AM, Ard Biesheuvel wrote:
> > On Fri, 21 Apr 2023 at 15:30, Andy Lutomirski <luto@kernel.org> wrote:
> >>
> >>
> >>
> >> On Sun, Apr 16, 2023, at 5:07 AM, Ard Biesheuvel wrote:
> >> > This series is a proof-of-concept that implements support for the EFI
> >> > zboot decompressor for x86. It replaces the ordinary decompressor, and
> >> > instead, performs the decompression, KASLR randomization and the 4/5
> >> > level paging switch while running in the execution context of EFI.
> >>
> >> I like the concept.  A couple high-level questions, since I haven’t dug into the code:
> >>
> >> Could zboot and bzImage be built into the same kernel image?  That would get this into distros, and eventually someone could modify the legacy path to switch to long mode and invoke zboot (because zboot surely doesn’t need actual UEFI — just a sensible environment like what UEFI provides.)
> >>
> >
> > That's an interesting question, and to some extent, that is actually
> > what Evgeny's patch does: execute more of what the decompressor does
> > from inside the EFI runtime context.
> >
> > The main win with zboot imho is that we get rid of all the funky
> > heuristics that look for usable memory for trampolines and
> > decompression buffers in funky ways, and instead, just use the EFI
> > APIs for allocating pages and remapping them executable as needed
> > (which is the important piece here) I'd have to think about whether
> > there is any middle ground between this approach and Evgeny's - I'll
> > have to get back to you on that.
> >
>
> Hmm.  I dug the tiniest bit into the history.  The x86/boot/compressed stuff has an allocator!  It's this:
>
>         free_mem_ptr     = heap;        /* Heap */
>         free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
>
> plus a trivial and horrible malloc() implementation in include/linux/decompress/mm.h.  There's one caller in x86/boot/compressed.
>
> And, once upon a time, the idea of allocating enough memory to store the kernel from the decompressor would have been a problem.  I'm willing to claim that we should not even try to support x86 systems that have that little memory (at least not once they've gotten long mode or at least flat 32-bit protected mode working).  We should not try to allocate below 1MB (my laptop will cry), but there's no need for that.
>
> So maybe the middle ground is to build a modern, simple malloc(), and back it by EFI when EFI is there and by just finding some free memory when EFI is not there?
>
> This would be risky -- someone might have a horrible machine that has trouble with a simple allocator.

The malloc() is the least of our concerns, tbh. It is only used by the
decompression library itself, and it is backed by a statically
allocated block of BSS.

Having just gone through this again, the main issues are:

1) The 4/5 level switching trampoline, which runs in the page tables
of the loader/EFI stub, and assumes that it is fine to grab a random
chunk of low memory, stash its contents somewhere, use it for some
code, a stack and a root level page table so we can do the x86 long
mode paging off/paging on salsa, and then copy back the contents and
carry on as if nothing happened. We currently have some code in the
stub that strips all NX restrictions from a generously overdimensioned
block of low memory so copying and running code like this actually
works.

2) We need an accurate description in the PE/COFF header of what needs
to be executable and what needs to be writable, so we can split the
regions. This only matters for code that runs under EFI's mappings, so
not a lot.

3) The payload relocates itself to the end of the decompression buffer
so it doesn't overwrite itself before completing. This is fragile and
also unnecessary when there is a page allocator and plenty of memory,
but afaict, this all executes under the decompressor's own page tables
so the RO/NX attributes that EFI sets are not a concern here. It
would, of course, be nice if we could avoid relying on RWX mappings
here.

4) I think it was you who pointed out that the demand paging 1:1 map
should really only get triggered for data accesses and not code
accesses, so it would be nice if we could create such mappings with NX
attributes.
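
To make point 3) concrete, here is a toy sketch of why the compressed
payload gets moved to the end of the buffer before decompressing in place.
It is not the kernel's decompressor: a trivial run-length decoder stands in
for the real one, and rle_decode_in_place and decompress_via_tail are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy decoder: the input is a sequence of (count, byte) pairs located at
 * buf[in_off .. in_off + in_len), and the output is written from buf[0].
 * Returns the number of bytes produced, or -1 if the input is malformed or
 * if writing would overwrite input that has not been read yet.
 */
long rle_decode_in_place(unsigned char *buf, size_t buf_size,
                         size_t in_off, size_t in_len)
{
    size_t rd = in_off, end, wr = 0;

    if (in_off > buf_size || in_len > buf_size - in_off || in_len % 2)
        return -1;
    end = in_off + in_len;

    while (rd < end) {
        unsigned char count = buf[rd], value = buf[rd + 1];

        rd += 2;                    /* this pair is now consumed */
        if (count > rd - wr)
            return -1;              /* output would clobber unread input */
        memset(buf + wr, value, count);
        wr += count;
    }
    return (long)wr;
}

/*
 * The compressed data initially sits at the start of buf. Move it to the
 * tail so that the write cursor trails the read cursor while decoding.
 */
long decompress_via_tail(unsigned char *buf, size_t buf_size, size_t in_len)
{
    size_t in_off;

    if (in_len > buf_size)
        return -1;
    in_off = buf_size - in_len;
    memmove(buf + in_off, buf, in_len);   /* source and target may overlap */
    return rle_decode_in_place(buf, buf_size, in_off, in_len);
}
```

Even with the payload at the tail, the buffer may need some slack beyond the
decompressed size, since compressible data followed by incompressible data
lets the write cursor catch up with the read cursor. With a page allocator,
a separate output buffer avoids the problem entirely.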

I've had another go at running the decompressed kernel directly,
without going through the decompressor logic at all, but I missed the
fact that SEV does a substantial amount of work in the decompressor
too, so I'm no longer convinced that this is a viable approach. But
I'm looking into this.

I just finished some patches [0] that only address 1), based on the
work I posted earlier. I'll send those out once -rc1 comes around.


[0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=efi-x86-cleanup-la57