Message ID: 20250502184421.1424368-1-bboscaccy@linux.microsoft.com
Series: Introducing Hornet LSM
On Fri, May 2, 2025 at 5:00 PM KP Singh <kpsingh@kernel.org> wrote: > > > This patch series introduces the Hornet LSM. The goal of Hornet is to provide > > a signature verification mechanism for eBPF programs. > > > > [...] > > > > > References: [1] > > https://lore.kernel.org/bpf/20220209054315.73833-1-alexei.starovoitov@gmail.com/ > > [2] > > https://lore.kernel.org/bpf/CAADnVQ+wPK1KKZhCgb-Nnf0Xfjk8M1UpX5fnXC=cBzdEYbv_kg@mail.gmail.com/ > > > > Change list: - v2 -> v3 - Remove any and all usage of proprietary bpf APIs > > BPF APIs are not proprietary, but you cannot implement BPF program signing > for BPF users without aligning with the BPF maintainers and the community. > Signed programs are a UAPI and a key part of how developers experience BPF > and this is not how we would like signing to be experienced by BPF users. > > Some more feedback (which should be pretty obvious) but explicitly: > > * Hacks like if (current->pid == 1) return 0; also break your threat model > about root being untrusted. Speaking with Blaise off-list when that change was discussed, I believe the intent behind that Kconfig option was simply for development/transition purposes, and not for any long term usage. My understanding is that this is why it was a separate build time configuration and not something that could be toggled at runtime, e.g. sysctl or similar. > * You also did not take the feedback into account: > > new = map->ops->map_lookup_elem(map, &key); > > This is not okay without having the BPF maintainers aligned, the same way as > > https://patchwork.kernel.org/project/netdevbpf/patch/20240629084331.3807368-4-kpsingh@kernel.org/#25928981 > > was not okay. Let's not have double standards.
KP Singh <kpsingh@kernel.org> writes:

[...]

> Now if you really care about the use-case and want to work with the maintainers
> and implement signing for the community, here's how we think it should be done:
>
> * The core signing logic and the tooling stays in BPF, something that the users
>   are already using. No new tooling.
> * The policy decision on the effect of signing can be built into various
>   existing LSMs. I don't think we need a new LSM for it.
> * A simple UAPI (emphasis on UAPI!) change to union bpf_attr in uapi/bpf.h in
>   the BPF_PROG_LOAD command:
>
>       __aligned_u64 signature;
>       __u32 signature_size;

I think having some actual details on the various parties' requirements here would be helpful. KP, I do look forward to seeing your design; however, having code signing proposals where the capabilities are dictated from above and any dissent is dismissed as "you're doing it wrong" isn't going to be helpful to anyone who needs to use this in practice.

Also, I don't think anyone actually cares, at least I don't, who calls verify_pkcs7_signature or whatnot. Verifying signed binary blobs with a private key is a solved problem and isn't really interesting.

Our requirements for code signing are just an extension of secure boot and module signing logic:

* Prove all code running in ring zero has been signed
* Not trivially defeatable by root
* Ultimately, no trusted userspace components
* Secure from and not vulnerable to TOCTOU attacks
* Shouldn't be overly vulnerable to supply-chain attacks
* The signature checking logic and control paths should be human-readable
* Work easily and be backportable to stable kernels
* There should be a simple kconfig option to turn this on or off
* This solution needs to be in the mainline kernel

Hornet was implemented to meet those requirements, living in the LSM subsystem, written in C. As of today, one cannot accomplish those requirements via BPF-LSM, which is why C was chosen.

One can easily realize there is absolutely no way to have a single one-size-fits-all signing solution for everything listed in https://ebpf.io/applications/.

If you want to go the UAPI route, I would wholeheartedly recommend making it extensible and having this data be available to the policy LSMs:

    enum bpf_signature_type {
        /* x509 signature check against program instructions only */
        BPF_SIGNATURE_PROG_ONLY = 0,
        /* x509 combined signature check against program instructions and used maps */
        BPF_SIGNATURE_PROG_USED_MAPS = 1,
        /* more of these to be determined via usage */
        ...
    };

    __aligned_u64 signature;
    __u32 signature_size;
    __u32 signature_type;

The other option for solving this in the general case is in-kernel loaders. That's gotten pushback as well.

-blaise
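As a rough illustration of the extensible attribute layout sketched above, a userspace loader might fill the proposed fields as follows. The signature, signature_size, and signature_type fields and the BPF_SIGNATURE_* values are the additions proposed in this mail; they are not in today's uapi/linux/bpf.h, so this is a shape sketch rather than code that compiles against current headers.

    #include <linux/bpf.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <stdint.h>

    static inline __u64 ptr_to_u64(const void *ptr)
    {
            return (__u64)(unsigned long)ptr;
    }

    /* insns/insn_cnt and sig_blob/sig_len come from the build step. */
    int load_signed_prog(const struct bpf_insn *insns, __u32 insn_cnt,
                         const void *sig_blob, __u32 sig_len)
    {
            union bpf_attr attr = {};

            attr.prog_type = BPF_PROG_TYPE_KPROBE;
            attr.insns = ptr_to_u64(insns);
            attr.insn_cnt = insn_cnt;
            attr.license = ptr_to_u64("GPL");

            /* Proposed fields, not present in the current UAPI: */
            attr.signature = ptr_to_u64(sig_blob);
            attr.signature_size = sig_len;
            attr.signature_type = BPF_SIGNATURE_PROG_USED_MAPS;

            return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
    }

With a signature_type carried in the attribute, a policy LSM could key its decision on both whether a signature verified and what it actually covered (program only vs. program plus used maps).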
On Mon, May 5, 2025 at 7:30 PM Blaise Boscaccy <bboscaccy@linux.microsoft.com> wrote:
>
> KP Singh <kpsingh@kernel.org> writes:
>
> [...]
>
> > Now if you really care about the use-case and want to work with the maintainers
> > and implement signing for the community, here's how we think it should be done:
> >
> > * The core signing logic and the tooling stays in BPF, something that the users
> >   are already using. No new tooling.
> > * The policy decision on the effect of signing can be built into various
> >   existing LSMs. I don't think we need a new LSM for it.
> > * A simple UAPI (emphasis on UAPI!) change to union bpf_attr in uapi/bpf.h in
> >   the BPF_PROG_LOAD command:
> >
> >       __aligned_u64 signature;
> >       __u32 signature_size;
>
> I think having some actual details on the various parties' requirements
> here would be helpful. KP, I do look forward to seeing your design;
> however, having code signing proposals where the capabilities are
> dictated from above and any dissent is dismissed as "you're doing it
> wrong" isn't going to be helpful to anyone that needs to use this in
> practice.

Please don't misrepresent the facts. You got feedback from Alexei in various threads, but you chose to argue only on the points that were convenient for you (i.e. usage of BPF internal APIs), and yet you claim to "work with the BPF community and maintainers" by arguing instead of collaborating and paying attention to the feedback given to you.

1. https://lore.kernel.org/bpf/CAADnVQKF+B_YYwOCFsPBbrTBGKe4b22WVJFb8C0PHGmRAjbusQ@mail.gmail.com/

Your solution to address the ELF-loader-specific issue was to just allowlist systemd? You totally ignored the part about loaders in golang and Rust that do not use ELF. How is this a "directive from above"?

2. Alexei suggested to you in https://lore.kernel.org/bpf/87plhhjwqy.fsf@microsoft.com/

"A signature for the map plus a signature for the prog that is tied to a map might be a better option. At map creation time the contents can be checked, the map is frozen, and then the verifier can proceed with prog's signature checking."

You never replied to this.

3. Regarding signing the attachment points, you replied:

> That statement is quite questionable. Yes, IIRC you brought that up. And
> again, runtime policy enforcement has nothing to do with proving code
> provenance. They are completely independent concerns.

The place where the BPF program is attached is a key part of the provenance of the BPF program, and its security (and other) effects can vary greatly based on that (e.g. imagine a reject-all LSM program that is attached to the wrong LSM hook). This is why it's not the same as module loading.

4. https://lore.kernel.org/bpf/CAADnVQKF+B_YYwOCFsPBbrTBGKe4b22WVJFb8C0PHGmRAjbusQ@mail.gmail.com/

Programs can still access maps. Now if you combine the issue of ELF-less loaders with the fact that maps are writeable from other programs, since freezing only affects userspace (i.e. when a binary gets an FD to a map and tries to modify it with syscalls), your implementation fails. The reply there about trusted user-space still needs to come with better guarantees from the kernel, and the kernel can indeed give better guarantees, which we will share. The reason for this is that your trusted binary is not immune to attacks, and once an attacker gains code execution as this trusted binary, there is no containing the compromise.

- KP

> Also, I don't think anyone actually cares, at least I don't, who calls
> verify_pkcs7_signature or whatnot. Verifying signed binary blobs with a
> private key is a solved problem and isn't really interesting.
>
> Our requirements for code signing are just an extension of secure boot
> and module signing logic:
>
> * Prove all code running in ring zero has been signed
> * Not trivially defeatable by root
> * Ultimately, no trusted userspace components
> * Secure from and not vulnerable to TOCTOU attacks
> * Shouldn't be overly vulnerable to supply-chain attacks
> * The signature checking logic and control paths should be human-readable
> * Work easily and be backportable to stable kernels
> * There should be a simple kconfig option to turn this on or off
> * This solution needs to be in the mainline kernel
>
> Hornet was implemented to meet those requirements, living in the LSM
> subsystem, written in C. As of today, one cannot accomplish those
> requirements via BPF-LSM, which is why C was chosen.
>
> One can easily realize there is absolutely no way to have a single
> one-size-fits-all signing solution for everything listed in
> https://ebpf.io/applications/.
>
> If you want to go the UAPI route, I would wholeheartedly recommend
> making it extensible and having this data be available to the policy
> LSMs:
>
>     enum bpf_signature_type {
>         /* x509 signature check against program instructions only */
>         BPF_SIGNATURE_PROG_ONLY = 0,
>         /* x509 combined signature check against program instructions and used maps */
>         BPF_SIGNATURE_PROG_USED_MAPS = 1,
>         /* more of these to be determined via usage */
>         ...
>     };
>
>     __aligned_u64 signature;
>     __u32 signature_size;
>     __u32 signature_type;
>
> The other option for solving this in the general case is in-kernel
> loaders. That's gotten pushback as well.
>
> -blaise
On Wed, May 7, 2025 at 1:48 PM James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > > I'm with Paul on this: if you could share your design ideas more fully > than you have above that would help make this debate way more > technical. I think it would also help some of us, at the very least me, put your objections into context. I believe the more durable solutions that end up in Linus' tree are combinations of designs created out of compromise, and right now we are missing the context and detail of your ideal solution to be able to do that compromise and get to a design and implementation we can all begrudgingly accept. In the absence of a detailed alternate design, and considering that BPF signature validation efforts have sputtered along for years without any real success, we'll continue to push forward on-list with refinements to the current proposal in an effort to drive this to some form of resolution. > I also get the impression that there might be a disagreement over > scope: what seems to be coming out of BPF is that every signing problem > and scenario must be solved before signing can be considered > acceptable; however, I think it's not unreasonable to attempt to cover > a portion of the use cases and allow for future additions of things > like policy so we can get some forward motion to allow others to play > with it and produce patches based on their use cases. Beyond any potential future updates to Hornet, I just wanted to make it clear that the Hornet LSM approach, like any LSM, can be disabled both at compile time for those users who build their own kernels, as well as at kernel boot time using the "lsm=" command line option for those who are limited to pre-built kernels, e.g. distro kernels. Users can always disable Hornet and replace it with another LSM, either a BPF LSM or a native/C LSM, of their choosing; the LSM framework is intentionally flexible to allow for this substitution and replacement, with plenty of existing examples already.
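For example, on a pre-built kernel a user could boot with an lsm= list that simply omits hornet. The exact list below is illustrative only and depends on which LSMs the running kernel was built with (see its CONFIG_LSM default):

    # Kernel command line: keep the usual LSM stack but leave hornet out.
    lsm=lockdown,yama,loadpin,safesetid,selinux,bpf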
> > I think we need a more detailed explanation of this approach on-list.
> > There has been a lot of vague guidance on BPF signature validation
> > from the BPF community which I believe has partly led us into the
> > situation we are in now. If you are going to require yet another
> > approach, I think we all need to see a few paragraphs on-list
> > outlining the basic design.
>
> Definitely, happy to share design / code.

Here's the design that Alexei and I have been discussing. It's extensible, independent of ELF formats, handles all identified use-cases, paves the way for signed unprivileged eBPF, and meets the requirements of anyone who wants to run signed eBPF programs.

# Trusted Hash Chain

The key idea of the design is to use a signing algorithm that allows us to integrity-protect a number of future payloads, including their order, by creating a chain of trust.

Consider that Alice needs to send messages M_1, M_2, ..., M_n to Bob. We define blocks of data such that each block contains its message and the hash of the *next* block in the chain, with the final block carrying the hash of a termination marker:

    B_n     = M_n     || H(termination_marker)
    B_{n-1} = M_{n-1} || H(B_n)
    B_{n-2} = M_{n-2} || H(B_{n-1})
    ...
    B_2     = M_2     || H(B_3)
    B_1     = M_1     || H(B_2)

Alice does the following (e.g., on a build system where all payloads are available):

* Assembles the blocks B_1, B_2, ..., B_n.
* Calculates H(B_1) and signs it, yielding Sig(H(B_1)).

Alice sends the following to Bob:

    M_1, H(B_2), Sig(H(B_1))

Bob receives this payload and does the following:

* Reconstructs B_1 as B_1' using the received M_1 and H(B_2) (i.e., B_1' = M_1 || H(B_2)).
* Recomputes H(B_1') and verifies the signature against the received Sig(H(B_1)).
* If the signature verifies, it establishes the integrity of M_1 and H(B_2) (and transitively, the integrity of the entire chain). Bob now stores the verified H(B_2) until it receives the next message.
* When Bob receives M_2 (and H(B_3) if n > 2), it reconstructs B_2' (e.g., B_2' = M_2 || H(B_3), or if n=2, B_2' = M_2 || H(termination_marker)). Bob then computes H(B_2') and compares it against the stored H(B_2) that was verified in the previous step.

This process continues until the last block is received and verified.

Now, applying this to the BPF signing use-case, we simplify to two messages:

    M_1 = I_loader   (the instructions of the loader program)
    M_2 = M_metadata (the metadata for the loader program, passed in a map,
                      which includes the programs to be loaded and other context)

For this specific BPF case, we will directly sign a composite of the first message and the hash of the second. Let H_meta = H(M_metadata). The block to be signed is effectively:

    B_signed = I_loader || H_meta

The signature generated is Sig(B_signed).

The process then follows a similar pattern to the Alice and Bob model, where the kernel (Bob) verifies I_loader and H_meta using the signature. Then, the trusted I_loader is responsible for verifying M_metadata against the trusted H_meta.
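A minimal userspace sketch of the two-message case above, assuming OpenSSL's SHA256() as the hash H; the helper names are illustrative and not part of the proposal (build with: cc sketch.c -lcrypto):

    #include <openssl/sha.h>
    #include <stdio.h>
    #include <string.h>

    /* Build the buffer that actually gets signed: B_signed = I_loader || H_meta. */
    static size_t build_signed_block(const unsigned char *loader, size_t loader_len,
                                     const unsigned char *metadata, size_t meta_len,
                                     unsigned char *out)
    {
            unsigned char h_meta[SHA256_DIGEST_LENGTH];

            SHA256(metadata, meta_len, h_meta);               /* H_meta = H(M_metadata) */
            memcpy(out, loader, loader_len);                  /* I_loader ...           */
            memcpy(out + loader_len, h_meta, sizeof(h_meta)); /* ... || H_meta          */
            return loader_len + sizeof(h_meta);
    }

    /* What the trusted loader later does in spirit: recompute H(M_metadata)
     * and compare it against the H_meta covered by the verified signature. */
    static int metadata_matches(const unsigned char *metadata, size_t meta_len,
                                const unsigned char *trusted_h_meta)
    {
            unsigned char h[SHA256_DIGEST_LENGTH];

            SHA256(metadata, meta_len, h);
            return memcmp(h, trusted_h_meta, SHA256_DIGEST_LENGTH) == 0;
    }

    int main(void)
    {
            unsigned char loader[]   = "loader instructions";
            unsigned char metadata[] = "metadata passed in the map";
            unsigned char block[sizeof(loader) + SHA256_DIGEST_LENGTH];
            size_t len = build_signed_block(loader, sizeof(loader),
                                            metadata, sizeof(metadata), block);

            /* Sig(block) would be produced here by the build system's signing key. */
            printf("signed block: %zu bytes, metadata check: %d\n", len,
                   metadata_matches(metadata, sizeof(metadata), block + sizeof(loader)));
            return 0;
    }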
[...]

> Blaise started this most recent effort by attempting to address the
> concerns brought up in previous efforts, you and others rejected this
> first attempt and directed Blaise towards a light skeleton and LSM
> based approach, which is where he is at with Hornet. Once again, you
> reject this approach with minimal guidance on what would be
> acceptable, and our response is to ask for clarification on your
> preferred design. We're not asking for a full working solution,
> simply a couple of paragraphs outlining the design with enough detail
> to put forward a working solution that isn't immediately NACK'd.
> We've made this request multiple times in the past, most recently this
> past weekend, where KP replied that he would be "happy" to share
> designs/code.

Here's the proposed design:

https://lore.kernel.org/bpf/CACYkzJ6VQUExfyt0=-FmXz46GHJh3d=FXh5j4KfexcEFbHV-vg@mail.gmail.com/#t

> Unfortunately, since then all we've received from
> either you or KP has been effectively just a list of your
> objections on repeat; surely typing out a couple of paragraphs
> outlining a design would have been quicker, easier, and more
> constructive than your latest reply?
On Sat, May 10, 2025 at 10:01 PM KP Singh <kpsingh@kernel.org> wrote:
>
> ...
> The signature check in the verifier (during BPF_PROG_LOAD):
>
>     verify_pkcs7_signature(prog->aux->sha, sizeof(prog->aux->sha),
>                            sig_from_bpf_attr, …);

I think we still need to clarify the authorization aspect of your proposed design. Working under the assumption that the core BPF kernel code doesn't want to enforce any restrictions, or at least as few as possible, I'm expecting that the BPF kernel code would want to adopt an "allow all" policy when it comes to authorizing signed and unsigned BPF programs, delegating any additional restrictions to the LSM.

With that in mind, I think we need to agree on a way for the BPF verifier to indicate to the LSM that it has verified the signature, and we need a new LSM hook which runs *after* the verifier so that it can inspect the results of the signature verification. While it might be tempting to relocate the existing security_bpf_prog_load() hook, I believe it makes sense to leave that hook before the verifier for those LSMs that wish to control access prior to the verifier's inspection using criteria other than signatures.

With respect to the LSM hook, since it appears that the signature is going to be included in the bpf_attr struct, and I'm *guessing* the best way for the verifier to indicate the result of the signature verification is via a field inside bpf_prog_aux, the hook could look something like this:

    int security_bpf_prog_verified(bpf_prog, bpf_attr);

... and be called immediately after bpf_check() in bpf_prog_load(). As far as the new field in bpf_prog_aux is concerned, I think we can probably start off with a simple bool to indicate whether a signature was verified or not, with the understanding that we can move to a richer construct in the future if we find it necessary. Neither of these is directly visible to userspace, so we have the ability to start simple and modify as needed.

Does this sound reasonable to everyone? Does anyone have any other thoughts on the authorization aspect of BPF signature verification?
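A rough sketch of the shape described above might look like the following. The hook name, the bpf_prog_aux field, the call site, and the example policy are all assumptions taken from the proposal in this mail; none of this exists in the kernel today, so treat it as a discussion aid rather than a patch:

    /* Hypothetical new field, set by the verifier once the signature from
     * bpf_attr has been checked against the program's hash. */
    struct bpf_prog_aux {
            /* ... existing fields ... */
            bool sig_verified;
    };

    /* Hypothetical hook, called immediately after bpf_check() in bpf_prog_load(). */
    int security_bpf_prog_verified(struct bpf_prog *prog, union bpf_attr *attr);

    /* Example policy an LSM could implement behind that hook: allow signed
     * programs, and fall back to requiring CAP_BPF for unsigned ones
     * (purely illustrative, not a recommendation). */
    static int example_bpf_prog_verified(struct bpf_prog *prog, union bpf_attr *attr)
    {
            if (prog->aux->sig_verified)
                    return 0;
            return capable(CAP_BPF) ? 0 : -EPERM;
    }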
On Sun, 2025-05-11 at 04:01 +0200, KP Singh wrote: [...] > > > For this specific BPF case, we will directly sign a composite of the > first message and the hash of the second. Let H_meta = H(M_metadata). > The block to be signed is effectively: > > B_signed = I_loader || H_meta > > The signature generated is Sig(B_signed). > > The process then follows a similar pattern to the Alice and Bob > model, > where the kernel (Bob) verifies I_loader and H_meta using the > signature. Then, the trusted I_loader is responsible for verifying > M_metadata against the trusted H_meta. > > From an implementation standpoint: > > # Build > > bpftool (or some other tool in the user's build environment) knows > about the metadata (M_metadata) and the loader program (I_loader). It > first calculates H_meta = H(M_metadata). Then it constructs the > object > to be signed and computes the signature: > > Sig(I_loader || H_meta) > > # Loader > > bpftool generates the loader program. The initial instructions of > this loader program are designed to verify the SHA256 hash of the > metadata (M_metadata) that will be passed in a map. These > instructions effectively embed the precomputed H_meta as immediate > values. > > ld_imm64 r1, const_ptr_to_map // insn[0].src_reg == > BPF_PSEUDO_MAP_IDX > r2 = *(u64 *)(r1 + 0); > ld_imm64 r3, sha256_of_map_part1 // constant precomputed by > bpftool (part of H_meta) > if r2 != r3 goto out; > > r2 = *(u64 *)(r1 + 8); > ld_imm64 r3, sha256_of_map_part2 // (part of H_meta) > if r2 != r3 goto out; > > r2 = *(u64 *)(r1 + 16); > ld_imm64 r3, sha256_of_map_part3 // (part of H_meta) > if r2 != r3 goto out; > > r2 = *(u64 *)(r1 + 24); > ld_imm64 r3, sha256_of_map_part4 // (part of H_meta) > if r2 != r3 goto out; > ... > > This implicitly makes the payload equivalent to the signed block > (B_signed) > > I_loader || H_meta > > bpftool then generates the signature of this I_loader payload (which > now contains the expected H_meta) using a key (system or user) with > new flags that work in combination with bpftool -L Could I just push back a bit on this. The theory of hash chains (which I've cut to shorten) is about pure data structures. The reason for that is that the entire hash chain is supposed to be easily independently verifiable in any environment because anything can compute the hashes of the blocks and links. This independent verification of the chain is key to formally proving hash chains to be correct. In your proposal we lose the easy verifiability because the link hash is embedded in the ebpf loader program which has to be disassembled to do the extraction of the hash and verify the loader is actually checking it. I was looking at ways we could use a pure hash chain (i.e. signature over loader and real map hash) and it does strike me that the above ebpf hash verification code is pretty invariant and easy to construct, so it could run as a separate BPF fragment that then jumps to the real loader. In that case, it could be constructed on the fly in a trusted environment, like the kernel, from the link hash in the signature and the signature could just be Sig(loader || map hash) which can then be easily verified without having to disassemble ebpf code. So we get the formal provability benefits of using a real hash chain while still keeping your verification in BPF. Regards, James
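To make concrete how mechanical that preamble is, here is a small illustrative generator that prints the hash-check sequence quoted above for an arbitrary 32-byte H_meta. It is a standalone userspace sketch, not part of bpftool or the kernel, and it emits assembly text rather than real struct bpf_insn encodings:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Emit the invariant preamble that compares the map's first 32 bytes
     * against H_meta, 8 bytes at a time, as in the sequence quoted above. */
    static void emit_hash_check_preamble(const uint8_t h_meta[32])
    {
            printf("ld_imm64 r1, const_ptr_to_map  // insn[0].src_reg == BPF_PSEUDO_MAP_IDX\n");
            for (int i = 0; i < 4; i++) {
                    uint64_t part;

                    memcpy(&part, h_meta + i * 8, sizeof(part));
                    printf("r2 = *(u64 *)(r1 + %d);\n", i * 8);
                    printf("ld_imm64 r3, 0x%016llx  // part %d of H_meta\n",
                           (unsigned long long)part, i + 1);
                    printf("if r2 != r3 goto out;\n");
            }
    }

    int main(void)
    {
            uint8_t h_meta[32] = { 0 };  /* would be H(M_metadata) in practice */

            emit_hash_check_preamble(h_meta);
            return 0;
    }

Whether this construction happens in bpftool at build time (as KP proposes) or on the fly in the kernel from a hash carried next to the signature (as James suggests) is exactly the point under debate in the replies that follow.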
On Wed, May 14, 2025 at 5:39 PM James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > > On Sun, 2025-05-11 at 04:01 +0200, KP Singh wrote: > [...] > > > > > For this specific BPF case, we will directly sign a composite of the > > first message and the hash of the second. Let H_meta = H(M_metadata). > > The block to be signed is effectively: > > > > B_signed = I_loader || H_meta > > > > The signature generated is Sig(B_signed). > > > > The process then follows a similar pattern to the Alice and Bob > > model, > > where the kernel (Bob) verifies I_loader and H_meta using the > > signature. Then, the trusted I_loader is responsible for verifying > > M_metadata against the trusted H_meta. > > > > From an implementation standpoint: > > > > # Build > > > > bpftool (or some other tool in the user's build environment) knows > > about the metadata (M_metadata) and the loader program (I_loader). It > > first calculates H_meta = H(M_metadata). Then it constructs the > > object > > to be signed and computes the signature: > > > > Sig(I_loader || H_meta) > > > > # Loader > > > > bpftool generates the loader program. The initial instructions of > > this loader program are designed to verify the SHA256 hash of the > > metadata (M_metadata) that will be passed in a map. These > > instructions effectively embed the precomputed H_meta as immediate > > values. > > > > ld_imm64 r1, const_ptr_to_map // insn[0].src_reg == > > BPF_PSEUDO_MAP_IDX > > r2 = *(u64 *)(r1 + 0); > > ld_imm64 r3, sha256_of_map_part1 // constant precomputed by > > bpftool (part of H_meta) > > if r2 != r3 goto out; > > > > r2 = *(u64 *)(r1 + 8); > > ld_imm64 r3, sha256_of_map_part2 // (part of H_meta) > > if r2 != r3 goto out; > > > > r2 = *(u64 *)(r1 + 16); > > ld_imm64 r3, sha256_of_map_part3 // (part of H_meta) > > if r2 != r3 goto out; > > > > r2 = *(u64 *)(r1 + 24); > > ld_imm64 r3, sha256_of_map_part4 // (part of H_meta) > > if r2 != r3 goto out; > > ... > > > > This implicitly makes the payload equivalent to the signed block > > (B_signed) > > > > I_loader || H_meta > > > > bpftool then generates the signature of this I_loader payload (which > > now contains the expected H_meta) using a key (system or user) with > > new flags that work in combination with bpftool -L > > Could I just push back a bit on this. The theory of hash chains (which > I've cut to shorten) is about pure data structures. The reason for > that is that the entire hash chain is supposed to be easily > independently verifiable in any environment because anything can > compute the hashes of the blocks and links. This independent > verification of the chain is key to formally proving hash chains to be > correct. In your proposal we lose the easy verifiability because the > link hash is embedded in the ebpf loader program which has to be > disassembled to do the extraction of the hash and verify the loader is > actually checking it. I am not sure I understand your concern. This is something that can easily be built into tooling / annotations. bpftool -S -v <verification_key> <loader> <metadata> Could you explain what's the use-case for "easy verifiability". > > I was looking at ways we could use a pure hash chain (i.e. signature > over loader and real map hash) and it does strike me that the above > ebpf hash verification code is pretty invariant and easy to construct, > so it could run as a separate BPF fragment that then jumps to the real > loader. 
> In that case, it could be constructed on the fly in a trusted
> environment, like the kernel, from the link hash in the signature and
> the signature could just be Sig(loader || map hash) which can then be

The design I proposed does the same thing:

    Sig(loader || H_metadata)

The metadata is actually the data (programs, context, etc.) that's passed in the map. The verification just happens in the loader program, and the loader || H_metadata is implemented elegantly to avoid any separate payloads.

> easily verified without having to disassemble ebpf code. So we get the
> formal provability benefits of using a real hash chain while still
> keeping your verification in BPF.
>
> Regards,
>
> James
On Wed, May 14, 2025 at 7:45 PM James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > > On Wed, 2025-05-14 at 19:17 +0200, KP Singh wrote: > > On Wed, May 14, 2025 at 5:39 PM James Bottomley > > <James.Bottomley@hansenpartnership.com> wrote: > > > On Sun, 2025-05-11 at 04:01 +0200, KP Singh wrote: > [...] > > > > This implicitly makes the payload equivalent to the signed block > > > > (B_signed) > > > > > > > > I_loader || H_meta > > > > > > > > bpftool then generates the signature of this I_loader payload > > > > (which now contains the expected H_meta) using a key (system or > > > > user) with new flags that work in combination with bpftool -L > > > > > > Could I just push back a bit on this. The theory of hash chains > > > (which I've cut to shorten) is about pure data structures. The > > > reason for that is that the entire hash chain is supposed to be > > > easily independently verifiable in any environment because anything > > > can compute the hashes of the blocks and links. This independent > > > verification of the chain is key to formally proving hash chains to > > > be correct. In your proposal we lose the easy verifiability > > > because the link hash is embedded in the ebpf loader program which > > > has to be disassembled to do the extraction of the hash and verify > > > the loader is actually checking it. > > > > I am not sure I understand your concern. This is something that can > > easily be built into tooling / annotations. > > > > bpftool -S -v <verification_key> <loader> <metadata> > > > > Could you explain what's the use-case for "easy verifiability". > > I mean verifiability of the hash chain link. Given a signed program, > (i.e. a .h file which is generated by bpftool) which is a signature > over the loader only how would one use simple cryptographic operations > to verify it? > I literally just said it above the hash can be extracted if you really want offline verification. Are you saying this code is hard to write? or is the tooling hard to write? Do you have some definition of "simple cryptographic operations". All operations use tooling. > > > > > > I was looking at ways we could use a pure hash chain (i.e. > > > signature over loader and real map hash) and it does strike me that > > > the above ebpf hash verification code is pretty invariant and easy > > > to construct, so it could run as a separate BPF fragment that then > > > jumps to the real loader. In that case, it could be constructed on > > > the fly in a trusted environment, like the kernel, from the link > > > hash in the signature and the signature could just be Sig(loader || > > > map hash) which can then be > > > > The design I proposed does the same thing: > > > > Sig(loader || H_metadata) > > > > metadata is actually the data (programs, context etc) that's passed > > in the map. The verification just happens in the loader program and > > the loader || H_metadata is implemented elegantly to avoid any > > separate payloads. > > OK, so I think this is the crux of the problem: In formal methods > proving the validity of a data based hash link is an easy set of > cryptographic operations. You can assert that's equivalent to a > signature over a program that verifies the hash, but formally proving > it requires a formal analysis of the program to show that 1) it > contains the correct hash and 2) it correctly checks the hash against > the map. 
That makes the task of someone receiving the .h file > containing the signed skeleton way harder: it's easy to prove the > signature matches the loader instructions, but they still have to prove > the instructions contain and verify the correct map hash. > I don't see this as a problem for 2 reasons: 1. It's not hard 2. Your typical user does not want to do formal verification and extract signatures etc. [1] alone is enough. The key user journey is: * Build the program and the metadata * Sign the blob once (as explained) * A simple API to verify the sequence of operations. The user builds a program and signs the blob, they sign it because it contains the hash of the metadata. It seems like you are optimizing for the formal researcher but not for the tooling. The user just needs good tooling and a simple API which is exactly what was proposed. - KP > Regards, > > James > >
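Putting that user journey into commands, a sketch might look like the following. The -S/-v signing and verification options are the hypothetical new flags discussed in this thread and do not exist in bpftool today; gen skeleton -L (light skeleton generation) does exist:

    # Build the program and generate the light-skeleton loader, signing the
    # loader together with H(metadata) in one step (hypothetical -S flag):
    clang -O2 -target bpf -c prog.bpf.c -o prog.bpf.o
    bpftool gen skeleton -L -S <signing_key> prog.bpf.o > prog.lskel.h

    # Optional offline check of the signed loader plus metadata hash, as
    # suggested earlier in the thread (hypothetical flags):
    bpftool -S -v <verification_key> <loader> <metadata>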
On Wed, May 14, 2025 at 10:32 PM James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > > On Wed, 2025-05-14 at 20:35 +0200, KP Singh wrote: > > On Wed, May 14, 2025 at 7:45 PM James Bottomley > > <James.Bottomley@hansenpartnership.com> wrote: > > > > > > On Wed, 2025-05-14 at 19:17 +0200, KP Singh wrote: > > > > On Wed, May 14, 2025 at 5:39 PM James Bottomley > > > > <James.Bottomley@hansenpartnership.com> wrote: > > > > > On Sun, 2025-05-11 at 04:01 +0200, KP Singh wrote: > > > [...] > > > > > > This implicitly makes the payload equivalent to the signed > > > > > > block > > > > > > (B_signed) > > > > > > > > > > > > I_loader || H_meta > > > > > > > > > > > > bpftool then generates the signature of this I_loader payload > > > > > > (which now contains the expected H_meta) using a key (system > > > > > > or > > > > > > user) with new flags that work in combination with bpftool -L > > > > > > > > > > Could I just push back a bit on this. The theory of hash > > > > > chains > > > > > (which I've cut to shorten) is about pure data structures. The > > > > > reason for that is that the entire hash chain is supposed to be > > > > > easily independently verifiable in any environment because > > > > > anything > > > > > can compute the hashes of the blocks and links. This > > > > > independent > > > > > verification of the chain is key to formally proving hash > > > > > chains to > > > > > be correct. In your proposal we lose the easy verifiability > > > > > because the link hash is embedded in the ebpf loader program > > > > > which > > > > > has to be disassembled to do the extraction of the hash and > > > > > verify > > > > > the loader is actually checking it. > > > > > > > > I am not sure I understand your concern. This is something that > > > > can > > > > easily be built into tooling / annotations. > > > > > > > > bpftool -S -v <verification_key> <loader> <metadata> > > > > > > > > Could you explain what's the use-case for "easy verifiability". > > > > > > I mean verifiability of the hash chain link. Given a signed > > > program, (i.e. a .h file which is generated by bpftool) which is a > > > signature over the loader only how would one use simple > > > cryptographic operations to verify it? > > > > > > > I literally just said it above the hash can be extracted if you > > really want offline verification. Are you saying this code is hard to > > write? or is the tooling hard to write? Do you have some definition > > of "simple cryptographic operations". All operations use tooling. > > As I said, you have a gap in that you not only have to extract the hash > and verify it against the map (which I agree is fairly simple) but also > verify the loader program actually checks it correctly. That latter > operation is not a simple cryptographic one and represents a security > gap between this proposal and the hash linked chains you introduced in > your first email in this thread. Sure, but I don't see this as being problematic. If it's hard for folks who do theoretical work, then I think it's okay to push this effort on them rather than every user. > > > > > > I was looking at ways we could use a pure hash chain (i.e. > > > > > signature over loader and real map hash) and it does strike me > > > > > that the above ebpf hash verification code is pretty invariant > > > > > and easy to construct, so it could run as a separate BPF > > > > > fragment that then jumps to the real loader. 
In that case, it > > > > > could be constructed on the fly in a trusted environment, like > > > > > the kernel, from the link hash in the signature and the > > > > > signature could just be Sig(loader || map hash) which can then > > > > > be > > > > > > > > The design I proposed does the same thing: > > > > > > > > Sig(loader || H_metadata) > > > > > > > > metadata is actually the data (programs, context etc) that's > > > > passed in the map. The verification just happens in the loader > > > > program and the loader || H_metadata is implemented elegantly to > > > > avoid any separate payloads. > > > > > > OK, so I think this is the crux of the problem: In formal methods > > > proving the validity of a data based hash link is an easy set of > > > cryptographic operations. You can assert that's equivalent to a > > > signature over a program that verifies the hash, but formally > > > proving it requires a formal analysis of the program to show that > > > 1) it contains the correct hash and 2) it correctly checks the hash > > > against the map. That makes the task of someone receiving the .h > > > file containing the signed skeleton way harder: it's easy to prove > > > the signature matches the loader instructions, but they still have > > > to prove the instructions contain and verify the correct map hash. > > > > > > > I don't see this as a problem for 2 reasons: > > > > 1. It's not hard > > it requires disassembling the first 20 or so BPF instructions and > verifying their operation, so that's harder than simply calculating > hashes and signatures. > > > 2. Your typical user does not want to do formal verification and > > extract signatures etc. > > Users don't want to do formal verification, agreed ... but they do want > to know that security experts have verified the algorithms they're > using. > > That's why I was thinking, since the loader preamble that verifies the > hash is easy to construct, that the scheme could use a real hash linked > chain, which has already been formally verified and is well understood, > then construct the preamble for the loader you want in a trusted > environment based on the hashes, meaning there's no security gap. > > Regards, > > James >