[RFC,security-next,0/4] Introducing Hornet LSM

Message ID 20250321164537.16719-1-bboscaccy@linux.microsoft.com

Message

Blaise Boscaccy March 21, 2025, 4:45 p.m. UTC
This patch series introduces the Hornet LSM.

Hornet takes a simple approach to light-skeleton-based eBPF signature
verification. Signature data can be easily generated for the binary
data that is generated via bpftool gen -L. This signature can be
appended to a skeleton executable via scripts/sign-ebpf. Hornet checks
the signature against a binary buffer containing the lskel
instructions that the loader maps use. Maps are frozen to prevent
TOCTOU bugs where a sufficiently privileged user could rewrite map
data between the calls to BPF_PROG_LOAD and
BPF_PROG_RUN. Additionally, both sparse-array-based and
fd_array_cnt-based map fd arrays are supported for signature
verification.


Blaise Boscaccy (4):
  security: Hornet LSM
  hornet: Introduce sign-ebpf
  hornet: Add an example lskel data extractor script
  selftests/hornet: Add a selftest for the hornet LSM

 Documentation/admin-guide/LSM/Hornet.rst     |  51 +++
 crypto/asymmetric_keys/pkcs7_verify.c        |  10 +
 include/linux/kernel_read_file.h             |   1 +
 include/linux/verification.h                 |   1 +
 include/uapi/linux/lsm.h                     |   1 +
 scripts/Makefile                             |   1 +
 scripts/hornet/Makefile                      |   5 +
 scripts/hornet/extract-skel.sh               |  29 ++
 scripts/hornet/sign-ebpf.c                   | 420 +++++++++++++++++++
 security/Kconfig                             |   3 +-
 security/Makefile                            |   1 +
 security/hornet/Kconfig                      |  11 +
 security/hornet/Makefile                     |   4 +
 security/hornet/hornet_lsm.c                 | 239 +++++++++++
 tools/testing/selftests/Makefile             |   1 +
 tools/testing/selftests/hornet/Makefile      |  51 +++
 tools/testing/selftests/hornet/loader.c      |  21 +
 tools/testing/selftests/hornet/trivial.bpf.c |  33 ++
 18 files changed, 882 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/LSM/Hornet.rst
 create mode 100644 scripts/hornet/Makefile
 create mode 100755 scripts/hornet/extract-skel.sh
 create mode 100644 scripts/hornet/sign-ebpf.c
 create mode 100644 security/hornet/Kconfig
 create mode 100644 security/hornet/Makefile
 create mode 100644 security/hornet/hornet_lsm.c
 create mode 100644 tools/testing/selftests/hornet/Makefile
 create mode 100644 tools/testing/selftests/hornet/loader.c
 create mode 100644 tools/testing/selftests/hornet/trivial.bpf.c

Comments

Paul Moore March 22, 2025, 8:48 p.m. UTC | #1
On Sat, Mar 22, 2025 at 4:44 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Sat, Mar 22, 2025 at 1:22 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
> > > This patch series introduces the Hornet LSM.
> > >
> > > Hornet takes a simple approach to light-skeleton-based eBPF signature
> >
> > Can you define "light-skeleton-based" before using the term.
> >
> > This is the first time in my life when I hear about it.
>
> I was in the same situation a few months ago when I first heard about it :)
>
> Blaise can surely provide a much better answer than what I'm about to
> write, but since Blaise is going to be at LSFMMBPF this coming week I
> suspect he might not have a lot of time to respond to email in the
> next few days so I thought I would do my best to try and answer :)
>
> An eBPF "light skeleton" is basically a BPF loader program and while
> I'm sure there are several uses for a light skeleton, or lskel for
> brevity, the single use case that we are interested in here, and the
> one that Hornet deals with, is the idea of using a lskel to enable
> signature verification of BPF programs as it seems to be the one way
> that has been deemed acceptable by the BPF maintainers.
>
> Once again, skipping over a lot of details, the basic idea is that you
> take your original BPF program (A), feed it into a BPF userspace tool
> to encapsulate the original program A into a BPF map and generate a
> corresponding light skeleton BPF program (B), and then finally sign
> the resulting binary containing the lskel program (B) and map
> corresponding to the original program A.

Forgive me, I mixed up my "A" and "B" above :/

> At runtime, the lskel binary
> is loaded into the kernel, and if Hornet is enabled, the signature of
> both the lskel program A and original program B is verified.

... and I did again here

> If the
> signature verification passes, lskel program A performs the necessary
> BPF CO-RE transforms on BPF program A stored in the BPF map and then
> attempts to load the original BPF program B, all from within the
> kernel, and with the map frozen to prevent tampering from userspace.

... and once more here because why not? :)

> Hopefully that helps fill in some gaps until someone more
> knowledgeable can provide a better answer and/or correct any mistakes
> in my explanation above ;)

Jarkko Sakkinen March 22, 2025, 9:43 p.m. UTC | #2
On Sat, Mar 22, 2025 at 04:48:14PM -0400, Paul Moore wrote:
> On Sat, Mar 22, 2025 at 4:44 PM Paul Moore <paul@paul-moore.com> wrote:
> >
> > On Sat, Mar 22, 2025 at 1:22 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
> > > > This patch series introduces the Hornet LSM.
> > > >
> > > > Hornet takes a simple approach to light-skeleton-based eBPF signature
> > >
> > > Can you define "light-skeleton-based" before using the term.
> > >
> > > This is the first time in my life when I hear about it.
> >
> > I was in the same situation a few months ago when I first heard about it :)
> >
> > Blaise can surely provide a much better answer than what I'm about to
> > write, but since Blaise is going to be at LSFMMBPF this coming week I
> > suspect he might not have a lot of time to respond to email in the
> > next few days so I thought I would do my best to try and answer :)
> >
> > An eBPF "light skeleton" is basically a BPF loader program and while
> > I'm sure there are several uses for a light skeleton, or lskel for
> > brevity, the single use case that we are interested in here, and the
> > one that Hornet deals with, is the idea of using a lskel to enable
> > signature verification of BPF programs as it seems to be the one way
> > that has been deemed acceptable by the BPF maintainers.
> >
> > Once again, skipping over a lot of details, the basic idea is that you
> > take your original BPF program (A), feed it into a BPF userspace tool
> > to encapsulate the original program A into a BPF map and generate a
> > corresponding light skeleton BPF program (B), and then finally sign
> > the resulting binary containing the lskel program (B) and map
> > corresponding to the original program A.
> 
> Forgive me, I mixed up my "A" and "B" above :/
> 
> > At runtime, the lskel binary
> > is loaded into the kernel, and if Hornet is enabled, the signature of
> > both the lskel program A and original program B is verified.
> 
> ... and I did again here
> 
> > If the
> > signature verification passes, lskel program A performs the necessary
> > BPF CO-RE transforms on BPF program A stored in the BPF map and then
> > attempts to load the original BPF program B, all from within the
> > kernel, and with the map frozen to prevent tampering from userspace.
> 
> ... and once more here because why not? :)

No worries, I was able to decipher this :-)

> 
> > Hopefully that helps fill in some gaps until someone more
> > knowledgeable can provide a better answer and/or correct any mistakes
> > in my explanation above ;)
> 
> -- 
> paul-moore.com

BR, Jarkko

Blaise Boscaccy March 31, 2025, 8:04 p.m. UTC | #3
Jarkko Sakkinen <jarkko@kernel.org> writes:

> On Fri, Mar 21, 2025 at 09:45:05AM -0700, Blaise Boscaccy wrote:
>> This script eases lskel developments against hornet by generating the
>
> 1. What iskel?

It's a "light-skeleton". I'll remove the abbreviations from this
patchset's commit messages. The jargon is hard enough to grok as-is. 

> 2. Why hornet is here in lower case?
>

Typo. Thanks for finding that. 

>> data payload used for code signing. It extracts the data out of the
>> autogenerated lskel header that gets created via bpftool.
>> 
>> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
>> ---
>>  scripts/hornet/extract-skel.sh | 29 +++++++++++++++++++++++++++++
>>  1 file changed, 29 insertions(+)
>>  create mode 100755 scripts/hornet/extract-skel.sh
>> 
>> diff --git a/scripts/hornet/extract-skel.sh b/scripts/hornet/extract-skel.sh
>> new file mode 100755
>> index 0000000000000..9ace78794b85e
>> --- /dev/null
>> +++ b/scripts/hornet/extract-skel.sh
>> @@ -0,0 +1,29 @@
>> +#!/bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +# Copyright (c) 2025 Microsoft Corporation
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of version 2 of the GNU General Public
>> +# License as published by the Free Software Foundation.
>> +
>> +function usage() {
>> +    echo "Sample script for extracting instructions and map data out of"
>> +    echo "autogenerated eBPF lskel headers"
>> +    echo ""
>> +    echo "USAGE: header_file output_file"
>> +    exit
>> +}
>> +
>> +ARGC=$#
>> +
>> +EXPECTED_ARGS=2
>> +
>> +if [ $ARGC -ne $EXPECTED_ARGS ] ; then
>> +    usage
>> +else
>> +    printf $(gcc -E $1 | grep "static const char opts_insn" | \
>> +		 awk -F"=" '{print $2}' | sed 's/;\+$//' | sed 's/\"//g') > $2
>> +    printf $(gcc -E $1 | grep "static const char opts_data" | \
>> +		 awk -F"=" '{print $2}' | sed 's/;\+$//' | sed 's/\"//g') >> $2
>> +fi
>> -- 
>> 2.48.1
>> 
>
> BR, Jarkko

Blaise Boscaccy March 31, 2025, 8:08 p.m. UTC | #4
sergeh@kernel.org writes:

> On Fri, Mar 21, 2025 at 09:45:03AM -0700, Blaise Boscaccy wrote:
>> This adds the Hornet Linux Security Module which provides signature
>> verification of eBPF programs.
>> 
>> Hornet uses a similar signature verification scheme similar to that of
>
> used 'similar' twice
>
>> kernel modules. A pkcs#7 signature is appended to the end of an
>> executable file. During an invocation of bpf_prog_load, the signature
>> is fetched from the current task's executable file. That signature is
>> used to verify the integrity of the bpf instructions and maps which
>> where passed into the kernel. Additionally, Hornet implicitly trusts any
>
> s/where/were
>
>> programs which where loaded from inside kernel rather than userspace,
>
> s/where/were
>
>> which allows BPF_PRELOAD programs along with outputs for BPF_SYSCALL
>> programs to run.
>> 
>> Hornet allows users to continue to maintain an invariant that all code
>> running inside of the kernel has been signed and works well with
>> light-skeleton based loaders, or any statically generated program that
>> doesn't require userspace instruction rewriting.
>> 
>> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
>> ---
>>  Documentation/admin-guide/LSM/Hornet.rst |  51 +++++
>>  crypto/asymmetric_keys/pkcs7_verify.c    |  10 +
>>  include/linux/kernel_read_file.h         |   1 +
>>  include/linux/verification.h             |   1 +
>>  include/uapi/linux/lsm.h                 |   1 +
>>  security/Kconfig                         |   3 +-
>>  security/Makefile                        |   1 +
>>  security/hornet/Kconfig                  |  11 ++
>>  security/hornet/Makefile                 |   4 +
>>  security/hornet/hornet_lsm.c             | 239 +++++++++++++++++++++++
>>  10 files changed, 321 insertions(+), 1 deletion(-)
>>  create mode 100644 Documentation/admin-guide/LSM/Hornet.rst
>>  create mode 100644 security/hornet/Kconfig
>>  create mode 100644 security/hornet/Makefile
>>  create mode 100644 security/hornet/hornet_lsm.c
>> 
>> diff --git a/Documentation/admin-guide/LSM/Hornet.rst b/Documentation/admin-guide/LSM/Hornet.rst
>> new file mode 100644
>> index 0000000000000..fa112412638f1
>> --- /dev/null
>> +++ b/Documentation/admin-guide/LSM/Hornet.rst
>> @@ -0,0 +1,51 @@
>> +======
>> +Hornet
>> +======
>> +
>> +Hornet is a Linux Security Module that provides signature verification
>> +for eBPF programs. This is selectable at build-time with
>> +``CONFIG_SECURITY_HORNET``.
>> +
>> +Overview
>> +========
>> +
>> +Hornet provides signature verification for eBPF programs by utilizing
>> +the existing PKCS#7 infrastructure that's used for module signature
>> +verification. Hornet works by creating a buffer containing the eBPF
>> +program instructions along with its associated maps and checking a
>> +signature against that buffer. The signature is appended to the end of
>> +the lskel executable file and is extracted at runtime via
>> +get_task_exe_file. Hornet works by hooking into the
>> +security_bpf_prog_load hook. Load invocations that originate from the
>> +kernel (bpf preload, results of bpf_syscall programs, etc.) are
>> +allowed to run unconditionally. Calls that originate from userspace
>> +require signature verification. If signature verification fails, the
>> +program will fail to load.
>> +
>> +Instruction/Map Ordering
>> +========================
>> +
>> +Hornet supports both sparse-array based maps via map discovery along
>> +with the newly added fd_array_cnt API for continuous map arrays. The
>> +buffer used for signature verification is assumed to be the
>> +instructions followed by all maps used, ordered by their index in
>> +fd_array.
>> +
>> +Tooling
>> +=======
>> +
>> +Some tooling is provided to aid with the development of signed eBPF lskels.
>> +
>> +extract-skel.sh
>> +---------------
>> +
>> +This simple shell script extracts the instructions and map data used
>> +by the light skeleton from the autogenerated header file created by
>> +bpftool.
>> +
>> +sign-ebpf
>> +---------
>> +
>> +sign-ebpf works similarly to the sign-file script with one key
>> +difference: it takes a separate input binary used for signature
>> +verification and will append the signature to a different output file.
>> diff --git a/crypto/asymmetric_keys/pkcs7_verify.c b/crypto/asymmetric_keys/pkcs7_verify.c
>> index f0d4ff3c20a83..1a5fbb3612188 100644
>> --- a/crypto/asymmetric_keys/pkcs7_verify.c
>> +++ b/crypto/asymmetric_keys/pkcs7_verify.c
>> @@ -428,6 +428,16 @@ int pkcs7_verify(struct pkcs7_message *pkcs7,
>>  		}
>>  		/* Authattr presence checked in parser */
>>  		break;
>> +	case VERIFYING_EBPF_SIGNATURE:
>> +		if (pkcs7->data_type != OID_data) {
>> +			pr_warn("Invalid ebpf sig (not pkcs7-data)\n");
>> +			return -EKEYREJECTED;
>> +		}
>> +		if (pkcs7->have_authattrs) {
>> +			pr_warn("Invalid ebpf sig (has authattrs)\n");
>> +			return -EKEYREJECTED;
>> +		}
>> +		break;
>>  	case VERIFYING_UNSPECIFIED_SIGNATURE:
>>  		if (pkcs7->data_type != OID_data) {
>>  			pr_warn("Invalid unspecified sig (not pkcs7-data)\n");
>> diff --git a/include/linux/kernel_read_file.h b/include/linux/kernel_read_file.h
>> index 90451e2e12bd1..7ed9337be5423 100644
>> --- a/include/linux/kernel_read_file.h
>> +++ b/include/linux/kernel_read_file.h
>> @@ -14,6 +14,7 @@
>>  	id(KEXEC_INITRAMFS, kexec-initramfs)	\
>>  	id(POLICY, security-policy)		\
>>  	id(X509_CERTIFICATE, x509-certificate)	\
>> +	id(EBPF, ebpf)				\
>>  	id(MAX_ID, )
>>  
>>  #define __fid_enumify(ENUM, dummy) READING_ ## ENUM,
>> diff --git a/include/linux/verification.h b/include/linux/verification.h
>> index 4f3022d081c31..812be8ad5f744 100644
>> --- a/include/linux/verification.h
>> +++ b/include/linux/verification.h
>> @@ -35,6 +35,7 @@ enum key_being_used_for {
>>  	VERIFYING_KEXEC_PE_SIGNATURE,
>>  	VERIFYING_KEY_SIGNATURE,
>>  	VERIFYING_KEY_SELF_SIGNATURE,
>> +	VERIFYING_EBPF_SIGNATURE,
>>  	VERIFYING_UNSPECIFIED_SIGNATURE,
>>  	NR__KEY_BEING_USED_FOR
>>  };
>> diff --git a/include/uapi/linux/lsm.h b/include/uapi/linux/lsm.h
>> index 938593dfd5daf..2ff9bcdd551e2 100644
>> --- a/include/uapi/linux/lsm.h
>> +++ b/include/uapi/linux/lsm.h
>> @@ -65,6 +65,7 @@ struct lsm_ctx {
>>  #define LSM_ID_IMA		111
>>  #define LSM_ID_EVM		112
>>  #define LSM_ID_IPE		113
>> +#define LSM_ID_HORNET		114
>>  
>>  /*
>>   * LSM_ATTR_XXX definitions identify different LSM attributes
>> diff --git a/security/Kconfig b/security/Kconfig
>> index f10dbf15c2947..0030f0224c7ab 100644
>> --- a/security/Kconfig
>> +++ b/security/Kconfig
>> @@ -230,6 +230,7 @@ source "security/safesetid/Kconfig"
>>  source "security/lockdown/Kconfig"
>>  source "security/landlock/Kconfig"
>>  source "security/ipe/Kconfig"
>> +source "security/hornet/Kconfig"
>>  
>>  source "security/integrity/Kconfig"
>>  
>> @@ -273,7 +274,7 @@ config LSM
>>  	default "landlock,lockdown,yama,loadpin,safesetid,apparmor,selinux,smack,tomoyo,ipe,bpf" if DEFAULT_SECURITY_APPARMOR
>>  	default "landlock,lockdown,yama,loadpin,safesetid,tomoyo,ipe,bpf" if DEFAULT_SECURITY_TOMOYO
>>  	default "landlock,lockdown,yama,loadpin,safesetid,ipe,bpf" if DEFAULT_SECURITY_DAC
>> -	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,ipe,bpf"
>> +	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,ipe,hornet,bpf"
>>  	help
>>  	  A comma-separated list of LSMs, in initialization order.
>>  	  Any LSMs left off this list, except for those with order
>> diff --git a/security/Makefile b/security/Makefile
>> index 22ff4c8bd8cec..e24bccd951f88 100644
>> --- a/security/Makefile
>> +++ b/security/Makefile
>> @@ -26,6 +26,7 @@ obj-$(CONFIG_CGROUPS)			+= device_cgroup.o
>>  obj-$(CONFIG_BPF_LSM)			+= bpf/
>>  obj-$(CONFIG_SECURITY_LANDLOCK)		+= landlock/
>>  obj-$(CONFIG_SECURITY_IPE)		+= ipe/
>> +obj-$(CONFIG_SECURITY_HORNET)		+= hornet/
>>  
>>  # Object integrity file lists
>>  obj-$(CONFIG_INTEGRITY)			+= integrity/
>> diff --git a/security/hornet/Kconfig b/security/hornet/Kconfig
>> new file mode 100644
>> index 0000000000000..19406aa237ac6
>> --- /dev/null
>> +++ b/security/hornet/Kconfig
>> @@ -0,0 +1,11 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +config SECURITY_HORNET
>> +	bool "Hornet support"
>> +	depends on SECURITY
>> +	default n
>> +	help
>> +	  This selects Hornet.
>> +	  Further information can be found in
>> +	  Documentation/admin-guide/LSM/Hornet.rst.
>> +
>> +	  If you are unsure how to answer this question, answer N.
>> diff --git a/security/hornet/Makefile b/security/hornet/Makefile
>> new file mode 100644
>> index 0000000000000..79f4657b215fa
>> --- /dev/null
>> +++ b/security/hornet/Makefile
>> @@ -0,0 +1,4 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +obj-$(CONFIG_SECURITY_HORNET) := hornet.o
>> +
>> +hornet-y := hornet_lsm.o
>> diff --git a/security/hornet/hornet_lsm.c b/security/hornet/hornet_lsm.c
>> new file mode 100644
>> index 0000000000000..3616c68b76fbc
>> --- /dev/null
>> +++ b/security/hornet/hornet_lsm.c
>> @@ -0,0 +1,239 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Hornet Linux Security Module
>> + *
>> + * Author: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
>> + *
>> + * Copyright (C) 2025 Microsoft Corporation
>> + */
>> +
>> +#include <linux/lsm_hooks.h>
>> +#include <uapi/linux/lsm.h>
>> +#include <linux/bpf.h>
>> +#include <linux/verification.h>
>> +#include <crypto/public_key.h>
>> +#include <linux/module_signature.h>
>> +#include <crypto/pkcs7.h>
>> +#include <linux/bpf_verifier.h>
>> +#include <linux/sort.h>
>> +
>> +#define EBPF_SIG_STRING "~eBPF signature appended~\n"
>> +
>> +struct hornet_maps {
>> +	u32 used_idx[MAX_USED_MAPS];
>> +	u32 used_map_cnt;
>> +	bpfptr_t fd_array;
>> +};
>> +
>> +static int cmp_idx(const void *a, const void *b)
>> +{
>> +	return *(const u32 *)a - *(const u32 *)b;
>> +}
>> +
>> +static int add_used_map(struct hornet_maps *maps, int idx)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < maps->used_map_cnt; i++)
>> +		if (maps->used_idx[i] == idx)
>> +			return i;
>> +
>> +	if (maps->used_map_cnt >= MAX_USED_MAPS)
>> +		return -E2BIG;
>> +
>> +	maps->used_idx[maps->used_map_cnt] = idx;
>> +	return maps->used_map_cnt++;
>> +}
>> +
>> +static int hornet_find_maps(struct bpf_prog *prog, struct hornet_maps *maps)
>> +{
>> +	struct bpf_insn *insn = prog->insnsi;
>> +	int insn_cnt = prog->len;
>> +	int i;
>> +	int err;
>> +
>> +	for (i = 0; i < insn_cnt; i++, insn++) {
>> +		if (insn[0].code == (BPF_LD | BPF_IMM | BPF_DW)) {
>> +			switch (insn[0].src_reg) {
>> +			case BPF_PSEUDO_MAP_IDX_VALUE:
>> +			case BPF_PSEUDO_MAP_IDX:
>> +				err = add_used_map(maps, insn[0].imm);
>> +				if (err < 0)
>> +					return err;
>> +				break;
>> +			default:
>> +				break;
>> +			}
>> +		}
>> +	}
>> +	/* Sort the spare-array indices. This should match the map ordering used during
>> +	 * signature generation
>> +	 */
>> +	sort(maps->used_idx, maps->used_map_cnt, sizeof(*maps->used_idx),
>> +	     cmp_idx, NULL);
>> +
>> +	return 0;
>> +}
>> +
>> +static int hornet_populate_fd_array(struct hornet_maps *maps, u32 fd_array_cnt)
>> +{
>> +	int i;
>> +
>> +	if (fd_array_cnt > MAX_USED_MAPS)
>> +		return -E2BIG;
>> +
>> +	for (i = 0; i < fd_array_cnt; i++)
>> +		maps->used_idx[i] = i;
>> +
>> +	maps->used_map_cnt = fd_array_cnt;
>> +	return 0;
>> +}
>> +
>> +/* kern_sys_bpf is declared as an EXPORT_SYMBOL in kernel/bpf/syscall.c, however no definition is
>> + * provided in any bpf header files. If/when this function has a proper definition provided
>> + * somewhere this declaration should be removed
>> + */
>> +int kern_sys_bpf(int cmd, union bpf_attr *attr, unsigned int size);
>> +
>> +static int hornet_verify_lskel(struct bpf_prog *prog, struct hornet_maps *maps,
>> +			       void *sig, size_t sig_len)
>> +{
>> +	int fd;
>> +	u32 i;
>> +	void *buf;
>> +	void *new;
>> +	size_t buf_sz;
>> +	struct bpf_map *map;
>> +	int err = 0;
>> +	int key = 0;
>> +	union bpf_attr attr = {0};
>> +
>> +	buf = kmalloc_array(prog->len, sizeof(struct bpf_insn), GFP_KERNEL);
>> +	if (!buf)
>> +		return -ENOMEM;
>> +	buf_sz = prog->len * sizeof(struct bpf_insn);
>> +	memcpy(buf, prog->insnsi, buf_sz);
>> +
>> +	for (i = 0; i < maps->used_map_cnt; i++) {
>> +		err = copy_from_bpfptr_offset(&fd, maps->fd_array,
>> +					      maps->used_idx[i] * sizeof(fd),
>> +					      sizeof(fd));
>> +		if (err < 0)
>> +			continue;
>> +		if (fd < 1)
>> +			continue;
>> +
>> +		map = bpf_map_get(fd);
>> +		if (IS_ERR(map))
>> +			continue;
>> +
>> +		/* don't allow userspace to change map data used for signature verification */
>> +		if (!map->frozen) {
>> +			attr.map_fd = fd;
>> +			err = kern_sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr));
>> +			if (err < 0)
>> +				goto out;
>> +		}
>> +
>> +		new = krealloc(buf, buf_sz + map->value_size, GFP_KERNEL);
>> +		if (!new) {
>> +			err = -ENOMEM;
>> +			goto out;
>> +		}
>> +		buf = new;
>> +		new = map->ops->map_lookup_elem(map, &key);
>> +		if (!new) {
>> +			err = -ENOENT;
>> +			goto out;
>> +		}
>> +		memcpy(buf + buf_sz, new, map->value_size);
>> +		buf_sz += map->value_size;
>> +	}
>> +
>> +	err = verify_pkcs7_signature(buf, buf_sz, sig, sig_len,
>> +				     VERIFY_USE_SECONDARY_KEYRING,
>> +				     VERIFYING_EBPF_SIGNATURE,
>> +				     NULL, NULL);
>> +out:
>> +	kfree(buf);
>> +	return err;
>> +}
>> +
>> +static int hornet_check_binary(struct bpf_prog *prog, union bpf_attr *attr,
>> +			       struct hornet_maps *maps)
>> +{
>> +	struct file *file = get_task_exe_file(current);
>> +	const unsigned long markerlen = sizeof(EBPF_SIG_STRING) - 1;
>> +	void *buf = NULL;
>> +	size_t sz = 0, sig_len, prog_len, buf_sz;
>> +	int err = 0;
>> +	struct module_signature sig;
>> +
>> +	buf_sz = kernel_read_file(file, 0, &buf, INT_MAX, &sz, READING_EBPF);
>> +	fput(file);
>> +	if (!buf_sz)
>> +		return -1;
>> +
>> +	prog_len = buf_sz;
>> +
>> +	if (prog_len > markerlen &&
>> +	    memcmp(buf + prog_len - markerlen, EBPF_SIG_STRING, markerlen) == 0)
>> +		prog_len -= markerlen;
>> +
>> +	memcpy(&sig, buf + (prog_len - sizeof(sig)), sizeof(sig));
>> +	sig_len = be32_to_cpu(sig.sig_len);
>> +	prog_len -= sig_len + sizeof(sig);
>> +
>> +	err = mod_check_sig(&sig, prog->len * sizeof(struct bpf_insn), "ebpf");
>> +	if (err)
>> +		return err;
>> +	return hornet_verify_lskel(prog, maps, buf + prog_len, sig_len);
>> +}
>> +
>> +static int hornet_check_signature(struct bpf_prog *prog, union bpf_attr *attr,
>> +				  struct bpf_token *token, bool is_kernel)
>
> It's a little confusing that you are passing is_kernel in here, when the
> only caller will always pass in true.  Is there a good reason not to
> drop the arg here and pass 'true' in to make_bpfptr().  Of course, then
> people will ask why not define an IS_KERNEL to true as passing true to
> second argument is cryptic...  Maybe you just can't win here :)
>

Initially, during development churn, this code used a bpfptr_t, which
ended up becoming a boolean flag in the LSM hooks; this appears to be a
relic of that. I think I'll remove the boolean parameter from
hornet_check_signature, since this code is only interested in checking
stuff that originated in userspace.

>> +{
>> +	struct hornet_maps maps = {0};
>> +	int err;
>> +
>> +	/* support both sparse arrays and explicit continuous arrays of map fds */
>> +	if (attr->fd_array_cnt)
>> +		err = hornet_populate_fd_array(&maps, attr->fd_array_cnt);
>> +	else
>> +		err = hornet_find_maps(prog, &maps);
>> +
>> +	if (err < 0)
>> +		return err;
>> +
>> +	maps.fd_array = make_bpfptr(attr->fd_array, is_kernel);
>> +	return hornet_check_binary(prog, attr, &maps);
>> +}
>> +
>> +static int hornet_bpf_prog_load(struct bpf_prog *prog, union bpf_attr *attr,
>> +				struct bpf_token *token, bool is_kernel)
>> +{
>> +	if (is_kernel)
>> +		return 0;
>> +	return hornet_check_signature(prog, attr, token, is_kernel);
>> +}
>> +
>> +static struct security_hook_list hornet_hooks[] __ro_after_init = {
>> +	LSM_HOOK_INIT(bpf_prog_load, hornet_bpf_prog_load),
>> +};
>> +
>> +static const struct lsm_id hornet_lsmid = {
>> +	.name = "hornet",
>> +	.id = LSM_ID_HORNET,
>> +};
>> +
>> +static int __init hornet_init(void)
>> +{
>> +	pr_info("Hornet: eBPF signature verification enabled\n");
>> +	security_add_hooks(hornet_hooks, ARRAY_SIZE(hornet_hooks), &hornet_lsmid);
>> +	return 0;
>> +}
>> +
>> +DEFINE_LSM(hornet) = {
>> +	.name = "hornet",
>> +	.init = hornet_init,
>> +};
>> -- 
>> 2.48.1
>>
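
For readers following the layout that hornet_check_binary parses (signature blob, then a module_signature-style trailer, then the marker string), here is a minimal userspace sketch of locating the appended signature. It is an illustration only, not code from the patch: the toy_* names are hypothetical, the trailer struct is a byte-wise mirror of the kernel's struct module_signature, and unlike the patch it treats the marker as mandatory and bounds-checks every length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SIG_MARKER "~eBPF signature appended~\n"

/* Byte-wise mirror of struct module_signature (12 bytes, sig_len big-endian). */
struct toy_sig_info {
	uint8_t algo, hash, id_type, signer_len, key_id_len;
	uint8_t pad[3];
	uint8_t sig_len_be[4];
};

/*
 * Expected layout: [executable][signature][toy_sig_info][marker].
 * Returns 0 and sets *sig and *sig_len on success, -1 if malformed.
 */
int toy_find_sig(const unsigned char *buf, size_t len,
		 const unsigned char **sig, size_t *sig_len)
{
	const size_t marker_len = sizeof(TOY_SIG_MARKER) - 1;
	struct toy_sig_info info;
	size_t n;

	/* The file must end with the marker string. */
	if (len < marker_len ||
	    memcmp(buf + len - marker_len, TOY_SIG_MARKER, marker_len) != 0)
		return -1;
	len -= marker_len;

	/* The trailer sits immediately before the marker. */
	if (len < sizeof(info))
		return -1;
	memcpy(&info, buf + len - sizeof(info), sizeof(info));
	len -= sizeof(info);

	/* Decode the big-endian signature length and bounds-check it. */
	n = ((size_t)info.sig_len_be[0] << 24) |
	    ((size_t)info.sig_len_be[1] << 16) |
	    ((size_t)info.sig_len_be[2] << 8) |
	    (size_t)info.sig_len_be[3];
	if (n == 0 || n > len)
		return -1;

	*sig = buf + len - n;
	*sig_len = n;
	return 0;
}
```

The explicit length checks are a design choice of this sketch; they keep a truncated or hostile file from producing out-of-range offsets.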

Blaise Boscaccy March 31, 2025, 8:09 p.m. UTC | #5
Jonathan Corbet <corbet@lwn.net> writes:

> Blaise Boscaccy <bboscaccy@linux.microsoft.com> writes:
>
>> This adds the Hornet Linux Security Module which provides signature
>> verification of eBPF programs.
>>
>> Hornet uses a similar signature verification scheme similar to that of
>> kernel modules. A pkcs#7 signature is appended to the end of an
>> executable file. During an invocation of bpf_prog_load, the signature
>> is fetched from the current task's executable file. That signature is
>> used to verify the integrity of the bpf instructions and maps which
>> where passed into the kernel. Additionally, Hornet implicitly trusts any
>> programs which where loaded from inside kernel rather than userspace,
>> which allows BPF_PRELOAD programs along with outputs for BPF_SYSCALL
>> programs to run.
>>
>> Hornet allows users to continue to maintain an invariant that all code
>> running inside of the kernel has been signed and works well with
>> light-skeleton based loaders, or any statically generated program that
>> doesn't require userspace instruction rewriting.
>>
>> Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
>> ---
>>  Documentation/admin-guide/LSM/Hornet.rst |  51 +++++
>
> You will need to add that file to .../index.rst, or it won't be included
> in the docs build.
>
> Thanks,
>
> jon

Good catch, will get that fixed. Thanks Jon.

Blaise Boscaccy March 31, 2025, 8:57 p.m. UTC | #6
Jarkko Sakkinen <jarkko@kernel.org> writes:

Hi Jarkko,

Thanks for the comments. Paul did a very nice job providing some
background info; allow me to provide some additional data.

> On Fri, Mar 21, 2025 at 09:45:02AM -0700, Blaise Boscaccy wrote:
>> This patch series introduces the Hornet LSM.
>> 
>> Hornet takes a simple approach to light-skeleton-based eBPF signature
>
> Can you define "light-skeleton-based" before using the term.
>
> This is the first time in my life when I hear about it.
>

Sure. Here is the patchset where this stuff got introduced if you are
curious.
https://lore.kernel.org/bpf/20220209054315.73833-1-alexei.starovoitov@gmail.com/

eBPF has loading requirements similar to those of modules: finding
kallsyms addresses, fixing up ELF relocations, some struct field offset
handling stuff called CO-RE (compile once, run everywhere), and some
other miscellaneous bookkeeping.  During eBPF program compilation,
pseudo-values get written to the immediate operands of instructions.
During loading, those pseudo-values get rewritten with concrete
addresses or data applicable to the currently running system, e.g. a
kallsyms address or an fd for a map. This needs to happen before the
instructions for a bpf program are loaded into the kernel via the
bpf() syscall.

Unlike for modules, an in-kernel loader unfortunately doesn't
exist. Typically, the instruction rewriting is done dynamically in
userspace via libbpf (or one of the Rust/Go/Python loaders). What
skeletons do is generate a script of required instruction-rewriting
operations which then gets played back at load-time against a
hard-coded blob of raw instruction data. This removes the need to
distribute source-code or object files.
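
To make the "script played back against a blob" idea concrete, here is a minimal userspace sketch. It is an illustration under stated assumptions, not libbpf's or bpftool's actual format: struct toy_insn only loosely mimics a BPF instruction (opcode, packed registers, offset, 32-bit immediate), and struct toy_reloc and toy_apply_relocs are hypothetical names for a recorded fix-up and its playback loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a BPF instruction; only the immediate is rewritten here. */
struct toy_insn {
	uint8_t code;	/* opcode */
	uint8_t regs;	/* packed destination/source registers */
	int16_t off;
	int32_t imm;	/* holds a pseudo-value until load time */
};

/* One recorded fix-up (hypothetical format): which instruction, what value. */
struct toy_reloc {
	size_t insn_idx;
	int32_t value;	/* e.g. a map fd resolved on the running system */
};

/*
 * Play the recorded fix-ups back against the instruction blob.
 * Returns 0 on success, -1 if a fix-up points outside the blob.
 */
int toy_apply_relocs(struct toy_insn *insns, size_t insn_cnt,
		     const struct toy_reloc *relocs, size_t reloc_cnt)
{
	size_t i;

	for (i = 0; i < reloc_cnt; i++) {
		if (relocs[i].insn_idx >= insn_cnt)
			return -1;
		insns[relocs[i].insn_idx].imm = relocs[i].value;
	}
	return 0;
}
```

The point of the sketch is that the instruction bytes that finally reach the kernel differ from the bytes shipped in the binary, which is why a signature over the shipped bytes alone does not cover a userspace-rewritten program.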

There are two flavors of skeletons: normal skeletons and light
skeletons. Normal skeletons utilize relocation logic that lives in
libbpf, and the relocations/instruction rewriting happen in userspace.
The second flavor, light skeletons, uses a small eBPF program that
contains the relocation lookup logic. As it's running in the kernel,
it unpacks the target program, performs the instruction rewriting, and
loads the target program. Light skeletons are currently utilized for
some drivers and for BPF_PRELOAD functionality, since they can operate
without userspace.

Light skeletons were recommended on various mailing list discussions as
the preferred path to performing signature verification. There are some
PoCs floating around that used light-skeletons in concert with
fs-verity/IMA and eBPF LSMs. We took a slightly different approach with
Hornet, utilizing the existing PKCS#7 signing scheme that is used for
kernel modules.

>> verification. Signature data can be easily generated for the binary
>
> s/easily//
>
> Useless word having no measure.
>

Ack, thanks.


>> data that is generated via bpftool gen -L. This signature can be
>
> I have no idea what that command does.
>
> "Signature data can be generated for the binary data as follows:
>
> bpftool gen -L
>
> <explanation>"
>
> Here you'd need to answer to couple of unknowns:
>
> 1. What is in exact terms "signature data"?

That is a PKCS#7 signature of a data buffer containing the raw
instructions of an eBPF program, followed by the initial values of any
maps used by the program. 
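
As a rough illustration of that buffer layout (instructions first, then the initial map values ordered by their fd_array index), here is a userspace sketch. It is not the sign-ebpf implementation: struct toy_map and toy_build_payload are hypothetical names, and the index-ordered concatenation is taken from the Hornet.rst description in the series.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one map: fd_array index plus initial value. */
struct toy_map {
	uint32_t idx;
	const void *value;
	size_t value_size;
};

/* Order maps by index without relying on unsigned subtraction. */
static int toy_cmp_idx(const void *a, const void *b)
{
	uint32_t x = ((const struct toy_map *)a)->idx;
	uint32_t y = ((const struct toy_map *)b)->idx;

	return (x > y) - (x < y);
}

/*
 * Build the buffer to be signed: raw instructions followed by each map's
 * initial value in index order. Sorts maps in place. Returns a malloc'd
 * buffer (caller frees) and its length in *out_len, or NULL on failure.
 */
void *toy_build_payload(const void *insns, size_t insn_len,
			struct toy_map *maps, size_t map_cnt, size_t *out_len)
{
	size_t total = insn_len, off, i;
	unsigned char *buf;

	for (i = 0; i < map_cnt; i++) {
		if (maps[i].value_size > SIZE_MAX - total)
			return NULL;	/* total length would overflow */
		total += maps[i].value_size;
	}

	buf = malloc(total ? total : 1);
	if (!buf)
		return NULL;

	if (map_cnt)
		qsort(maps, map_cnt, sizeof(*maps), toy_cmp_idx);

	if (insn_len)
		memcpy(buf, insns, insn_len);
	off = insn_len;
	for (i = 0; i < map_cnt; i++) {
		if (maps[i].value_size)
			memcpy(buf + off, maps[i].value, maps[i].value_size);
		off += maps[i].value_size;
	}

	*out_len = total;
	return buf;
}
```

The signer and the verifier must agree on this ordering byte for byte, so any sketch like this only works if both sides sort the maps the same way.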

> 2. What does "bpftool gen -L" do?
>

eBPF programs often have two parts: an orchestrator/loader program that
provides load -> attach/run -> i/o -> teardown logic, and the in-kernel
program.

That command is used to generate a skeleton which can be used by the
orchestrator program. Skeletons get generated as a C header file that
contains various autogenerated functions that open and load bpf programs
as described above. That header file ends up being included in a
userspace orchestrator program or possibly a kernel module.

> This feedback maps to other examples too in the cover letter.
>
> BR, Jarkko


I'll rework this with some definitions of the eBPF subsystem jargon
along with your suggestions.

-blaise

Jarkko Sakkinen April 1, 2025, 3:50 p.m. UTC | #7
On Mon, Mar 31, 2025 at 01:57:15PM -0700, Blaise Boscaccy wrote:
> There are two flavors of skeletons, normal skeletons, and light
> skeletons. Normal skeletons utilize relocation logic that lives in
> libbpf, and the relocations/instruction rewriting happen in userspace.
> The second flavor, light skeletons, uses a small eBPF program that
> contains the relocation lookup logic. As it's running in the kernel,
> it unpacks the target program, performs the instruction rewriting, and
> loads the target program. Light skeletons are currently utilized for
> some drivers, and BPF_PRELOAD functionality since they can operate
> without userspace.
> 
> Light skeletons were recommended on various mailing list discussions as
> the preferred path to performing signature verification. There are some
> PoCs floating around that used light-skeletons in concert with
> fs-verity/IMA and eBPF LSMs. We took a slightly different approach to
> Hornet, by utilizing the existing PKCS#7 signing scheme that is used for
> kernel modules.

Right, because in the normal skeletons relocation logic remains
unsigned?

I have to admit I don't fully grasp how the relocation process translates
into an eBPF program, but I do get how it is better for signatures if it
does :-)

> 
> >> verification. Signature data can be easily generated for the binary
> >
> > s/easily//
> >
> > Useless word having no measure.
> >
> 
> Ack, thanks.
> 
> 
> >> data that is generated via bpftool gen -L. This signature can be
> >
> > I have no idea what that command does.
> >
> > "Signature data can be generated for the binary data as follows:
> >
> > bpftool gen -L
> >
> > <explanation>"
> >
> > Here you'd need to answer to couple of unknowns:
> >
> > 1. What is in exact terms "signature data"?
> 
> That is a PKCS#7 signature of a data buffer containing the raw
> instructions of an eBPF program, followed by the initial values of any
> maps used by the program. 

Got it, thanks. This motivates me to refine my TPM2 asymmetric keys
series so that TPM2 could anchor these :-)

https://lore.kernel.org/linux-integrity/20240528210823.28798-1-jarkko@kernel.org/


> 
> > 2. What does "bpftool gen -L" do?
> >
> 
> eBPF programs often have 2 parts. An orchestrator/loader program that
> provides load -> attach/run -> i/o -> teardown logic and the in-kernel
> program.
> 
> That command is used to generate a skeleton which can be used by the
> orchestrator program. Skeletons get generated as a C header file, that
> contains various autogenerated functions that open and load bpf programs
> as described above. That header file ends up being included in a
> userspace orchestrator program or possibly a kernel module.

I did read the man page now too, but thanks for the commentary!

> 
> > This feedback maps to other examples too in the cover letter.
> >
> > BR, Jarkko
> 
> 
> I'll rework this with some definitions of the eBPF subsystem jargon
> along with your suggestions.

Yeah, you should be able to put the gist in a nutshell a factor better :-)

> 
> -blaise

BR, Jarkko
Blaise Boscaccy April 1, 2025, 6:56 p.m. UTC | #8
Jarkko Sakkinen <jarkko@kernel.org> writes:

> On Mon, Mar 31, 2025 at 01:57:15PM -0700, Blaise Boscaccy wrote:
>> There are two flavors of skeletons, normal skeletons, and light
>> skeletons. Normal skeletons utilize relocation logic that lives in
>> libbpf, and the relocations/instruction rewriting happen in userspace.
>> The second flavor, light skeletons, uses a small eBPF program that
>> contains the relocation lookup logic. As it's running in the kernel,
>> it unpacks the target program, performs the instruction rewriting, and
>> loads the target program. Light skeletons are currently utilized for
>> some drivers, and BPF_PRELOAD functionality since they can operate
>> without userspace.
>> 
>> Light skeletons were recommended on various mailing list discussions as
>> the preferred path to performing signature verification. There are some
>> PoCs floating around that used light-skeletons in concert with
>> fs-verity/IMA and eBPF LSMs. We took a slightly different approach to
>> Hornet, by utilizing the existing PKCS#7 signing scheme that is used for
>> kernel modules.
>
> Right, because in the normal skeletons relocation logic remains
> unsigned?
>

Yup, exactly.

> I have to admit I don't fully grasp how the relocation process translates
> into an eBPF program, but I do get how it is better for signatures if it
> does :-)
>
>> 
>> >> verification. Signature data can be easily generated for the binary
>> >
>> > s/easily//
>> >
>> > Useless word having no measure.
>> >
>> 
>> Ack, thanks.
>> 
>> 
>> >> data that is generated via bpftool gen -L. This signature can be
>> >
>> > I have no idea what that command does.
>> >
>> > "Signature data can be generated for the binary data as follows:
>> >
>> > bpftool gen -L
>> >
>> > <explanation>"
>> >
>> > Here you'd need to answer to couple of unknowns:
>> >
>> > 1. What is in exact terms "signature data"?
>> 
>> That is a PKCS#7 signature of a data buffer containing the raw
>> instructions of an eBPF program, followed by the initial values of any
>> maps used by the program. 
>
> Got it, thanks. This motivates me to refine my TPM2 asymmetric keys
> series so that TPM2 could anchor these :-)
>
> https://lore.kernel.org/linux-integrity/20240528210823.28798-1-jarkko@kernel.org/
>
>

Oooh. That would be very nice :) 

>> 
>> > 2. What does "bpftool gen -L" do?
>> >
>> 
>> eBPF programs often have 2 parts. An orchestrator/loader program that
>> provides load -> attach/run -> i/o -> teardown logic and the in-kernel
>> program.
>> 
>> That command is used to generate a skeleton which can be used by the
>> orchestrator program. Skeletons get generated as a C header file, that
>> contains various autogenerated functions that open and load bpf programs
>> as described above. That header file ends up being included in a
>> userspace orchestrator program or possibly a kernel module.
>
> I did read the man page now too, but thanks for the commentary!
>
>> 
>> > This feedback maps to other examples too in the cover letter.
>> >
>> > BR, Jarkko
>> 
>> 
>> I'll rework this with some definitions of the eBPF subsystem jargon
>> along with your suggestions.
>
> Yeah, you should be able to put the gist in a nutshell a factor better :-)
>
>> 
>> -blaise
>
> BR, Jarkko