Message ID | 20201202001616.3378929-11-andrii@kernel.org |
---|---|
State | New |
Headers | show |
Series | Support BTF-powered BPF tracing programs for kernel modules | expand |
On Tue, Dec 01, 2020 at 04:16:12PM -0800, Andrii Nakryiko wrote: > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index c3458ec1f30a..60b95b51ccb8 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -558,6 +558,7 @@ union bpf_attr { > __u32 line_info_cnt; /* number of bpf_line_info records */ > __u32 attach_btf_id; /* in-kernel BTF type id to attach to */ > __u32 attach_prog_fd; /* 0 to attach to vmlinux */ > + __u32 attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */ I think the uapi should use attach_btf_obj_fd here. Everywhere else uapi is using FDs to point to maps, progs, BTFs of progs. BTF of a module isn't different from BTF of a program. Looking at libbpf implementation... it has the FD of a module anyway, since it needs to fetch it to search for the function btf_id in there. So there won't be any inconvenience for libbpf to pass FD in here.
On Wed, Dec 2, 2020 at 12:58 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Tue, Dec 01, 2020 at 04:16:12PM -0800, Andrii Nakryiko wrote: > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > index c3458ec1f30a..60b95b51ccb8 100644 > > --- a/include/uapi/linux/bpf.h > > +++ b/include/uapi/linux/bpf.h > > @@ -558,6 +558,7 @@ union bpf_attr { > > __u32 line_info_cnt; /* number of bpf_line_info records */ > > __u32 attach_btf_id; /* in-kernel BTF type id to attach to */ > > __u32 attach_prog_fd; /* 0 to attach to vmlinux */ > > + __u32 attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */ > > I think the uapi should use attach_btf_obj_fd here. > Everywhere else uapi is using FDs to point to maps, progs, BTFs of progs. > BTF of a module isn't different from BTF of a program. > Looking at libbpf implementation... it has the FD of a module anyway, > since it needs to fetch it to search for the function btf_id in there. > So there won't be any inconvenience for libbpf to pass FD in here. > From the uapi perspective attach_btf_obj_fd will remove potential > race condition. It's very unlikely race, of course. Yes, I actually contemplated that, but my preference went the ID way, because it made libbpf implementation simpler and there was a nice duality of using ID for types and BTF instances themselves. The problem with FD is that when I load all module BTF objects, I open their FD one at a time, and close it as soon as I read BTF raw data back. If I don't do that on systems with many modules, I'll be keeping potentially hundreds of FDs open, so I figured I don't want to do that. But I do see the FD instead of ID consistency as well, so I can go with a simple and inefficient implementation of separate FD for each BTF object for now, and if someone complains, we can teach libbpf to lazily open FDs of module BTFs that are actually used (later, it will complicate code unnecessarily). Not really worried about racing with kernel modules being unloaded. Also, if we use FD, we might not need a new attach_bpf_obj_id field at all, we can re-use attach_prog_fd field (put it in union and have attach_prog_fd/attach_btf_fd). On the kernel side, it would be easy to check whether provided FD is for bpf_prog or btf. What do you think? Too mysterious? Or good? > > The rest of the series look good to me. Cool, thanks.
On Wed, Dec 2, 2020 at 2:43 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Wed, Dec 2, 2020 at 12:58 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Tue, Dec 01, 2020 at 04:16:12PM -0800, Andrii Nakryiko wrote: > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > > index c3458ec1f30a..60b95b51ccb8 100644 > > > --- a/include/uapi/linux/bpf.h > > > +++ b/include/uapi/linux/bpf.h > > > @@ -558,6 +558,7 @@ union bpf_attr { > > > __u32 line_info_cnt; /* number of bpf_line_info records */ > > > __u32 attach_btf_id; /* in-kernel BTF type id to attach to */ > > > __u32 attach_prog_fd; /* 0 to attach to vmlinux */ > > > + __u32 attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */ > > > > I think the uapi should use attach_btf_obj_fd here. > > Everywhere else uapi is using FDs to point to maps, progs, BTFs of progs. > > BTF of a module isn't different from BTF of a program. > > Looking at libbpf implementation... it has the FD of a module anyway, > > since it needs to fetch it to search for the function btf_id in there. > > So there won't be any inconvenience for libbpf to pass FD in here. > > From the uapi perspective attach_btf_obj_fd will remove potential > > race condition. It's very unlikely race, of course. > > Yes, I actually contemplated that, but my preference went the ID way, > because it made libbpf implementation simpler and there was a nice > duality of using ID for types and BTF instances themselves. > > The problem with FD is that when I load all module BTF objects, I open > their FD one at a time, and close it as soon as I read BTF raw data > back. If I don't do that on systems with many modules, I'll be keeping > potentially hundreds of FDs open, so I figured I don't want to do > that. > > But I do see the FD instead of ID consistency as well, so I can go > with a simple and inefficient implementation of separate FD for each > BTF object for now, and if someone complains, we can teach libbpf to > lazily open FDs of module BTFs that are actually used (later, it will > complicate code unnecessarily). Not really worried about racing with > kernel modules being unloaded. > > Also, if we use FD, we might not need a new attach_bpf_obj_id field at > all, we can re-use attach_prog_fd field (put it in union and have > attach_prog_fd/attach_btf_fd). On the kernel side, it would be easy to > check whether provided FD is for bpf_prog or btf. What do you think? > Too mysterious? Or good? You mean like: union { __u32 attach_prog_fd; /* valid prog_fd to attach to bpf prog */ __u32 attach_btf_obj_fd; /* or valid module BTF object fd or zero to attach to vmlinux */ }; or don't introduce a new field name at all? Sure. I'm fine with both. I think it's a good idea.
On Wed, Dec 2, 2020 at 3:12 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Wed, Dec 2, 2020 at 2:43 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Wed, Dec 2, 2020 at 12:58 PM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > On Tue, Dec 01, 2020 at 04:16:12PM -0800, Andrii Nakryiko wrote: > > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > > > > index c3458ec1f30a..60b95b51ccb8 100644 > > > > --- a/include/uapi/linux/bpf.h > > > > +++ b/include/uapi/linux/bpf.h > > > > @@ -558,6 +558,7 @@ union bpf_attr { > > > > __u32 line_info_cnt; /* number of bpf_line_info records */ > > > > __u32 attach_btf_id; /* in-kernel BTF type id to attach to */ > > > > __u32 attach_prog_fd; /* 0 to attach to vmlinux */ > > > > + __u32 attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */ > > > > > > I think the uapi should use attach_btf_obj_fd here. > > > Everywhere else uapi is using FDs to point to maps, progs, BTFs of progs. > > > BTF of a module isn't different from BTF of a program. > > > Looking at libbpf implementation... it has the FD of a module anyway, > > > since it needs to fetch it to search for the function btf_id in there. > > > So there won't be any inconvenience for libbpf to pass FD in here. > > > From the uapi perspective attach_btf_obj_fd will remove potential > > > race condition. It's very unlikely race, of course. > > > > Yes, I actually contemplated that, but my preference went the ID way, > > because it made libbpf implementation simpler and there was a nice > > duality of using ID for types and BTF instances themselves. > > > > The problem with FD is that when I load all module BTF objects, I open > > their FD one at a time, and close it as soon as I read BTF raw data > > back. If I don't do that on systems with many modules, I'll be keeping > > potentially hundreds of FDs open, so I figured I don't want to do > > that. > > > > But I do see the FD instead of ID consistency as well, so I can go > > with a simple and inefficient implementation of separate FD for each > > BTF object for now, and if someone complains, we can teach libbpf to > > lazily open FDs of module BTFs that are actually used (later, it will > > complicate code unnecessarily). Not really worried about racing with > > kernel modules being unloaded. > > > > Also, if we use FD, we might not need a new attach_bpf_obj_id field at > > all, we can re-use attach_prog_fd field (put it in union and have > > attach_prog_fd/attach_btf_fd). On the kernel side, it would be easy to > > check whether provided FD is for bpf_prog or btf. What do you think? > > Too mysterious? Or good? > > You mean like: > union { > __u32 attach_prog_fd; /* valid prog_fd to attach to > bpf prog */ > __u32 attach_btf_obj_fd; /* or valid module BTF > object fd or zero to attach to vmlinux */ > }; like this with union, an aliased field name with a meaningful name > or don't introduce a new field name at all? > Sure. I'm fine with both. I think it's a good idea. ok, will do this then
diff --git a/include/linux/btf.h b/include/linux/btf.h index fb608e4de076..53ce2bc6dc54 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -89,7 +89,9 @@ int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, char *buf, int len, u64 flags); int btf_get_fd_by_id(u32 id); +struct btf *btf_get_by_id(int id); u32 btf_obj_id(const struct btf *btf); +bool btf_is_kernel(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, const struct btf_member *m, u32 expected_offset, u32 expected_size); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c3458ec1f30a..60b95b51ccb8 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -558,6 +558,7 @@ union bpf_attr { __u32 line_info_cnt; /* number of bpf_line_info records */ __u32 attach_btf_id; /* in-kernel BTF type id to attach to */ __u32 attach_prog_fd; /* 0 to attach to vmlinux */ + __u32 attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */ }; struct { /* anonymous struct used by BPF_OBJ_* commands */ diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 7a19bf5bfe97..12876b272c6b 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5652,6 +5652,19 @@ struct btf *btf_get_by_fd(int fd) return btf; } +struct btf *btf_get_by_id(int id) +{ + struct btf *btf; + + rcu_read_lock(); + btf = idr_find(&btf_idr, id); + if (!btf || !refcount_inc_not_zero(&btf->refcnt)) + btf = ERR_PTR(-ENOENT); + rcu_read_unlock(); + + return btf; +} + int btf_get_info_by_fd(const struct btf *btf, const union bpf_attr *attr, union bpf_attr __user *uattr) @@ -5717,12 +5730,7 @@ int btf_get_fd_by_id(u32 id) struct btf *btf; int fd; - rcu_read_lock(); - btf = idr_find(&btf_idr, id); - if (!btf || !refcount_inc_not_zero(&btf->refcnt)) - btf = ERR_PTR(-ENOENT); - rcu_read_unlock(); - + btf = btf_get_by_id(id); if (IS_ERR(btf)) return PTR_ERR(btf); @@ -5738,6 +5746,11 @@ u32 btf_obj_id(const struct btf *btf) return btf->id; } +bool btf_is_kernel(const struct btf *btf) +{ + return btf->kernel_btf; +} + static int btf_id_cmp_func(const void *a, const void *b) { const int *pa = a, *pb = b; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 5ee00611af53..3af073642664 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1968,7 +1968,7 @@ static void bpf_prog_load_fixup_attach_type(union bpf_attr *attr) static int bpf_prog_load_check_attach(enum bpf_prog_type prog_type, enum bpf_attach_type expected_attach_type, - u32 btf_id, u32 prog_fd) + u32 btf_obj_id, u32 btf_id, u32 prog_fd) { if (btf_id) { if (btf_id > BTF_MAX_TYPE) @@ -1985,6 +1985,9 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type, } } + if (btf_obj_id && (!btf_id || prog_fd)) + return -EINVAL; + if (prog_fd && prog_type != BPF_PROG_TYPE_TRACING && prog_type != BPF_PROG_TYPE_EXT) return -EINVAL; @@ -2097,7 +2100,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type) } /* last field in 'union bpf_attr' used by this command */ -#define BPF_PROG_LOAD_LAST_FIELD attach_prog_fd +#define BPF_PROG_LOAD_LAST_FIELD attach_btf_obj_id static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -2146,6 +2149,7 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) bpf_prog_load_fixup_attach_type(attr); if (bpf_prog_load_check_attach(type, attr->expected_attach_type, + attr->attach_btf_obj_id, attr->attach_btf_id, attr->attach_prog_fd)) return -EINVAL; @@ -2158,7 +2162,19 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr) prog->expected_attach_type = attr->expected_attach_type; prog->aux->attach_btf_id = attr->attach_btf_id; - if (attr->attach_btf_id && !attr->attach_prog_fd) { + if (attr->attach_btf_obj_id) { + struct btf *btf; + + btf = btf_get_by_id(attr->attach_btf_obj_id); + if (IS_ERR(btf)) + return PTR_ERR(btf); + if (!btf_is_kernel(btf)) { + btf_put(btf); + return -EINVAL; + } + + prog->aux->attach_btf = btf; + } else if (attr->attach_btf_id && !attr->attach_prog_fd) { struct btf *btf; btf = bpf_get_btf_vmlinux(); diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index c3458ec1f30a..60b95b51ccb8 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -558,6 +558,7 @@ union bpf_attr { __u32 line_info_cnt; /* number of bpf_line_info records */ __u32 attach_btf_id; /* in-kernel BTF type id to attach to */ __u32 attach_prog_fd; /* 0 to attach to vmlinux */ + __u32 attach_btf_obj_id; /* vmlinux/module BTF object ID for BTF type */ }; struct { /* anonymous struct used by BPF_OBJ_* commands */
Add ability for user-space programs to specify non-vmlinux BTF when attaching BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this, add attach_btf_obj_id field which contains BTF object ID for either vmlinux or module. For backwards compatibility (and simplicity) reasons, 0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> --- include/linux/btf.h | 2 ++ include/uapi/linux/bpf.h | 1 + kernel/bpf/btf.c | 25 +++++++++++++++++++------ kernel/bpf/syscall.c | 22 +++++++++++++++++++--- tools/include/uapi/linux/bpf.h | 1 + 5 files changed, 42 insertions(+), 9 deletions(-)