mbox series

[RFC,bpf-next,0/2] bpf, libbpf: share BTF data show functionality

Message ID 1610386373-24162-1-git-send-email-alan.maguire@oracle.com
Headers show
Series bpf, libbpf: share BTF data show functionality | expand

Message

Alan Maguire Jan. 11, 2021, 5:32 p.m. UTC
The BPF Type Format (BTF) can be used in conjunction with the helper
bpf_snprintf_btf() to display kernel data with type information.

This series generalizes that support and shares it with libbpf so
that libbpf can display typed data.  BTF display functionality is
factored out of kernel/bpf/btf.c into kernel/bpf/btf_show_common.c,
and that file is duplicated in tools/lib/bpf.  Similarly, common
definitions and inline functions needed for this support are
extracted into include/linux/btf_common.h and this header is again
duplicated in tools/lib/bpf.

Patch 1 carries out the refactoring, for which no kernel changes
are intended, and introduces btf__snprintf() a libbpf function
that supports dumping a string representation of typed data using
the struct btf * and id associated with that type.

Patch 2 tests btf__snprintf() with built-in and kernel types to
ensure data is of expected format.  The test closely mirrors
the BPF program associated with the snprintf_btf.c; in this case
however the string representations are verified in userspace rather
than in BPF program context.

Alan Maguire (2):
  bpf: share BTF "show" implementation between kernel and libbpf
  selftests/bpf: test libbpf-based type display

 include/linux/btf.h                                |  121 +-
 include/linux/btf_common.h                         |  286 +++++
 kernel/bpf/Makefile                                |    2 +-
 kernel/bpf/arraymap.c                              |    1 +
 kernel/bpf/bpf_struct_ops.c                        |    1 +
 kernel/bpf/btf.c                                   | 1215 +------------------
 kernel/bpf/btf_show_common.c                       | 1218 ++++++++++++++++++++
 kernel/bpf/core.c                                  |    1 +
 kernel/bpf/hashtab.c                               |    1 +
 kernel/bpf/local_storage.c                         |    1 +
 kernel/bpf/verifier.c                              |    1 +
 kernel/trace/bpf_trace.c                           |    1 +
 tools/lib/bpf/Build                                |    2 +-
 tools/lib/bpf/btf.h                                |    7 +
 tools/lib/bpf/btf_common.h                         |  286 +++++
 tools/lib/bpf/btf_show_common.c                    | 1218 ++++++++++++++++++++
 tools/lib/bpf/libbpf.map                           |    1 +
 .../selftests/bpf/prog_tests/snprintf_btf_user.c   |  192 +++
 18 files changed, 3236 insertions(+), 1319 deletions(-)
 create mode 100644 include/linux/btf_common.h
 create mode 100644 kernel/bpf/btf_show_common.c
 create mode 100644 tools/lib/bpf/btf_common.h
 create mode 100644 tools/lib/bpf/btf_show_common.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf_btf_user.c

Comments

Alexei Starovoitov Jan. 12, 2021, 2:45 a.m. UTC | #1
On Mon, Jan 11, 2021 at 05:32:51PM +0000, Alan Maguire wrote:
> The BPF Type Format (BTF) can be used in conjunction with the helper
> bpf_snprintf_btf() to display kernel data with type information.
> 
> This series generalizes that support and shares it with libbpf so
> that libbpf can display typed data.  BTF display functionality is
> factored out of kernel/bpf/btf.c into kernel/bpf/btf_show_common.c,
> and that file is duplicated in tools/lib/bpf.  Similarly, common
> definitions and inline functions needed for this support are
> extracted into include/linux/btf_common.h and this header is again
> duplicated in tools/lib/bpf.

I think duplication will inevitable cause code to diverge.
Could you keep a single copy?
Take a look at kernel/bpf/disasm.[ch]
It compiled once for the kernel and once for bpftool.
So should be possible to do something similar for this case as well?
Andrii Nakryiko Jan. 12, 2021, 7:48 a.m. UTC | #2
On Mon, Jan 11, 2021 at 9:34 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> libbpf already supports a "dumper" API for dumping type information,
> but there is currently no support for dumping typed _data_ via libbpf.
> However this functionality does exist in the kernel, in part to
> facilitate the bpf_snprintf_btf() helper which dumps a string
> representation of the pointer passed in utilizing the BTF type id
> of the data pointed to.  For example, the pair of a pointer to
> a "struct sk_buff" and the BTF type id of "struct sk_buff" can be
> used.
>
> Here the kernel code is generalized into btf_show_common.c.  For the
> most part, code is identical for userspace and kernel, beyond a few API
> differences and missing functions.  The only significant differences are
>
>  - the "safe copy" logic used by the kernel to ensure we do not induce a
>    crash during BPF operation; and
>  - the BTF seq file support that is kernel-only.
>
> The mechanics are to maintain identical btf_show_common.c files in
> kernel/bpf and tools/lib/bpf , and a common header btf_common.h in
> include/linux/ and tools/lib/bpf/.  This file duplication seems to
> be the common practice with duplication between kernel and tools/
> so it's the approach taken here.
>
> The common code approach could likely be explored further, but here
> the minimum common code required to support BTF show functionality is
> used.
>

I don't think this approach will work. libbpf and kernel have
considerably different restrictions and styles, I don't think it's
appropriate to take kernel code and try to fit it into libbpf almost
as is, with a bunch of #defines. It would be much cleaner, simpler,
and more maintainable to just re-implement core logic for libbpf, IMO.

> Currently the only "show" function for userspace is to write the
> representation of the typed data to a string via
>
> LIBBPF_API int
> btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj,
>               __u64 flags);
>
> ...but other approaches could be pursued including printf()-based
> show, or even a callback mechanism could be supported to allow
> user-defined show functions.
>

It's strange that you saw btf_dump APIs, and yet decided to go with
this API instead. snprintf() is not a natural "method" of struct btf.
Using char buffer as an output is overly restrictive and inconvenient.
It's appropriate for kernel and BPF program due to their restrictions,
but there is no need to cripple libbpf APIs for that. I think it
should follow btf_dump APIs with custom callback so that it's easy to
just printf() everything, but also user can create whatever elaborate
mechanism they need and that fits their use case.

Code reuse is not the ultimate goal, it should facilitate
maintainability, not harm it. There are times where sharing code
introduces unnecessary coupling and maintainability issues. And I
think this one is a very obvious case of that.

See below a few comments as well. But overall it's really hard to
review such a humongous patch, of course. So I so far just skimmed
through it.

> Here's an example usage, storing a string representation of
> struct sk_buff *skb in buf:
>
>         struct btf *btf = libbpf_find_kernel_btf();
>         char buf[8192];
>         __s32 skb_id;
>
>         skb_id = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);
>         if (skb_id < 0)
>                 fprintf(stderr, "no skbuff, err %d\n", skb_id);
>         else
>                 btf__snprintf(btf, buf, sizeof(buf), skb_id, skb, 0);
>
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  include/linux/btf.h             |  121 +---
>  include/linux/btf_common.h      |  286 +++++++++
>  kernel/bpf/Makefile             |    2 +-
>  kernel/bpf/arraymap.c           |    1 +
>  kernel/bpf/bpf_struct_ops.c     |    1 +
>  kernel/bpf/btf.c                | 1215 +-------------------------------------
>  kernel/bpf/btf_show_common.c    | 1218 +++++++++++++++++++++++++++++++++++++++
>  kernel/bpf/core.c               |    1 +
>  kernel/bpf/hashtab.c            |    1 +
>  kernel/bpf/local_storage.c      |    1 +
>  kernel/bpf/verifier.c           |    1 +
>  kernel/trace/bpf_trace.c        |    1 +
>  tools/lib/bpf/Build             |    2 +-
>  tools/lib/bpf/btf.h             |    7 +
>  tools/lib/bpf/btf_common.h      |  286 +++++++++
>  tools/lib/bpf/btf_show_common.c | 1218 +++++++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.map        |    1 +
>  17 files changed, 3044 insertions(+), 1319 deletions(-)
>  create mode 100644 include/linux/btf_common.h
>  create mode 100644 kernel/bpf/btf_show_common.c
>  create mode 100644 tools/lib/bpf/btf_common.h
>  create mode 100644 tools/lib/bpf/btf_show_common.c
>

[...]

> +/* For kernel u64 is long long unsigned int... */
> +#define FMT64          "ll"
> +
> +#else
> +/* ...while for userspace it is long unsigned int.  These definitions avoid
> + * format specifier warnings.
> + */

that's not true, it depends on the architecture

> +#define FMT64          "l"
> +
> +/* libbpf names differ slightly to in-kernel function names. */
> +#define btf_type_by_id         btf__type_by_id
> +#define btf_name_by_offset     btf__name_by_offset
> +#define btf_str_by_offset      btf__str_by_offset
> +#define btf_resolve_size       btf__resolve_size

ugh... good luck navigating the code in libbpf....

> +
> +#endif /* __KERNEL__ */
> +/*
> + * Options to control show behaviour.
> + *     - BTF_SHOW_COMPACT: no formatting around type information
> + *     - BTF_SHOW_NONAME: no struct/union member names/types
> + *     - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values;
> + *       equivalent to %px.
> + *     - BTF_SHOW_ZERO: show zero-valued struct/union members; they
> + *       are not displayed by default
> + *     - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read
> + *       data before displaying it.
> + */
> +#define BTF_SHOW_COMPACT       BTF_F_COMPACT
> +#define BTF_SHOW_NONAME                BTF_F_NONAME
> +#define BTF_SHOW_PTR_RAW       BTF_F_PTR_RAW
> +#define BTF_SHOW_ZERO          BTF_F_ZERO
> +#define BTF_SHOW_UNSAFE                (1ULL << 4)

this (or some subset of them) should be done as opts struct's bool
fields for libbpf

> +
> +/*
> + * Copy len bytes of string representation of obj of BTF type_id into buf.
> + *
> + * @btf: struct btf object
> + * @type_id: type id of type obj points to
> + * @obj: pointer to typed data
> + * @buf: buffer to write to
> + * @len: maximum length to write to buf
> + * @flags: show options (see above)
> + *
> + * Return: length that would have been/was copied as per snprintf, or
> + *        negative error.
> + */
> +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj,
> +                          char *buf, int len, u64 flags);
> +
> +#define for_each_member(i, struct_type, member)                        \
> +       for (i = 0, member = btf_type_member(struct_type);      \
> +            i < btf_type_vlen(struct_type);                    \
> +            i++, member++)
> +
> +#define for_each_vsi(i, datasec_type, member)                  \
> +       for (i = 0, member = btf_type_var_secinfo(datasec_type);        \
> +            i < btf_type_vlen(datasec_type);                   \
> +            i++, member++)
> +
> +static inline bool btf_type_is_ptr(const struct btf_type *t)
> +{
> +       return BTF_INFO_KIND(t->info) == BTF_KIND_PTR;
> +}
> +
> +static inline bool btf_type_is_int(const struct btf_type *t)
> +{
> +       return BTF_INFO_KIND(t->info) == BTF_KIND_INT;
> +}
> +
> +static inline bool btf_type_is_small_int(const struct btf_type *t)
> +{
> +       return btf_type_is_int(t) && t->size <= sizeof(u64);
> +}
> +
> +static inline bool btf_type_is_enum(const struct btf_type *t)
> +{
> +       return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM;
> +}
> +
> +static inline bool btf_type_is_typedef(const struct btf_type *t)
> +{
> +       return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF;
> +}
> +
> +static inline bool btf_type_is_func(const struct btf_type *t)
> +{
> +       return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC;
> +}
> +
> +static inline bool btf_type_is_func_proto(const struct btf_type *t)
> +{
> +       return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
> +}
> +
> +static inline bool btf_type_is_var(const struct btf_type *t)
> +{
> +       return BTF_INFO_KIND(t->info) == BTF_KIND_VAR;
> +}
> +
> +/* union is only a special case of struct:
> + * all its offsetof(member) == 0
> + */
> +static inline bool btf_type_is_struct(const struct btf_type *t)
> +{
> +       u8 kind = BTF_INFO_KIND(t->info);
> +
> +       return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;
> +}
> +
> +static inline bool btf_type_is_modifier(const struct btf_type *t)
> +{
> +       /* Some of them is not strictly a C modifier
> +        * but they are grouped into the same bucket
> +        * for BTF concern:
> +        *   A type (t) that refers to another
> +        *   type through t->type AND its size cannot
> +        *   be determined without following the t->type.
> +        *
> +        * ptr does not fall into this bucket
> +        * because its size is always sizeof(void *).
> +        */
> +       switch (BTF_INFO_KIND(t->info)) {
> +       case BTF_KIND_TYPEDEF:
> +       case BTF_KIND_VOLATILE:
> +       case BTF_KIND_CONST:
> +       case BTF_KIND_RESTRICT:
> +               return true;
> +       default:
> +               return false;
> +       }
> +}
> +
> +static inline
> +const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
> +                                              u32 id, u32 *res_id)
> +{
> +       const struct btf_type *t = btf_type_by_id(btf, id);
> +
> +       while (btf_type_is_modifier(t)) {
> +               id = t->type;
> +               t = btf_type_by_id(btf, t->type);
> +       }
> +
> +       if (res_id)
> +               *res_id = id;
> +
> +       return t;
> +}
> +
> +static inline u32 btf_type_int(const struct btf_type *t)
> +{
> +       return *(u32 *)(t + 1);
> +}
> +
> +static inline const struct btf_array *btf_type_array(const struct btf_type *t)
> +{
> +       return (const struct btf_array *)(t + 1);
> +}
> +
> +static inline const struct btf_enum *btf_type_enum(const struct btf_type *t)
> +{
> +       return (const struct btf_enum *)(t + 1);
> +}
> +
> +static inline const struct btf_var *btf_type_var(const struct btf_type *t)
> +{
> +       return (const struct btf_var *)(t + 1);
> +}
> +
> +static inline u16 btf_type_vlen(const struct btf_type *t)
> +{
> +       return BTF_INFO_VLEN(t->info);
> +}
> +
> +static inline u16 btf_func_linkage(const struct btf_type *t)
> +{
> +       return BTF_INFO_VLEN(t->info);
> +}
> +
> +/* size can be used */
> +static inline bool btf_type_has_size(const struct btf_type *t)
> +{
> +       switch (BTF_INFO_KIND(t->info)) {
> +       case BTF_KIND_INT:
> +       case BTF_KIND_STRUCT:
> +       case BTF_KIND_UNION:
> +       case BTF_KIND_ENUM:
> +       case BTF_KIND_DATASEC:
> +               return true;
> +       default:
> +               return false;
> +       }
> +}
> +
> +static inline const struct btf_member *btf_type_member(const struct btf_type *t)
> +{
> +       return (const struct btf_member *)(t + 1);
> +}
> +
> +static inline const struct btf_var_secinfo *btf_type_var_secinfo(
> +               const struct btf_type *t)
> +{
> +       return (const struct btf_var_secinfo *)(t + 1);
> +}
> +
> +static inline const char *__btf_name_by_offset(const struct btf *btf,
> +                                              u32 offset)
> +{
> +       const char *name;
> +
> +       if (!offset)
> +               return "(anon)";
> +
> +       name = btf_str_by_offset(btf, offset);
> +       return name ?: "(invalid-name-offset)";
> +}
> +

(almost?) all of the above helpers are already defined in libbpf's
btf.h, no need to add all this duplication

> +/* functions shared between btf.c and btf_show_common.c */
> +void btf_type_ops_show(const struct btf *btf, const struct btf_type *t,
> +                      __u32 type_id, void *obj, u8 bits_offset,
> +                      struct btf_show *show);

[...]

> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 1c0fd2d..35bd9dc 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -346,6 +346,7 @@ LIBBPF_0.3.0 {
>                 btf__parse_split;
>                 btf__new_empty_split;
>                 btf__new_split;
> +               btf__snprintf;

It's LIBBPF_0.4.0 already, I or someone else should send a patch
adding a new section in .map file.

>                 ring_buffer__epoll_fd;
>                 xsk_setup_xdp_prog;
>                 xsk_socket__update_xskmap;

> --
> 1.8.3.1
>
Alan Maguire Jan. 14, 2021, 3:37 p.m. UTC | #3
On Mon, 11 Jan 2021, Andrii Nakryiko wrote:

> On Mon, Jan 11, 2021 at 9:34 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> > Currently the only "show" function for userspace is to write the
> > representation of the typed data to a string via
> >
> > LIBBPF_API int
> > btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj,
> >               __u64 flags);
> >
> > ...but other approaches could be pursued including printf()-based
> > show, or even a callback mechanism could be supported to allow
> > user-defined show functions.
> >
> 
> It's strange that you saw btf_dump APIs, and yet decided to go with
> this API instead. snprintf() is not a natural "method" of struct btf.
> Using char buffer as an output is overly restrictive and inconvenient.
> It's appropriate for kernel and BPF program due to their restrictions,
> but there is no need to cripple libbpf APIs for that. I think it
> should follow btf_dump APIs with custom callback so that it's easy to
> just printf() everything, but also user can create whatever elaborate
> mechanism they need and that fits their use case.
> 
> Code reuse is not the ultimate goal, it should facilitate
> maintainability, not harm it. There are times where sharing code
> introduces unnecessary coupling and maintainability issues. And I
> think this one is a very obvious case of that.
> 

Okay, so I've been exploring adding dumper API support.  The initial
approach I've been using is to provide an API like this:

/* match show flags for bpf_show_snprintf() */
enum {
        BTF_DUMP_F_COMPACT      =       (1ULL << 0),
        BTF_DUMP_F_NONAME       =       (1ULL << 1),
        BTF_DUMP_F_ZERO         =       (1ULL << 3),
};

struct btf_dump_emit_type_data_opts {
        /* size of this struct, for forward/backward compatibility */
        size_t sz;
        void *data;
        int indent_level;
        __u64 flags;
};
#define btf_dump_emit_type_data_opts__last_field flags

LIBBPF_API int
btf_dump__emit_type_data(struct btf_dump *d, __u32 id,
                         const struct btf_dump_emit_type_data_opts *opts);


...so the opts play a similiar role to the struct btf_ptr + flags
in bpf_snprintf_btf.  I've got this working, but the current 
implementation is tied to emitting the same C-based syntax as 
bpf_snprintf_btf(); though of course the printf function is invoked.
So a use case looks something like this:

        struct btf_dump_emit_type_data_opts opts;
        char skbufmem[1024], skbufstr[8192];
        struct btf *btf = libbpf_find_kernel_btf();
        struct btf_dump *d;
        __s32 skbid;
        int indent = 0;

        memset(skbufmem, 0xff, sizeof(skbufmem));
        opts.data = skbufmem;
        opts.sz = sizeof(opts);
        opts.indent_level = indent;

        d = btf_dump__new(btf, NULL, NULL, printffn);

        skbid = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);
        if (skbid < 0) {
                fprintf(stderr, "no skbuff, err %d\n", skbid);
                exit(1);
        }

        btf_dump__emit_type_data(d, skbid, &opts);


..and we get output of the form

(struct sk_buff){
 (union){
  (struct){
   .next = (struct sk_buff *)0xffffffffffffffff,
   .prev = (struct sk_buff *)0xffffffffffffffff,
   (union){
    .dev = (struct net_device *)0xffffffffffffffff,
    .dev_scratch = (long unsigned int)18446744073709551615,
   },
  },
...

etc.  However it would be nice to find a way to help printf function
providers emit different formats such as JSON without having to
parse the data they are provided in the printf function.
That would remove the need for the output flags, since the printf
function provider could control display.

If we provided an option to provider a "kind" printf function,
and ensured that the BTF dumper sets a "kind" prior to each
_internal_ call to the printf function, we could use that info
to adapt output in various ways.  For example, consider the case
where we want to emit C-type output.  We can use the kind
info to control output for various scenarios:

void c_dump_kind_printf(struct btf_dump *d, enum btf_dump_kind kind,
			void *ctx, const char *fmt, va_list args)
{	
	switch (kind) {
	case BTF_DUMP_KIND_TYPE_NAME:
		/* For C, add brackets around the type name string ( ) */
		btf_dump__printf(d, "(");
		btf_dump__vprintf(d, fmt, args);
		btf_dump__printf(d, ")");
		break;
	case BTF_DUMP_KIND_MEMBER_NAME:
		/* for C, prefix a "." to member name, suffix a "=" */
		btf_dump__printf(d, ".");
		btf_dump__vprintf(d, fmt, args);
		btf_dump__printf(d, " = ");
		break;
	...

Whenever we internally call btf_dump_kind_printf() - and have
a kind printf function - it is invoked, and once it's added formatting
it invokes the printf function.  So there are two layers of callbacks

- the kind callback determines what we print based on the kinds
  of objects provided (type names, member names, type data, etc); and
- the printf callback determines _how_ we print (e.g. to a file, stdout,
  etc).

The above suggests we'd need to add btf_dump__*printf() functions.

This might allow us to refactor bpftool such that the
type traversal code lived in libbpf, while the specifics of
how that info is to be dumped live in bpftool.  We'd probably
need to provide a C-style kind dumper out of the box in libbpf
as a default mechanism.

What do you think?

Alan
Andrii Nakryiko Jan. 15, 2021, 3:51 a.m. UTC | #4
On Thu, Jan 14, 2021 at 7:37 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>

> On Mon, 11 Jan 2021, Andrii Nakryiko wrote:

>

> > On Mon, Jan 11, 2021 at 9:34 AM Alan Maguire <alan.maguire@oracle.com> wrote:

> > > Currently the only "show" function for userspace is to write the

> > > representation of the typed data to a string via

> > >

> > > LIBBPF_API int

> > > btf__snprintf(struct btf *btf, char *buf, int len, __u32 id, void *obj,

> > >               __u64 flags);

> > >

> > > ...but other approaches could be pursued including printf()-based

> > > show, or even a callback mechanism could be supported to allow

> > > user-defined show functions.

> > >

> >

> > It's strange that you saw btf_dump APIs, and yet decided to go with

> > this API instead. snprintf() is not a natural "method" of struct btf.

> > Using char buffer as an output is overly restrictive and inconvenient.

> > It's appropriate for kernel and BPF program due to their restrictions,

> > but there is no need to cripple libbpf APIs for that. I think it

> > should follow btf_dump APIs with custom callback so that it's easy to

> > just printf() everything, but also user can create whatever elaborate

> > mechanism they need and that fits their use case.

> >

> > Code reuse is not the ultimate goal, it should facilitate

> > maintainability, not harm it. There are times where sharing code

> > introduces unnecessary coupling and maintainability issues. And I

> > think this one is a very obvious case of that.

> >

>

> Okay, so I've been exploring adding dumper API support.  The initial

> approach I've been using is to provide an API like this:

>

> /* match show flags for bpf_show_snprintf() */

> enum {

>         BTF_DUMP_F_COMPACT      =       (1ULL << 0),

>         BTF_DUMP_F_NONAME       =       (1ULL << 1),

>         BTF_DUMP_F_ZERO         =       (1ULL << 3),

> };

>


I'd use bool fields instead, we are not constrained with extensibility
of this, no need for opaque "flags" field.

> struct btf_dump_emit_type_data_opts {

>         /* size of this struct, for forward/backward compatibility */

>         size_t sz;

>         void *data;


data is not optional, so should be moved out and be a direct argument
to btf_dump__emit_type_data()

>         int indent_level;

>         __u64 flags;

> };

> #define btf_dump_emit_type_data_opts__last_field flags

>

> LIBBPF_API int

> btf_dump__emit_type_data(struct btf_dump *d, __u32 id,

>                          const struct btf_dump_emit_type_data_opts *opts);

>


yes, this is something more like what I had in mind

>

> ...so the opts play a similiar role to the struct btf_ptr + flags

> in bpf_snprintf_btf.  I've got this working, but the current

> implementation is tied to emitting the same C-based syntax as

> bpf_snprintf_btf(); though of course the printf function is invoked.

> So a use case looks something like this:

>

>         struct btf_dump_emit_type_data_opts opts;

>         char skbufmem[1024], skbufstr[8192];

>         struct btf *btf = libbpf_find_kernel_btf();

>         struct btf_dump *d;

>         __s32 skbid;

>         int indent = 0;

>

>         memset(skbufmem, 0xff, sizeof(skbufmem));

>         opts.data = skbufmem;

>         opts.sz = sizeof(opts);

>         opts.indent_level = indent;

>

>         d = btf_dump__new(btf, NULL, NULL, printffn);

>

>         skbid = btf__find_by_name_kind(btf, "sk_buff", BTF_KIND_STRUCT);

>         if (skbid < 0) {

>                 fprintf(stderr, "no skbuff, err %d\n", skbid);

>                 exit(1);

>         }

>

>         btf_dump__emit_type_data(d, skbid, &opts);

>

>

> ..and we get output of the form

>

> (struct sk_buff){

>  (union){

>   (struct){

>    .next = (struct sk_buff *)0xffffffffffffffff,

>    .prev = (struct sk_buff *)0xffffffffffffffff,

>    (union){

>     .dev = (struct net_device *)0xffffffffffffffff,

>     .dev_scratch = (long unsigned int)18446744073709551615,

>    },

>   },

> ...

>

> etc.  However it would be nice to find a way to help printf function

> providers emit different formats such as JSON without having to

> parse the data they are provided in the printf function.

> That would remove the need for the output flags, since the printf

> function provider could control display.


I might have missed the stated goal for the work you are doing with
these changes, but in my mind it's mostly debugging/information dump
of some captured data, for human consumption. I'm very skeptical about
trying to generalize it to support JSON and other "structured"
formats. Humans won't be reading JSON when they have the ability to
look at human-readable C-like syntax. For any other application where
they'd want more structured representation (e.g., if they want to
filter, aggregate, etc), it's not really hard to implement similar
(but tailored to the application's needs) logic just given a raw data
dump and BTF information. Luckily, BTF and C types are simple enough
to do this quite effortlessly.

So I'm all for doing a text dump APIs (similar to how BTF-to-C dumping
API works), but against designing it for JSON and other formats.

>

> If we provided an option to provider a "kind" printf function,

> and ensured that the BTF dumper sets a "kind" prior to each

> _internal_ call to the printf function, we could use that info

> to adapt output in various ways.  For example, consider the case

> where we want to emit C-type output.  We can use the kind

> info to control output for various scenarios:

>

> void c_dump_kind_printf(struct btf_dump *d, enum btf_dump_kind kind,

>                         void *ctx, const char *fmt, va_list args)

> {

>         switch (kind) {

>         case BTF_DUMP_KIND_TYPE_NAME:

>                 /* For C, add brackets around the type name string ( ) */

>                 btf_dump__printf(d, "(");

>                 btf_dump__vprintf(d, fmt, args);

>                 btf_dump__printf(d, ")");

>                 break;

>         case BTF_DUMP_KIND_MEMBER_NAME:

>                 /* for C, prefix a "." to member name, suffix a "=" */

>                 btf_dump__printf(d, ".");

>                 btf_dump__vprintf(d, fmt, args);

>                 btf_dump__printf(d, " = ");

>                 break;

>         ...


Curious, when you are going to dump an array, you'll have separate
enums for start of array, start of array element, end of array
element, end of array, etc? It feels a bit like re-inventing
high-level semantics of the C type system, which BTF is already doing
(in a different way, of course). Which is why I'm saying having BTF
and raw bytes dump seems to be a more appropriate approach for more
sophisticated applications that need to understand data, not just
pretty-print it.

>

> Whenever we internally call btf_dump_kind_printf() - and have

> a kind printf function - it is invoked, and once it's added formatting

> it invokes the printf function.  So there are two layers of callbacks

>

> - the kind callback determines what we print based on the kinds

>   of objects provided (type names, member names, type data, etc); and

> - the printf callback determines _how_ we print (e.g. to a file, stdout,

>   etc).

>

> The above suggests we'd need to add btf_dump__*printf() functions.

>

> This might allow us to refactor bpftool such that the

> type traversal code lived in libbpf, while the specifics of

> how that info is to be dumped live in bpftool.  We'd probably

> need to provide a C-style kind dumper out of the box in libbpf

> as a default mechanism.

>

> What do you think?

>

> Alan