[v3,0/2] bpftool: improve error handing for missing .BTF section

Message ID 20221217223509.88254-1-changbin.du@gmail.com

Message

Changbin Du Dec. 17, 2022, 10:35 p.m. UTC
Display an error message when the ".BTF" section is missing, and clean up
the empty vmlinux.h file.

v3:
 - fix typo and make error message consistent. (Andrii Nakryiko)
 - split out perf change.
v2:
 - remove vmlinux specific error info.
 - use builtin target .DELETE_ON_ERROR: to delete empty vmlinux.h


Changbin Du (2):
  libbpf: show error info about missing ".BTF" section
  bpf: makefiles: do not generate empty vmlinux.h

 tools/bpf/bpftool/Makefile           | 3 +++
 tools/lib/bpf/btf.c                  | 1 +
 tools/testing/selftests/bpf/Makefile | 3 +++
 3 files changed, 7 insertions(+)

Comments

Leo Yan Dec. 19, 2022, 3:45 a.m. UTC | #1
Hi Changbin,

On Sun, Dec 18, 2022 at 06:35:08AM +0800, Changbin Du wrote:
> Show the real problem instead of just saying "No such file or directory".
> 
> Now will print below info:
> libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux

I recently ran into the same issue. It can be caused by either pahole
not being installed or the kernel configuration option
CONFIG_DEBUG_INFO_BTF not being enabled.

Could we give explicit information about the reason for the failure?  For example:

"libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux,
please install pahole and enable CONFIG_DEBUG_INFO_BTF=y for kernel building".

> Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory

This log is confusing when the vmlinux file can be found but has no BTF
section.  Consider using a separate patch to detect the case where
vmlinux is not found, and print "No such file or directory" only then?

Thanks,
Leo

> Signed-off-by: Changbin Du <changbin.du@gmail.com>
> ---
>  tools/lib/bpf/btf.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 71e165b09ed5..dd2badf1a54e 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>  	err = 0;
>  
>  	if (!btf_data) {
> +		pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
>  		err = -ENOENT;
>  		goto done;
>  	}
> -- 
> 2.37.2
>
Changbin Du Dec. 20, 2022, 1:31 a.m. UTC | #2
On Mon, Dec 19, 2022 at 11:45:08AM +0800, Leo Yan wrote:
> Hi Changbin,
> 
> On Sun, Dec 18, 2022 at 06:35:08AM +0800, Changbin Du wrote:
> > Show the real problem instead of just saying "No such file or directory".
> > 
> > Now will print below info:
> > libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux
> 
> Recently I encountered the same issue, it could be caused by:
> either missing to install tool pahole or missing to enable kernel
> configuration CONFIG_DEBUG_INFO_BTF.
> 
> Could we give explict info for reasoning failure?  Like:
> 
> "libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux,
> please install pahole and enable CONFIG_DEBUG_INFO_BTF=y for kernel building".
>
This information is specific to vmlinux, and similar hints were removed
in v2 of the patch. libbpf is generic and handles all ELF files.

> > Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory
> 
> This log is confusing when we can find vmlinux file but without BTF
> section.  Consider to use a separate patch to detect vmlinux not
> found case and print out "No such file or directory"?
>
I think it's already there. If the file doesn't exist, open will fail.

> Thanks,
> Leo
> 
> > Signed-off-by: Changbin Du <changbin.du@gmail.com>
> > ---
> >  tools/lib/bpf/btf.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 71e165b09ed5..dd2badf1a54e 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> >  	err = 0;
> >  
> >  	if (!btf_data) {
> > +		pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> >  		err = -ENOENT;
> >  		goto done;
> >  	}
> > -- 
> > 2.37.2
> >
Leo Yan Dec. 20, 2022, 11:34 a.m. UTC | #3
On Tue, Dec 20, 2022 at 09:31:14AM +0800, Changbin Du wrote:

[...]

> > > Now will print below info:
> > > libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux
> > 
> > Recently I encountered the same issue, it could be caused by:
> > either missing to install tool pahole or missing to enable kernel
> > configuration CONFIG_DEBUG_INFO_BTF.
> > 
> > Could we give explict info for reasoning failure?  Like:
> > 
> > "libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux,
> > please install pahole and enable CONFIG_DEBUG_INFO_BTF=y for kernel building".
> >
> This is vmlinux special information and similar tips are removed from
> patch V2. libbpf is common for all ELFs.

Okay, I see.  Sorry for the noise.

> > > Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory
> > 
> > This log is confusing when we can find vmlinux file but without BTF
> > section.  Consider to use a separate patch to detect vmlinux not
> > found case and print out "No such file or directory"?
> >
> I think it's already there. If the file doesn't exist, open will fail.

[...]

> > > @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> > >  	err = 0;
> > >  
> > >  	if (!btf_data) {
> > > +		pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> > >  		err = -ENOENT;

btf_parse_elf() returns -ENOENT when the ELF file doesn't contain a BTF
section; therefore, bpftool prints the error string "No such file or
directory".  This is confusing, because vmlinux actually exists.

I am wondering if we can use the error -LIBBPF_ERRNO__FORMAT (or any
better choice?) to replace -ENOENT here; this would keep bpftool from
printing "No such file or directory" in this case.

Thanks,
Leo
Andrii Nakryiko Dec. 21, 2022, 12:13 a.m. UTC | #4
On Tue, Dec 20, 2022 at 3:34 AM Leo Yan <leo.yan@linaro.org> wrote:
>
> On Tue, Dec 20, 2022 at 09:31:14AM +0800, Changbin Du wrote:
>
> [...]
>
> > > > Now will print below info:
> > > > libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux
> > >
> > > Recently I encountered the same issue, it could be caused by:
> > > either missing to install tool pahole or missing to enable kernel
> > > configuration CONFIG_DEBUG_INFO_BTF.
> > >
> > > Could we give explict info for reasoning failure?  Like:
> > >
> > > "libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux,
> > > please install pahole and enable CONFIG_DEBUG_INFO_BTF=y for kernel building".
> > >
> > This is vmlinux special information and similar tips are removed from
> > patch V2. libbpf is common for all ELFs.
>
> Okay, I see.  Sorry for noise.
>
> > > > Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory
> > >
> > > This log is confusing when we can find vmlinux file but without BTF
> > > section.  Consider to use a separate patch to detect vmlinux not
> > > found case and print out "No such file or directory"?
> > >
> > I think it's already there. If the file doesn't exist, open will fail.
>
> [...]
>
> > > > @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> > > >   err = 0;
> > > >
> > > >   if (!btf_data) {
> > > > +         pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> > > >           err = -ENOENT;
>
> btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
> section, therefore, bpftool dumps error string "No such file or
> directory".  It's confused that actually vmlinux is existed.
>
> I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
> better choice?) to replace -ENOENT at here, this can avoid bpftool to
> outputs "No such file or directory" in this case.

The only really meaningful error code would be -ESRCH, which
strerror() will translate to "No such process", which is also
completely confusing.

In general, I have always found these strerror() messages extremely
unhelpful and confusing. I wonder if we should make an effort to
actually emit the symbolic names of errors instead (literally, "-ENOENT"
in this case). This is all tooling for engineers; I find -ENOENT or
-ESRCH much more meaningful as an error message than the seemingly
human-readable "No such file" interpretation.

Quentin, what do you think about the above proposal for bpftool? We
could have an internal libbpf helper, use it in libbpf error
messages as well, and just reuse the logic in bpftool, perhaps?


Anyways, I've applied this patch set to bpf-next. Thanks.


>
> Thanks,
> Leo
patchwork-bot+netdevbpf@kernel.org Dec. 21, 2022, 12:20 a.m. UTC | #5
Hello:

This series was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Sun, 18 Dec 2022 06:35:07 +0800 you wrote:
> Display error message for missing ".BTF" section and clean up empty
> vmlinux.h file.
> 
> v3:
>  - fix typo and make error message consistent. (Andrii Nakryiko)
>  - split out perf change.
> v2:
>  - remove vmlinux specific error info.
>  - use builtin target .DELETE_ON_ERROR: to delete empty vmlinux.h
> 
> [...]

Here is the summary with links:
  - [v3,1/2] libbpf: show error info about missing ".BTF" section
    https://git.kernel.org/bpf/bpf-next/c/e6b4e1d759d3
  - [v3,2/2] bpf: makefiles: do not generate empty vmlinux.h
    https://git.kernel.org/bpf/bpf-next/c/e7f0d5cdd023

You are awesome, thank you!
Leo Yan Dec. 21, 2022, 3:55 a.m. UTC | #6
On Tue, Dec 20, 2022 at 04:13:13PM -0800, Andrii Nakryiko wrote:

[...]

> > > > > @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> > > > >   err = 0;
> > > > >
> > > > >   if (!btf_data) {
> > > > > +         pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> > > > >           err = -ENOENT;
> >
> > btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
> > section, therefore, bpftool dumps error string "No such file or
> > directory".  It's confused that actually vmlinux is existed.
> >
> > I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
> > better choice?) to replace -ENOENT at here, this can avoid bpftool to
> > outputs "No such file or directory" in this case.
> 
> The only really meaningful error code would be -ESRCH, which
> strerror() will translate to "No such process", which is also
> completely confusing.

Or maybe -ENODATA (No data available) is a better choice?

Thanks,
Leo

> In general, I always found these strerror() messages extremely
> unhelpful and confusing. I wonder if we should make an effort to
> actually emit symbolic names of errors instead (literally, "-ENOENT"
> in this case). This is all tooling for engineers, I find -ENOENT or
> -ESRCH much more meaningful as an error message, compared to "No such
> file" seemingly human-readable interpretation.
> 
> Quenting, what do you think about the above proposal for bpftool? We
> can have some libbpf helper internally and do it in libbpf error
> messages as well and just reuse the logic in bpftool, perhaps?
> 
> Anyways, I've applied this patch set to bpf-next. Thanks.
Andrii Nakryiko Dec. 22, 2022, 6:51 p.m. UTC | #7
On Tue, Dec 20, 2022 at 7:55 PM Leo Yan <leo.yan@linaro.org> wrote:
>
> On Tue, Dec 20, 2022 at 04:13:13PM -0800, Andrii Nakryiko wrote:
>
> [...]
>
> > > > > > @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> > > > > >   err = 0;
> > > > > >
> > > > > >   if (!btf_data) {
> > > > > > +         pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> > > > > >           err = -ENOENT;
> > >
> > > btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
> > > section, therefore, bpftool dumps error string "No such file or
> > > directory".  It's confused that actually vmlinux is existed.
> > >
> > > I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
> > > better choice?) to replace -ENOENT at here, this can avoid bpftool to
> > > outputs "No such file or directory" in this case.
> >
> > The only really meaningful error code would be -ESRCH, which
> > strerror() will translate to "No such process", which is also
> > completely confusing.
>
> Or maybe -ENODATA (No data available) is a better choice?

-ENODATA sounds good to me, yep.

>
> Thanks,
> Leo
>
> > In general, I always found these strerror() messages extremely
> > unhelpful and confusing. I wonder if we should make an effort to
> > actually emit symbolic names of errors instead (literally, "-ENOENT"
> > in this case). This is all tooling for engineers, I find -ENOENT or
> > -ESRCH much more meaningful as an error message, compared to "No such
> > file" seemingly human-readable interpretation.
> >
> > Quenting, what do you think about the above proposal for bpftool? We
> > can have some libbpf helper internally and do it in libbpf error
> > messages as well and just reuse the logic in bpftool, perhaps?
> >
> > Anyways, I've applied this patch set to bpf-next. Thanks.
Changbin Du Dec. 30, 2022, 12:10 p.m. UTC | #8
On Wed, Dec 21, 2022 at 11:55:24AM +0800, Leo Yan wrote:
> On Tue, Dec 20, 2022 at 04:13:13PM -0800, Andrii Nakryiko wrote:
> 
> [...]
> 
> > > > > > @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> > > > > >   err = 0;
> > > > > >
> > > > > >   if (!btf_data) {
> > > > > > +         pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> > > > > >           err = -ENOENT;
> > >
> > > btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
> > > section, therefore, bpftool dumps error string "No such file or
> > > directory".  It's confused that actually vmlinux is existed.
> > >
> > > I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
> > > better choice?) to replace -ENOENT at here, this can avoid bpftool to
> > > outputs "No such file or directory" in this case.
> > 
> > The only really meaningful error code would be -ESRCH, which
> > strerror() will translate to "No such process", which is also
> > completely confusing.
> 
> Or maybe -ENODATA (No data available) is a better choice?
> 
> Thanks,
> Leo
>
Yan, will you have a patch for this suggestion?

[snip]
Leo Yan Dec. 30, 2022, 12:28 p.m. UTC | #9
On Fri, Dec 30, 2022 at 08:10:20PM +0800, Changbin Du wrote:
> On Wed, Dec 21, 2022 at 11:55:24AM +0800, Leo Yan wrote:
> > On Tue, Dec 20, 2022 at 04:13:13PM -0800, Andrii Nakryiko wrote:
> > 
> > [...]
> > 
> > > > > > > @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> > > > > > >   err = 0;
> > > > > > >
> > > > > > >   if (!btf_data) {
> > > > > > > +         pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> > > > > > >           err = -ENOENT;
> > > >
> > > > btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
> > > > section, therefore, bpftool dumps error string "No such file or
> > > > directory".  It's confused that actually vmlinux is existed.
> > > >
> > > > I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
> > > > better choice?) to replace -ENOENT at here, this can avoid bpftool to
> > > > outputs "No such file or directory" in this case.
> > > 
> > > The only really meaningful error code would be -ESRCH, which
> > > strerror() will translate to "No such process", which is also
> > > completely confusing.
> > 
> > Or maybe -ENODATA (No data available) is a better choice?
> > 
> > Thanks,
> > Leo
> >
> Yan, will you have a patch for this suggestion?

You are welcome to send a patch; otherwise, I can cook one up.

Thanks,
Leo
Quentin Monnet Jan. 3, 2023, 3:03 p.m. UTC | #10
2022-12-20 16:13 UTC-0800 ~ Andrii Nakryiko <andrii.nakryiko@gmail.com>
> On Tue, Dec 20, 2022 at 3:34 AM Leo Yan <leo.yan@linaro.org> wrote:
>>
>> On Tue, Dec 20, 2022 at 09:31:14AM +0800, Changbin Du wrote:
>>
>> [...]
>>
>>>>> Now will print below info:
>>>>> libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux
>>>>
>>>> Recently I encountered the same issue, it could be caused by:
>>>> either missing to install tool pahole or missing to enable kernel
>>>> configuration CONFIG_DEBUG_INFO_BTF.
>>>>
>>>> Could we give explict info for reasoning failure?  Like:
>>>>
>>>> "libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux,
>>>> please install pahole and enable CONFIG_DEBUG_INFO_BTF=y for kernel building".
>>>>
>>> This is vmlinux special information and similar tips are removed from
>>> patch V2. libbpf is common for all ELFs.
>>
>> Okay, I see.  Sorry for noise.
>>
>>>>> Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory
>>>>
>>>> This log is confusing when we can find vmlinux file but without BTF
>>>> section.  Consider to use a separate patch to detect vmlinux not
>>>> found case and print out "No such file or directory"?
>>>>
>>> I think it's already there. If the file doesn't exist, open will fail.
>>
>> [...]
>>
>>>>> @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>>>>>   err = 0;
>>>>>
>>>>>   if (!btf_data) {
>>>>> +         pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
>>>>>           err = -ENOENT;
>>
>> btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
>> section, therefore, bpftool dumps error string "No such file or
>> directory".  It's confused that actually vmlinux is existed.
>>
>> I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
>> better choice?) to replace -ENOENT at here, this can avoid bpftool to
>> outputs "No such file or directory" in this case.
> 
> The only really meaningful error code would be -ESRCH, which
> strerror() will translate to "No such process", which is also
> completely confusing.
> 
> In general, I always found these strerror() messages extremely
> unhelpful and confusing. I wonder if we should make an effort to
> actually emit symbolic names of errors instead (literally, "-ENOENT"
> in this case). This is all tooling for engineers, I find -ENOENT or
> -ESRCH much more meaningful as an error message, compared to "No such
> file" seemingly human-readable interpretation.
> 
> Quenting, what do you think about the above proposal for bpftool? We
> can have some libbpf helper internally and do it in libbpf error
> messages as well and just reuse the logic in bpftool, perhaps?

Apologies for the delay.
What you're proposing is to replace all messages currently looking like
this:

	$ bpftool prog
	Error: can't get next program: Operation not permitted

by:

	$ bpftool prog
	Error: can't get next program: -EPERM

Do I understand correctly?

I think the strerror() messages are helpful on some occasions (they
_are_ more human-friendly to many users), but it's also true that
they're not always precise. With bpftool, "Invalid argument" is a
classic when the program doesn't load, and may be confused with
the args passed to bpftool on the command line. Then there are the other
corner cases like the one discussed in this thread. So, why not.

If we do change, yeah I'd rather have as much of this handling in libbpf
itself, and then adjust bpftool to handle the remaining cases, for
consistency.

Quentin
Andrii Nakryiko Jan. 3, 2023, 11:46 p.m. UTC | #11
On Tue, Jan 3, 2023 at 7:03 AM Quentin Monnet <quentin@isovalent.com> wrote:
>
> 2022-12-20 16:13 UTC-0800 ~ Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > On Tue, Dec 20, 2022 at 3:34 AM Leo Yan <leo.yan@linaro.org> wrote:
> >>
> >> On Tue, Dec 20, 2022 at 09:31:14AM +0800, Changbin Du wrote:
> >>
> >> [...]
> >>
> >>>>> Now will print below info:
> >>>>> libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux
> >>>>
> >>>> Recently I encountered the same issue, it could be caused by:
> >>>> either missing to install tool pahole or missing to enable kernel
> >>>> configuration CONFIG_DEBUG_INFO_BTF.
> >>>>
> >>>> Could we give explict info for reasoning failure?  Like:
> >>>>
> >>>> "libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux,
> >>>> please install pahole and enable CONFIG_DEBUG_INFO_BTF=y for kernel building".
> >>>>
> >>> This is vmlinux special information and similar tips are removed from
> >>> patch V2. libbpf is common for all ELFs.
> >>
> >> Okay, I see.  Sorry for noise.
> >>
> >>>>> Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory
> >>>>
> >>>> This log is confusing when we can find vmlinux file but without BTF
> >>>> section.  Consider to use a separate patch to detect vmlinux not
> >>>> found case and print out "No such file or directory"?
> >>>>
> >>> I think it's already there. If the file doesn't exist, open will fail.
> >>
> >> [...]
> >>
> >>>>> @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> >>>>>   err = 0;
> >>>>>
> >>>>>   if (!btf_data) {
> >>>>> +         pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> >>>>>           err = -ENOENT;
> >>
> >> btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
> >> section, therefore, bpftool dumps error string "No such file or
> >> directory".  It's confused that actually vmlinux is existed.
> >>
> >> I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
> >> better choice?) to replace -ENOENT at here, this can avoid bpftool to
> >> outputs "No such file or directory" in this case.
> >
> > The only really meaningful error code would be -ESRCH, which
> > strerror() will translate to "No such process", which is also
> > completely confusing.
> >
> > In general, I always found these strerror() messages extremely
> > unhelpful and confusing. I wonder if we should make an effort to
> > actually emit symbolic names of errors instead (literally, "-ENOENT"
> > in this case). This is all tooling for engineers, I find -ENOENT or
> > -ESRCH much more meaningful as an error message, compared to "No such
> > file" seemingly human-readable interpretation.
> >
> > Quenting, what do you think about the above proposal for bpftool? We
> > can have some libbpf helper internally and do it in libbpf error
> > messages as well and just reuse the logic in bpftool, perhaps?
>
> Apologies for the delay.
> What you're proposing is to replace all messages currently looking like
> this:
>
>         $ bpftool prog
>         Error: can't get next program: Operation not permitted
>
> by:
>
>         $ bpftool prog
>         Error: can't get next program: -EPERM
>
> Do I understand correctly?

yep, that's what I had in mind

>
> I think the strerror() messages are helpful in some occasions (they
> _are_ more human-friendly to many users), but it's also true that
> they're not always precise. With bpftool, "Invalid argument" is a
> classic when the program doesn't load, and may lead to confusion with
> the args passed to bpftool on the command line. Then there are the other
> corner cases like the one discussed in this thread. So, why not.

Maybe the right approach would be to have both the symbolic error name
and its human-readable representation, so for the example above:

Error: can't get next program: [-EPERM] Operation not permitted

or something like that? And if error value is unknown, just keep it as
integer: "[-5555]" ?

>
> If we do change, yeah I'd rather have as much of this handling in libbpf
> itself, and then adjust bpftool to handle the remaining cases, for
> consistency.

We can teach libbpf_strerror_r() to do this, and if bpftool uses it
consistently, it would get the benefit automatically.

>
> Quentin
Quentin Monnet Jan. 5, 2023, 2:57 p.m. UTC | #12
2023-01-03 15:46 UTC-0800 ~ Andrii Nakryiko <andrii.nakryiko@gmail.com>
> On Tue, Jan 3, 2023 at 7:03 AM Quentin Monnet <quentin@isovalent.com> wrote:
>>
>> 2022-12-20 16:13 UTC-0800 ~ Andrii Nakryiko <andrii.nakryiko@gmail.com>
>>> On Tue, Dec 20, 2022 at 3:34 AM Leo Yan <leo.yan@linaro.org> wrote:
>>>>
>>>> On Tue, Dec 20, 2022 at 09:31:14AM +0800, Changbin Du wrote:
>>>>
>>>> [...]
>>>>
>>>>>>> Now will print below info:
>>>>>>> libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux
>>>>>>
>>>>>> Recently I encountered the same issue, it could be caused by:
>>>>>> either missing to install tool pahole or missing to enable kernel
>>>>>> configuration CONFIG_DEBUG_INFO_BTF.
>>>>>>
>>>>>> Could we give explict info for reasoning failure?  Like:
>>>>>>
>>>>>> "libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux,
>>>>>> please install pahole and enable CONFIG_DEBUG_INFO_BTF=y for kernel building".
>>>>>>
>>>>> This is vmlinux special information and similar tips are removed from
>>>>> patch V2. libbpf is common for all ELFs.
>>>>
>>>> Okay, I see.  Sorry for noise.
>>>>
>>>>>>> Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory
>>>>>>
>>>>>> This log is confusing when we can find vmlinux file but without BTF
>>>>>> section.  Consider to use a separate patch to detect vmlinux not
>>>>>> found case and print out "No such file or directory"?
>>>>>>
>>>>> I think it's already there. If the file doesn't exist, open will fail.
>>>>
>>>> [...]
>>>>
>>>>>>> @@ -990,6 +990,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>>>>>>>   err = 0;
>>>>>>>
>>>>>>>   if (!btf_data) {
>>>>>>> +         pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
>>>>>>>           err = -ENOENT;
>>>>
>>>> btf_parse_elf() returns -ENOENT when ELF file doesn't contain BTF
>>>> section, therefore, bpftool dumps error string "No such file or
>>>> directory".  It's confused that actually vmlinux is existed.
>>>>
>>>> I am wondering if we can use error -LIBBPF_ERRNO__FORMAT (or any
>>>> better choice?) to replace -ENOENT at here, this can avoid bpftool to
>>>> outputs "No such file or directory" in this case.
>>>
>>> The only really meaningful error code would be -ESRCH, which
>>> strerror() will translate to "No such process", which is also
>>> completely confusing.
>>>
>>> In general, I always found these strerror() messages extremely
>>> unhelpful and confusing. I wonder if we should make an effort to
>>> actually emit symbolic names of errors instead (literally, "-ENOENT"
>>> in this case). This is all tooling for engineers, I find -ENOENT or
>>> -ESRCH much more meaningful as an error message, compared to "No such
>>> file" seemingly human-readable interpretation.
>>>
>>> Quenting, what do you think about the above proposal for bpftool? We
>>> can have some libbpf helper internally and do it in libbpf error
>>> messages as well and just reuse the logic in bpftool, perhaps?
>>
>> Apologies for the delay.
>> What you're proposing is to replace all messages currently looking like
>> this:
>>
>>         $ bpftool prog
>>         Error: can't get next program: Operation not permitted
>>
>> by:
>>
>>         $ bpftool prog
>>         Error: can't get next program: -EPERM
>>
>> Do I understand correctly?
> 
> yep, that's what I had in mind
> 
>>
>> I think the strerror() messages are helpful in some occasions (they
>> _are_ more human-friendly to many users), but it's also true that
>> they're not always precise. With bpftool, "Invalid argument" is a
>> classic when the program doesn't load, and may lead to confusion with
>> the args passed to bpftool on the command line. Then there are the other
>> corner cases like the one discussed in this thread. So, why not.
> 
> maybe the right approach would be to have both symbolic error name and
> its human-readable representation, so for example above
> 
> Error: can't get next program: [-EPERM] Operation not permitted
> 
> or something like that? And if error value is unknown, just keep it as
> integer: "[-5555]" ?

That would be great, we'd have both the error name for savvy users and
the (more or less accurate) interpretation for others.

>> If we do change, yeah I'd rather have as much of this handling in libbpf
>> itself, and then adjust bpftool to handle the remaining cases, for
>> consistency.
> 
> we can teach libbpf_strerror_r() to do this and if bpftool is going to
> use it consistently then it would get the benefit automatically

Sounds good to me.