Message ID | f369c4e4-2c2b-4eae-3353-05cb7282af24@gmail.com |
---|---|
State | New |
Headers | show |
On 2016.12.15 at 19:27 -0700, Martin Sebor wrote: > On 12/14/2016 09:19 PM, Jeff Law wrote: > > On 12/14/2016 03:56 PM, Martin Sebor wrote: > > > The -Wnonnull warning improvement (PR c/17308 - nonnull attribute > > > not as useful as it could be) causes a couple of false positives > > > in a powerpc64le bootstrap. The attached fix suppresses them. > > > It passes bootstrap on powerpc64le but tests are still running. > > > > > > I couldn't reproduce the bootstrap comparison failure that Segher > > > mentioned in his comment on the bug. I've gone over the patch > > > again to see if I could spot what could possibly be behind it > > > but couldn't really see anything. Jeff, you mentioned attribute > > > nonnull is exploited by the null pointer optimization. Do you > > > think it could possibly have this effect in GCC? > > It's certainly possible. It would be an indicator that something in GCC > > is passing a NULL pointer into a routine where it shouldn't at runtime > > and that GCC is using its new knowledge to optimize the code assuming > > that can't happen. > > > > ie, it's an indicator of a latent bug in GCC. Depending on the > > difficulty in tracking down, you may need to revert the change. This is > > exactly the kind of scenario I was concerned about in that approval > > message. > > I couldn't reproduce the comparison error either on powerpc64-linux > or on powerpc64le-linux. > > > The change to the vec<T, va_heap, vl_ptr> is OK. Can you please verify > > that if you install just that change that ppc bootstraps are restored > > and if so, please install. > > Done. > > A profiledbootstrap exposed a few more instances of the enhanced > -Wnonnull warning. I spent some time analyzing them and decided > that three of them are justified (those in gengtype-parse.c and > gengtype.c) and the other three false positives. > > The attached patch silences them. > > The gengtype code alternates between checking for null pointers > and assuming that the same pointers are not null (and passing nulls > to nonnull functions like libiberty's lbasename). It could probably > benefit from some more cleanup but I'm out of cycles for that. > > There are two outstanding instances of -Wnonnull in the profiled- > bootstrap log that both point to the same function that I haven't > figured out yet: > > In function ‘void check_function_format(tree, int, tree_node**)’: > cc1plus: warning: argument 1 null where non-null expected [-Wnonnull] > /usr/include/string.h:398:15: note: in a call to function ‘size_t > strlen(const char*)’ declared here > > I'll deal with them next but I want to get this patch reviewed > and approved so bootstrap can be restored on the targets affected > by the vec.h warning. > > While waiting for builds to finish I also built Binutils, Busybox, > and the Linux kernel to see if the warning shows up there. It does > not. It does for me with an allmodconf. At -O2 I get three warnings, and at -O3 I get two additional warnings. Now these additional ones happen way too deep into the pipeline to be reliable. (For a reduced testcase see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78817#c8) When you as the author of the warning have difficulties in figuring out what causes these warnings, what about the average user? Therefore -fsanitize=undefined should be preferred. It gives you better info and a backtrace that makes diagnosis easy. So in my opinion -Wnonnull should warn only for trivial cases. -- Markus
On 12/16/2016 01:13 AM, Markus Trippelsdorf wrote: > On 2016.12.15 at 19:27 -0700, Martin Sebor wrote: >> On 12/14/2016 09:19 PM, Jeff Law wrote: >>> On 12/14/2016 03:56 PM, Martin Sebor wrote: >>>> The -Wnonnull warning improvement (PR c/17308 - nonnull attribute >>>> not as useful as it could be) causes a couple of false positives >>>> in a powerpc64le bootstrap. The attached fix suppresses them. >>>> It passes bootstrap on powerpc64le but tests are still running. >>>> >>>> I couldn't reproduce the bootstrap comparison failure that Segher >>>> mentioned in his comment on the bug. I've gone over the patch >>>> again to see if I could spot what could possibly be behind it >>>> but couldn't really see anything. Jeff, you mentioned attribute >>>> nonnull is exploited by the null pointer optimization. Do you >>>> think it could possibly have this effect in GCC? >>> It's certainly possible. It would be an indicator that something in GCC >>> is passing a NULL pointer into a routine where it shouldn't at runtime >>> and that GCC is using its new knowledge to optimize the code assuming >>> that can't happen. >>> >>> ie, it's an indicator of a latent bug in GCC. Depending on the >>> difficulty in tracking down, you may need to revert the change. This is >>> exactly the kind of scenario I was concerned about in that approval >>> message. >> >> I couldn't reproduce the comparison error either on powerpc64-linux >> or on powerpc64le-linux. >> >>> The change to the vec<T, va_heap, vl_ptr> is OK. Can you please verify >>> that if you install just that change that ppc bootstraps are restored >>> and if so, please install. >> >> Done. >> >> A profiledbootstrap exposed a few more instances of the enhanced >> -Wnonnull warning. I spent some time analyzing them and decided >> that three of them are justified (those in gengtype-parse.c and >> gengtype.c) and the other three false positives. >> >> The attached patch silences them. >> >> The gengtype code alternates between checking for null pointers >> and assuming that the same pointers are not null (and passing nulls >> to nonnull functions like libiberty's lbasename). It could probably >> benefit from some more cleanup but I'm out of cycles for that. >> >> There are two outstanding instances of -Wnonnull in the profiled- >> bootstrap log that both point to the same function that I haven't >> figured out yet: >> >> In function ‘void check_function_format(tree, int, tree_node**)’: >> cc1plus: warning: argument 1 null where non-null expected [-Wnonnull] >> /usr/include/string.h:398:15: note: in a call to function ‘size_t >> strlen(const char*)’ declared here >> >> I'll deal with them next but I want to get this patch reviewed >> and approved so bootstrap can be restored on the targets affected >> by the vec.h warning. >> >> While waiting for builds to finish I also built Binutils, Busybox, >> and the Linux kernel to see if the warning shows up there. It does >> not. > > It does for me with an allmodconf. At -O2 I get three warnings, and at > -O3 I get two additional warnings. Now these additional ones happen way > too deep into the pipeline to be reliable. (For a reduced testcase see: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78817#c8) The warning looks quite justified in this case. The first call to sm_read_sector exits with the pointer being null and so the second call is diagnosed. That seems like a great example of when the warning is useful. > > When you as the author of the warning have difficulties in figuring out > what causes these warnings, what about the average user? I haven't encountered such difficulties. The instance I pointed to above in function check_function_format suggests there is a bug that prevents GCC from pointing to the line that triggers warning (it reads "cc1plus: warning: argument 1 null"). I just haven't had time to understand what causes it. > Therefore -fsanitize=undefined should be preferred. It gives you better > info and a backtrace that makes diagnosis easy. > > So in my opinion -Wnonnull should warn only for trivial cases. The sanitizer is a useful and more reliable tool but it has a number of limitations that put it off limits for many projects (runtime instrumentation and cost, may not be available for embedded systems, relies on running the code and exercising all paths to find bugs). Warnings have their own limitations (false positives and negatives) but they are useful as the first line of defense that doesn't suffer from the sanitizer limitations. They help catch mistakes before they get committed into the code base and affect other programmers. That's by far the cheapest way to find bugs. (Sorry if I'm stating the obvious.) The -Wnonnull implementation in GCC 6 and prior only warns on null pointer constants, like in memset(NULL, 0, n). It's exceedingly unlikely that someone will write such code by mistake, and so the warning is substantially ineffective. That's why several users over the last decade have requested it be enhanced. To be honest, I'm surprised that this is proving controversial. This is not something new. It's an improvement to an existing warning that people have asked for. It's bound to have some false positives but so far the rate is far lower than that of almost all the other warnings on the list I posted across the four projects I built. I don't claim it can't be improved but it seems pretty good as it is already. Among the 6 instances it's found in GCC three look like real bugs. Martin
On Fri, Dec 16, 2016 at 09:36:25AM -0700, Martin Sebor wrote: > > It does for me with an allmodconf. At -O2 I get three warnings, and at > > -O3 I get two additional warnings. Now these additional ones happen way > > too deep into the pipeline to be reliable. (For a reduced testcase see: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78817#c8) > > The warning looks quite justified in this case. The first call > to sm_read_sector exits with the pointer being null and so the > second call is diagnosed. That seems like a great example of > when the warning is useful. No. The first call to sm_read_sector just doesn't exit. So it is warning about dead code. > I don't claim it can't be improved but it seems pretty good as > it is already. Among the 6 instances it's found in GCC three > look like real bugs. None look like real bugs to me. Jakub
On Fri, Dec 16, 2016 at 10:10:00AM -0700, Martin Sebor wrote: > > No. The first call to sm_read_sector just doesn't exit. So it is warning > > about dead code. > > If the code is dead then GCC should eliminate it. With it eliminated There is (especially with jump threading, but not limited to that, other optimizations may result in that too), code that even very smart optimizing compiler isn't able to prove that is dead. > the warning would be gone. The warning isn't smart and it doesn't > try to be. It only works with what GCC gives it. In this case the > dump shows that GCC thinks the code is reachable. If it isn't that > would seem to highlight a missed optimization opportunity, not a need > to make the warning smarter than the optimizer. No, it highlights that the warning is done in a wrong place where it suffers from too many false positives. > > None look like real bugs to me. > > They do to me. There are calls in gengtype.c to a function decorated > with attribute nonnull (lbasename) that pass to it a pointer that's > potentially null. For example below. get_input_file_name returns Most pointers passed to functions with nonnull attributes are, from the compiler POV, potentially NULL. Usually the compiler just can't prove it can't be non-NULL, it is an exception if it can. If you have a generic function that sometimes can be called with NULL (or some other "failure" argument) and in that case return NULL, and for valid arguments it guarantees returning non-NULL (and IMHO the gengtype functions we talk about fall into that category), then it is not a bug to use this function and pass the result to functions with nonnull arguments if the programmer knows it just will not happen. Forcing the programmer to workaround dubious warnings by throwing in not very portable attributes everywhere and duplicating functions, so that one copy can be returns_nonnull and requiring non-NULL argument and another copy allow NULL argument and allowing NULL returns is just a nightmare we IMHO don't want to throw at users. If they want to do it, sure, they can, but we shouldn't force them into that. We also don't warn about division by zero and similar UB after warning about the obvious cases of those in the FE, such warnings would have similarly huge false positive rate and again can be handled by -fsanitize=undefined very well. Jakub
On 12/16/2016 10:27 AM, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 10:10:00AM -0700, Martin Sebor wrote: >>> No. The first call to sm_read_sector just doesn't exit. So it is warning >>> about dead code. >> >> If the code is dead then GCC should eliminate it. With it eliminated > > There is (especially with jump threading, but not limited to that, other > optimizations may result in that too), code that even very smart optimizing > compiler isn't able to prove that is dead. > >> the warning would be gone. The warning isn't smart and it doesn't >> try to be. It only works with what GCC gives it. In this case the >> dump shows that GCC thinks the code is reachable. If it isn't that >> would seem to highlight a missed optimization opportunity, not a need >> to make the warning smarter than the optimizer. > > No, it highlights that the warning is done in a wrong place where it suffers > from too many false positives. Asserting a claim without providing any data or evidence doesn't make it true. We have seen fewer than 10 instances of it in just one out of four sizable projects. At least three of these instances are debatable (as evidenced by the ongoing debate). That can hardly be interpreted as "too many false positives." But I'm not at all attached to where to implement it and I'm fully aware that you have a much better idea than me what's the appropriate place for it. I'll be just as happy if it's done in a pass of its own just as long as it gives users what they've been asking for. >>> None look like real bugs to me. >> >> They do to me. There are calls in gengtype.c to a function decorated >> with attribute nonnull (lbasename) that pass to it a pointer that's >> potentially null. For example below. get_input_file_name returns > > Most pointers passed to functions with nonnull attributes are, from the > compiler POV, potentially NULL. Usually the compiler just can't prove it > can't be non-NULL, it is an exception if it can. > If you have a generic function that sometimes can be called with NULL (or > some other "failure" argument) and in that case return NULL, and for valid > arguments it guarantees returning non-NULL (and IMHO the gengtype functions > we talk about fall into that category), then it is not a bug to use this > function and pass the result to functions with nonnull arguments if the > programmer knows it just will not happen. Forcing the programmer to > workaround dubious warnings by throwing in not very portable attributes > everywhere and duplicating functions, so that one copy can be > returns_nonnull and requiring non-NULL argument and another copy allow > NULL argument and allowing NULL returns is just a nightmare we IMHO don't > want to throw at users. If they want to do it, sure, they can, but we > shouldn't force them into that. I'm not entirely sure I follow your argument here, and not just because I find your propensity for exaggeration and for pejoratives distracting. It seems to me that we simply have different views of what's a true positive and what's not, and what's useful and what's not. That's likely to be the case for many others and I think it might help to acknowledge that neither one of us is necessarily 100% right or wrong. In that spirit let me reiterate that I don't insist on the warning being implemented in any particular area of the compiler, or even that it necessarily diagnose any of the cases we've been discussing. If your implementation diagnoses the common cases exercised by the tests I wrote I have no objection to it (not that objecting would do me any good). Martin
On Fri, Dec 16, 2016 at 11:07:29AM -0700, Martin Sebor wrote: > If your implementation diagnoses the common cases exercised by the > tests I wrote I have no objection to it (not that objecting would > do me any good). Your tests all pass with the patch I've posted, we can surely add further ones if needed. Anyway, I think I've posted what I wanted to say on the topic, so let's wait for others to decide what we want to do, somebody has to ack either of the patches (or some other, or request reversion). Jakub
On 2016.12.16 at 11:07 -0700, Martin Sebor wrote: > On 12/16/2016 10:27 AM, Jakub Jelinek wrote: > > On Fri, Dec 16, 2016 at 10:10:00AM -0700, Martin Sebor wrote: > > > > No. The first call to sm_read_sector just doesn't exit. So it is warning > > > > about dead code. > > > > > > If the code is dead then GCC should eliminate it. With it eliminated > > > > There is (especially with jump threading, but not limited to that, other > > optimizations may result in that too), code that even very smart optimizing > > compiler isn't able to prove that is dead. > > > > > the warning would be gone. The warning isn't smart and it doesn't > > > try to be. It only works with what GCC gives it. In this case the > > > dump shows that GCC thinks the code is reachable. If it isn't that > > > would seem to highlight a missed optimization opportunity, not a need > > > to make the warning smarter than the optimizer. > > > > No, it highlights that the warning is done in a wrong place where it suffers > > from too many false positives. > > Asserting a claim without providing any data or evidence doesn't make > it true. We have seen fewer than 10 instances of it in just one out > of four sizable projects. At least three of these instances are > debatable (as evidenced by the ongoing debate). That can hardly be > interpreted as "too many false positives." So, to be fair a gave Jakub's patch a try and it has exactly the same issues for the Linux kernel: sometimes the warning only triggers with -O3, e.g.: % cat sm_ftl.i int a; void mtd_read_oob(int); void sm_read_sector(int *buffer) { __builtin_memset(buffer, 0, 1); mtd_read_oob(a); } void sm_get_zone() { sm_read_sector(0); } % gcc -c -Wnonnull -O2 sm_ftl.i % gcc -c -Wnonnull -O3 sm_ftl.i sm_ftl.i: In function ‘sm_get_zone’: sm_ftl.i:4:3: warning: argument 1 null where non-null expected [-Wnonnull] __builtin_memset(buffer, 0, 1); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sm_ftl.i:4:3: note: in a call to built-in function ‘__builtin_memset’ Also, Jakub's patch doesn't catch the following case: (tools/perf/util/trace-event-info.c from Linux) //... dir = opendir(path); if (!dir) { err = -errno; pr_debug("can't read directory '%s'", path); goto out; } //... other stuff out: closedir(dir); Whereas Martin's does. -- Markus
On Fri, Dec 16, 2016 at 07:29:13PM +0100, Markus Trippelsdorf wrote: > So, to be fair a gave Jakub's patch a try and it has exactly the same > issues for the Linux kernel: sometimes the warning only triggers with > -O3, e.g.: > > % cat sm_ftl.i > int a; > void mtd_read_oob(int); > void sm_read_sector(int *buffer) { > __builtin_memset(buffer, 0, 1); > mtd_read_oob(a); > } > void sm_get_zone() { sm_read_sector(0); } In this case I think the warning is right, if you ever call sm_get_zone, it will always invoke UB. > % gcc -c -Wnonnull -O2 sm_ftl.i > % gcc -c -Wnonnull -O3 sm_ftl.i > sm_ftl.i: In function ‘sm_get_zone’: > sm_ftl.i:4:3: warning: argument 1 null where non-null expected [-Wnonnull] > __builtin_memset(buffer, 0, 1); > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > sm_ftl.i:4:3: note: in a call to built-in function ‘__builtin_memset’ > > Also, Jakub's patch doesn't catch the following case: > > (tools/perf/util/trace-event-info.c from Linux) > //... > dir = opendir(path); > if (!dir) { > err = -errno; > pr_debug("can't read directory '%s'", path); > goto out; > } > //... other stuff > out: > closedir(dir); > > Whereas Martin's does. This is where you rely on jump threading to duplicate closedir, otherwise you don't warn. If you don't jump thread it (say there are a few more statements after closedir that make it not worth it), then you don't warn even late. If the warning in calls.c is guarded with -Wmaybe-nonnull instead of -Wnonnull, not included in -Wall, I have no objections. Then users can select the amount of warnings and false positives. Or -Wnonnull can have 3 levels, one warn in the FE only, then after inlining in the second level and late in the 3rd level. Jakub
On Fri, 16 Dec 2016, Martin Sebor wrote: > I don't claim it can't be improved but it seems pretty good as > it is already. Among the 6 instances it's found in GCC three > look like real bugs. FWIW it's found at least one real bug in glibc <https://sourceware.org/bugzilla/show_bug.cgi?id=20978> - that's a case where strlen is called on a pointer that can never be non-null if the strlen call is reached. (I don't know if there are other cases in glibc, whether genuine bugs or false positives, that would appear later in the build once that bug is fixed.) -- Joseph S. Myers joseph@codesourcery.com
On Fri, Dec 16, 2016 at 07:00:06PM +0000, Joseph Myers wrote: > On Fri, 16 Dec 2016, Martin Sebor wrote: > > > I don't claim it can't be improved but it seems pretty good as > > it is already. Among the 6 instances it's found in GCC three > > look like real bugs. > > FWIW it's found at least one real bug in glibc > <https://sourceware.org/bugzilla/show_bug.cgi?id=20978> - that's a case > where strlen is called on a pointer that can never be non-null if the > strlen call is reached. (I don't know if there are other cases in glibc, > whether genuine bugs or false positives, that would appear later in the > build once that bug is fixed.) Thanks. Reduced to something like: int foo (const char *name) { if (name) return 6; return __builtin_strlen (name); } This is warned about both with Martin's late warning and my after ccp2 warning version. We should include it in gcc testsuite. Jakub
On 12/16/2016 12:08 PM, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 07:00:06PM +0000, Joseph Myers wrote: >> On Fri, 16 Dec 2016, Martin Sebor wrote: >> >>> I don't claim it can't be improved but it seems pretty good as >>> it is already. Among the 6 instances it's found in GCC three >>> look like real bugs. >> >> FWIW it's found at least one real bug in glibc >> <https://sourceware.org/bugzilla/show_bug.cgi?id=20978> - that's a case >> where strlen is called on a pointer that can never be non-null if the >> strlen call is reached. (I don't know if there are other cases in glibc, >> whether genuine bugs or false positives, that would appear later in the >> build once that bug is fixed.) > > Thanks. Reduced to something like: > int > foo (const char *name) > { > if (name) > return 6; > return __builtin_strlen (name); > } > This is warned about both with Martin's late warning and my after ccp2 > warning version. We should include it in gcc testsuite. Agreed. jeff
On 12/16/2016 12:08 PM, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 07:00:06PM +0000, Joseph Myers wrote: >> On Fri, 16 Dec 2016, Martin Sebor wrote: >> >>> I don't claim it can't be improved but it seems pretty good as >>> it is already. Among the 6 instances it's found in GCC three >>> look like real bugs. >> >> FWIW it's found at least one real bug in glibc >> <https://sourceware.org/bugzilla/show_bug.cgi?id=20978> - that's a case >> where strlen is called on a pointer that can never be non-null if the >> strlen call is reached. (I don't know if there are other cases in glibc, >> whether genuine bugs or false positives, that would appear later in the >> build once that bug is fixed.) > > Thanks. Reduced to something like: > int > foo (const char *name) > { > if (name) > return 6; > return __builtin_strlen (name); > } > This is warned about both with Martin's late warning and my after ccp2 > warning version. We should include it in gcc testsuite. I'll note this is an example of a case where Andrew's work would likely help because it allows us to ask for a range of name_XX at the return statement and it'll give us back a range that is constrained by the path(s) which reach the return statement. Contrast to the current VRP work where each SSA_NAME has a range, but that range must be valid for every context in which that SSA_NAME appears. jeff
On 12/16/2016 11:40 AM, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 07:29:13PM +0100, Markus Trippelsdorf wrote: >> So, to be fair a gave Jakub's patch a try and it has exactly the same >> issues for the Linux kernel: sometimes the warning only triggers with >> -O3, e.g.: >> >> % cat sm_ftl.i >> int a; >> void mtd_read_oob(int); >> void sm_read_sector(int *buffer) { >> __builtin_memset(buffer, 0, 1); >> mtd_read_oob(a); >> } >> void sm_get_zone() { sm_read_sector(0); } > > In this case I think the warning is right, if you ever call sm_get_zone, > it will always invoke UB. Agreed. > >> % gcc -c -Wnonnull -O2 sm_ftl.i >> % gcc -c -Wnonnull -O3 sm_ftl.i >> sm_ftl.i: In function ‘sm_get_zone’: >> sm_ftl.i:4:3: warning: argument 1 null where non-null expected [-Wnonnull] >> __builtin_memset(buffer, 0, 1); >> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> sm_ftl.i:4:3: note: in a call to built-in function ‘__builtin_memset’ >> >> Also, Jakub's patch doesn't catch the following case: >> >> (tools/perf/util/trace-event-info.c from Linux) >> //... >> dir = opendir(path); >> if (!dir) { >> err = -errno; >> pr_debug("can't read directory '%s'", path); >> goto out; >> } >> //... other stuff >> out: >> closedir(dir); >> >> Whereas Martin's does. > > This is where you rely on jump threading to duplicate closedir, > otherwise you don't warn. If you don't jump thread it (say there are > a few more statements after closedir that make it not worth it), then you > don't warn even late. Right. This is an inherent issue with path isolation (which is performed by jump threading and other passes). Path isolation can easily expose constants that were previously hidden with two paths joined. Propagation of those constants can then trigger this kind of warning. And I personally think that is a good thing. It's precisely these kind of path specific failures that I want GCC to be catching -- because it's often hard for humans to see the error, particularly if there's a significant amount of visual space between the key points in the code. And yes, it's lame that additional code before the closedir may hide the warning. That's an artifact of the cost/benefit of path isolation. But that already happens with a slew of our other warnings. In an ideal world we'd have a stronger static analyzer that followed all the paths, but that's not where we are today. I don't see that this kind of scenario inherently affecting the decision one way or the other -- it's more about frequency of false positives and mitigation in the false positive case in my mind. jeff
On Fri, Dec 16, 2016 at 01:01:13PM -0700, Jeff Law wrote: > > Thanks. Reduced to something like: > > int > > foo (const char *name) > > { > > if (name) > > return 6; > > return __builtin_strlen (name); > > } > > This is warned about both with Martin's late warning and my after ccp2 > > warning version. We should include it in gcc testsuite. > I'll note this is an example of a case where Andrew's work would likely help > because it allows us to ask for a range of name_XX at the return statement > and it'll give us back a range that is constrained by the path(s) which > reach the return statement. > > Contrast to the current VRP work where each SSA_NAME has a range, but that > range must be valid for every context in which that SSA_NAME appears. Well, inside the current VRP it has separate ranges for the different paths and actually replaces the name in the strlen argument with NULL during evrp, so doesn't suffer from the current VRP limitations. Anyway, let's consider the warning from the linux kernel with the closedir, I guess it can be simplified to something along the lines of: void baz (char *) __attribute__((nonnull)); char *bar (int); int foo (void) { char *p = bar (1); int ret = 0; if (p == 0) { bar (2); bar (3); bar (4); ret = 1; goto out; } bar (5); bar (6); bar (7); bar (8); out: baz (p); if (ret) bar (10); return ret; } Here we jump thread it and with Martin's warning position warn, with my patch don't warn. But if the if (ret) bar (10); is removed, the code has the same problem that on the error path p will be NULL, but it is not going to be diagnosed. For -Wmaybe-nonnull we could e.g. look at if the argument is a PHI that has NULL at any of the edges; but that doesn't cover the above case, because p has just one def and so there will be no PHIs. And yet another testcase: void foo (const char *) __attribute__((nonnull)); void bar (int x) { foo (x ? (const char *) 0 : ""); } Here we warn only with GCC 6 or with my patch, but not with current trunk, so this case is a warning regression (if needed, I can split my patch into smaller chunks, the part that addresses this is keeping the warning in the FE and setting TREE_NO_WARNING if it warned, instead of not warning when optimize in the FE at all). Handling COND_EXPR in the argument might be more -Wmaybe-nonnull thing though. It isn't if you reach this source location, there will be always UB, but if you reach this source location along this and this path (and only careful analysis can discover if that path is actually possible at runtime), then there will be UB. Kind like we have -Wuninitialized and -Wmaybe-uninitialized. Jakub
On 2016.12.16 at 18:27 +0100, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 10:10:00AM -0700, Martin Sebor wrote: > > > No. The first call to sm_read_sector just doesn't exit. So it is warning > > > about dead code. > > > > If the code is dead then GCC should eliminate it. With it eliminated > > There is (especially with jump threading, but not limited to that, other > optimizations may result in that too), code that even very smart optimizing > compiler isn't able to prove that is dead. > > > the warning would be gone. The warning isn't smart and it doesn't > > try to be. It only works with what GCC gives it. In this case the > > dump shows that GCC thinks the code is reachable. If it isn't that > > would seem to highlight a missed optimization opportunity, not a need > > to make the warning smarter than the optimizer. > > No, it highlights that the warning is done in a wrong place where it suffers > from too many false positives. Another issue with Martin's patch is that it adds many false positives when one uses -fsanitize=undefined, e.g.: % cat test.ii struct A { char *msg; A(const char *); }; A::A(const char *p1) { msg = new char[__builtin_strlen(p1) + 1]; __builtin_strcpy(msg, p1); } % g++ -Wall -O2 -c test.ii % g++ -Wall -fsanitize=undefined -O2 -c test.ii test.ii: In constructor ‘A::A(const char*)’: test.ii:6:34: warning: argument 1 null where non-null expected [-Wnonnull] msg = new char[__builtin_strlen(p1) + 1]; ~~~~~~~~~~~~~~~~^~~~ test.ii:6:34: note: in a call to built-in function ‘long unsigned int __builtin_strlen(const char*)’ -- Markus
On 12/17/2016 01:01 PM, Markus Trippelsdorf wrote: > On 2016.12.16 at 18:27 +0100, Jakub Jelinek wrote: >> On Fri, Dec 16, 2016 at 10:10:00AM -0700, Martin Sebor wrote: >>>> No. The first call to sm_read_sector just doesn't exit. So it is warning >>>> about dead code. >>> >>> If the code is dead then GCC should eliminate it. With it eliminated >> >> There is (especially with jump threading, but not limited to that, other >> optimizations may result in that too), code that even very smart optimizing >> compiler isn't able to prove that is dead. >> >>> the warning would be gone. The warning isn't smart and it doesn't >>> try to be. It only works with what GCC gives it. In this case the >>> dump shows that GCC thinks the code is reachable. If it isn't that >>> would seem to highlight a missed optimization opportunity, not a need >>> to make the warning smarter than the optimizer. >> >> No, it highlights that the warning is done in a wrong place where it suffers >> from too many false positives. > > Another issue with Martin's patch is that it adds many false positives > when one uses -fsanitize=undefined, e.g.: > > % cat test.ii > struct A { > char *msg; > A(const char *); > }; > A::A(const char *p1) { > msg = new char[__builtin_strlen(p1) + 1]; > __builtin_strcpy(msg, p1); > } > > % g++ -Wall -O2 -c test.ii > % g++ -Wall -fsanitize=undefined -O2 -c test.ii > test.ii: In constructor ‘A::A(const char*)’: > test.ii:6:34: warning: argument 1 null where non-null expected [-Wnonnull] > msg = new char[__builtin_strlen(p1) + 1]; > ~~~~~~~~~~~~~~~~^~~~ > test.ii:6:34: note: in a call to built-in function ‘long unsigned int __builtin_strlen(const char*)’ I agree that these warnings should probably not be issued, though it's interesting to see where they come from. The calls are in the code emitted by GCC, are reachable, and end up taking place with the right Ubsan runtime recovery options. It turns out that Ubsan transforms calls to nonnull functions into conditional branches testing the argument for null, like so: if (s == 0) __builtin___ubsan_handle_nonnull_arg(); n = strlen (s); and GCC then transforms those into if (s == 0) { __builtin___ubsan_handle_nonnull_arg(); n = strlen (NULL); } When the ubsan_handle_nonnull_arg function returns to the caller the call to strlen(NULL) is made. This doesn't happen when -fno-sanitize-recover=undefined is used when compiling the file because then Ubsan inserts calls to __builtin___ubsan_handle_nonnull_arg_abort instead which is declared noreturn. It would be easy to suppress these warnings when -fsantize=undefined is used. Distinguishing the Ubsan-inserted calls from those in the source code would be more challenging. Implementing the warning as a pass that runs before the Ubsan pass gets around the problem. Martin
On Sat, Dec 17, 2016 at 02:55:15PM -0700, Martin Sebor wrote: > I agree that these warnings should probably not be issued, though > it's interesting to see where they come from. The calls are in > the code emitted by GCC, are reachable, and end up taking place > with the right Ubsan runtime recovery options. It turns out that > Ubsan transforms calls to nonnull functions into conditional > branches testing the argument for null, like so: > > if (s == 0) > __builtin___ubsan_handle_nonnull_arg(); > n = strlen (s); > > and GCC then transforms those into > > if (s == 0) > { > __builtin___ubsan_handle_nonnull_arg(); > n = strlen (NULL); > } > > When the ubsan_handle_nonnull_arg function returns to the caller > the call to strlen(NULL) is made. > > This doesn't happen when -fno-sanitize-recover=undefined is used > when compiling the file because then Ubsan inserts calls to > __builtin___ubsan_handle_nonnull_arg_abort instead which is declared > noreturn. > > It would be easy to suppress these warnings when -fsantize=undefined > is used. Distinguishing the Ubsan-inserted calls from those in the > source code would be more challenging. Implementing the warning as > a pass that runs before the Ubsan pass gets around the problem. The ubsan pass runs before IPA, so not sure how do you want to do that (and it is needed to run it early). One question is if we should perform path isolation in this case at all, since the branches to __builtin___ubsan_handle_nonnull_arg are with PROB_VERY_UNLIKELY, so the question is if we should be growing the paths which is really cold. It has some small benefit (removing one cheap comparison from the hot path), but the grows cold block. Another thing is that what the compiler does can very well just happen in some generic function that is called by the function that calls these strlen/strcpy etc. functions (fns with nonnull attribute). BTW, in the testcase from the Linux kernel it is also path isolation for error recovery path, something that ought to be predicted unlikely (though, probably not very unlikely unlike this case), so the question is if we want to isolate that or not too. Jakub
On 12/19/2016 02:42 AM, Jakub Jelinek wrote: > On Sat, Dec 17, 2016 at 02:55:15PM -0700, Martin Sebor wrote: >> I agree that these warnings should probably not be issued, though >> it's interesting to see where they come from. The calls are in >> the code emitted by GCC, are reachable, and end up taking place >> with the right Ubsan runtime recovery options. It turns out that >> Ubsan transforms calls to nonnull functions into conditional >> branches testing the argument for null, like so: >> >> if (s == 0) >> __builtin___ubsan_handle_nonnull_arg(); >> n = strlen (s); >> >> and GCC then transforms those into >> >> if (s == 0) >> { >> __builtin___ubsan_handle_nonnull_arg(); >> n = strlen (NULL); >> } >> >> When the ubsan_handle_nonnull_arg function returns to the caller >> the call to strlen(NULL) is made. >> >> This doesn't happen when -fno-sanitize-recover=undefined is used >> when compiling the file because then Ubsan inserts calls to >> __builtin___ubsan_handle_nonnull_arg_abort instead which is declared >> noreturn. >> >> It would be easy to suppress these warnings when -fsantize=undefined >> is used. Distinguishing the Ubsan-inserted calls from those in the >> source code would be more challenging. Implementing the warning as >> a pass that runs before the Ubsan pass gets around the problem. > > The ubsan pass runs before IPA, so not sure how do you want to do that > (and it is needed to run it early). > > One question is if we should perform path isolation in this case at all, > since the branches to __builtin___ubsan_handle_nonnull_arg are with > PROB_VERY_UNLIKELY, so the question is if we should be growing the > paths which is really cold. It has some small benefit (removing one > cheap comparison from the hot path), but the grows cold block. > > Another thing is that what the compiler does can very well just happen > in some generic function that is called by the function that calls these > strlen/strcpy etc. functions (fns with nonnull attribute). For the string built-ins (though perhaps not for user-defined nonnull functions), I wonder if it would make sense to have Ubsan emit an if (p == null) ubsan_handle_nonnull; else strfunc(p); to let the program continue instead of almost certainly crash right after the handler returns. That would solve the warning problem and also achieve the goal of allowing the handler to return and uncovering subsequent instances of undefined behavior. > BTW, in the testcase from the Linux kernel it is also path isolation > for error recovery path, something that ought to be predicted unlikely > (though, probably not very unlikely unlike this case), so the question is > if we want to isolate that or not too. I don't expect to have the cycles to do significant work on this before stage 3 ends. For GCC 7, to avoid the Ubsan warnings, the late -Wnonnull warnings could simply be suppressed when -fsanitize=undefined is used. It wouldn't be ideal but it would be no worse than what GCC does today. In 8 the warning could be made smarter to avoid the problem in general. Martin
On Mon, Dec 19, 2016 at 08:52:44AM -0700, Martin Sebor wrote: > > Another thing is that what the compiler does can very well just happen > > in some generic function that is called by the function that calls these > > strlen/strcpy etc. functions (fns with nonnull attribute). > > For the string built-ins (though perhaps not for user-defined > nonnull functions), I wonder if it would make sense to have Ubsan > emit an if (p == null) ubsan_handle_nonnull; else strfunc(p); to That would be just weird, have one behavior for selected subset of functions and another for the rest? Ugh. > > BTW, in the testcase from the Linux kernel it is also path isolation > > for error recovery path, something that ought to be predicted unlikely > > (though, probably not very unlikely unlike this case), so the question is > > if we want to isolate that or not too. > > I don't expect to have the cycles to do significant work on this > before stage 3 ends. For GCC 7, to avoid the Ubsan warnings, > the late -Wnonnull warnings could simply be suppressed when > -fsanitize=undefined is used. It wouldn't be ideal but it would > be no worse than what GCC does today. In 8 the warning could be > made smarter to avoid the problem in general. Or apply the patch I've posted which doesn't suffer from this problem, or revert the -Wnonnull changes and resolve somehow in GCC 8. Jakub
On 12/19/2016 09:17 AM, Jakub Jelinek wrote: > On Mon, Dec 19, 2016 at 08:52:44AM -0700, Martin Sebor wrote: >>> Another thing is that what the compiler does can very well just happen >>> in some generic function that is called by the function that calls these >>> strlen/strcpy etc. functions (fns with nonnull attribute). >> >> For the string built-ins (though perhaps not for user-defined >> nonnull functions), I wonder if it would make sense to have Ubsan >> emit an if (p == null) ubsan_handle_nonnull; else strfunc(p); to > > That would be just weird, have one behavior for selected subset of functions > and another for the rest? Ugh. The selected set of the string built-ins are special -- they are known not to recover from null pointers so I think treating them differently would be reasonable (and useful) irrespective of the -Wnonnull warning. We don't know what any arbitrary user- defined nonnull function might do when it gets a null pointer so skipping those may not make as much sense. >>> BTW, in the testcase from the Linux kernel it is also path isolation >>> for error recovery path, something that ought to be predicted unlikely >>> (though, probably not very unlikely unlike this case), so the question is >>> if we want to isolate that or not too. >> >> I don't expect to have the cycles to do significant work on this >> before stage 3 ends. For GCC 7, to avoid the Ubsan warnings, >> the late -Wnonnull warnings could simply be suppressed when >> -fsanitize=undefined is used. It wouldn't be ideal but it would >> be no worse than what GCC does today. In 8 the warning could be >> made smarter to avoid the problem in general. > > Or apply the patch I've posted which doesn't suffer from this problem, > or revert the -Wnonnull changes and resolve somehow in GCC 8. I would prefer your patch if it solves the problem. In fact, as I said before, I'm fine with your patch in any case. I also still have the patch that I never posted that adds the two levels to -Wnonnull (keeping -Wnonnull=1 as the default). And I now have another patch that simply suppresses the late -Wnonnull warning when Ubsan checks for null pointers. I could see about putting together yet another one to implement the approach I suggested above but I hesitate to put too much more time into it before knowing which approach is preferable. Martin
On 2016.12.19 at 09:34 -0700, Martin Sebor wrote: > On 12/19/2016 09:17 AM, Jakub Jelinek wrote: > > Or apply the patch I've posted which doesn't suffer from this problem, > > or revert the -Wnonnull changes and resolve somehow in GCC 8. > > I would prefer your patch if it solves the problem. In fact, > as I said before, I'm fine with your patch in any case. Then please lets go with Jakub's patch. And also please revert r243736 as Richi suggested, because it isn't necessary anymore. -- Markus
On Mon, Dec 19, 2016 at 05:43:38PM +0100, Markus Trippelsdorf wrote: > On 2016.12.19 at 09:34 -0700, Martin Sebor wrote: > > On 12/19/2016 09:17 AM, Jakub Jelinek wrote: > > > Or apply the patch I've posted which doesn't suffer from this problem, > > > or revert the -Wnonnull changes and resolve somehow in GCC 8. > > > > I would prefer your patch if it solves the problem. In fact, > > as I said before, I'm fine with your patch in any case. > > Then please lets go with Jakub's patch. And also please revert r243736 > as Richi suggested, because it isn't necessary anymore. Well, somebody has to ack that, I can't review my own patches in this area. Jakub
On Mon, Dec 19, 2016 at 09:34:44AM -0700, Martin Sebor wrote: > > That would be just weird, have one behavior for selected subset of functions > > and another for the rest? Ugh. > > The selected set of the string built-ins are special -- they are > known not to recover from null pointers so I think treating them > differently would be reasonable (and useful) irrespective of > the -Wnonnull warning. We don't know what any arbitrary user- > defined nonnull function might do when it gets a null pointer so > skipping those may not make as much sense. The problem is that then -fsanitize=undefined changes behavior of the program, which wasn't part of the design. It should either terminate the program after reporting (and before it happens) the first fatal UB, or just report UB before they happen and continue working as without the instrumentation. If the program segfaults without instrumentation, so be it even with instrumentation. Jakub
On 12/19/2016 09:51 AM, Jakub Jelinek wrote: > On Mon, Dec 19, 2016 at 09:34:44AM -0700, Martin Sebor wrote: >>> That would be just weird, have one behavior for selected subset of functions >>> and another for the rest? Ugh. >> >> The selected set of the string built-ins are special -- they are >> known not to recover from null pointers so I think treating them >> differently would be reasonable (and useful) irrespective of >> the -Wnonnull warning. We don't know what any arbitrary user- >> defined nonnull function might do when it gets a null pointer so >> skipping those may not make as much sense. > > The problem is that then -fsanitize=undefined changes behavior of the > program, which wasn't part of the design. It should either terminate the > program after reporting (and before it happens) the first fatal UB, or > just report UB before they happen and continue working as without the > instrumentation. If the program segfaults without instrumentation, so be it > even with instrumentation. Right. I think as a fundamental design decision UB sanitization shouldn't change the behavior of the code. Report and terminate at first UB just remote UBs. Deviations from that design should be looked at as bugs. jeff
On 12/17/2016 02:55 PM, Martin Sebor wrote: > On 12/17/2016 01:01 PM, Markus Trippelsdorf wrote: > > I agree that these warnings should probably not be issued, though > it's interesting to see where they come from. The calls are in > the code emitted by GCC, are reachable, and end up taking place > with the right Ubsan runtime recovery options. It turns out that > Ubsan transforms calls to nonnull functions into conditional > branches testing the argument for null, like so: > > if (s == 0) > __builtin___ubsan_handle_nonnull_arg(); > n = strlen (s); > > and GCC then transforms those into > > if (s == 0) > { > __builtin___ubsan_handle_nonnull_arg(); > n = strlen (NULL); > } > > When the ubsan_handle_nonnull_arg function returns to the caller > the call to strlen(NULL) is made. So I'd like to see more complete dumps here. > > This doesn't happen when -fno-sanitize-recover=undefined is used > when compiling the file because then Ubsan inserts calls to > __builtin___ubsan_handle_nonnull_arg_abort instead which is declared > noreturn. Right. That's what I would expect. If we're going to halt the process at first UB, then we want to abort. Obviously in that case we're calling a noreturn function and the strlen never executes. Otherwise the strlen still needs to be called and whateve action strlen has when passed a NULL must be preserved. So the only question in my mind is what was the larger context so that we can look at why we isolated the paths (which brings the strlen into the conditional rather than leaving it at the merge point). jeff
On 12/19/2016 02:42 AM, Jakub Jelinek wrote: > > The ubsan pass runs before IPA, so not sure how do you want to do that > (and it is needed to run it early). > > One question is if we should perform path isolation in this case at all, > since the branches to __builtin___ubsan_handle_nonnull_arg are with > PROB_VERY_UNLIKELY, so the question is if we should be growing the > paths which is really cold. It has some small benefit (removing one > cheap comparison from the hot path), but the grows cold block. Growing a cold path isn't terribly problematical, particularly if it allows optimization on the hot path. In an ideal world these uber-cold paths belong on their own pages and are never even paged in. But I'd really like to see the full dump in this case. I've got a fragment above and it looks like dumb duplication. ie, why did we isolate the path. > > BTW, in the testcase from the Linux kernel it is also path isolation > for error recovery path, something that ought to be predicted unlikely > (though, probably not very unlikely unlike this case), so the question is > if we want to isolate that or not too. Agreed. jeff
On 12/16/2016 01:17 PM, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 01:01:13PM -0700, Jeff Law wrote: >>> Thanks. Reduced to something like: >>> int >>> foo (const char *name) >>> { >>> if (name) >>> return 6; >>> return __builtin_strlen (name); >>> } >>> This is warned about both with Martin's late warning and my after ccp2 >>> warning version. We should include it in gcc testsuite. >> I'll note this is an example of a case where Andrew's work would likely help >> because it allows us to ask for a range of name_XX at the return statement >> and it'll give us back a range that is constrained by the path(s) which >> reach the return statement. >> >> Contrast to the current VRP work where each SSA_NAME has a range, but that >> range must be valid for every context in which that SSA_NAME appears. > > Well, inside the current VRP it has separate ranges for the different paths > and actually replaces the name in the strlen argument with NULL during evrp, > so doesn't suffer from the current VRP limitations. It'll do that sometimes, but it's not consistent and its a conscious design decision (which I may not necessarily agree with, but I'm not going to open that can-o-worms). > > Anyway, let's consider the warning from the linux kernel with the closedir, > I guess it can be simplified to something along the lines of: > void baz (char *) __attribute__((nonnull)); > char *bar (int); > > int > foo (void) > { > char *p = bar (1); > int ret = 0; > if (p == 0) > { > bar (2); > bar (3); > bar (4); > ret = 1; > goto out; > } > bar (5); > bar (6); > bar (7); > bar (8); > out: > baz (p); > if (ret) > bar (10); > return ret; > } > > Here we jump thread it and with Martin's warning position warn, with > my patch don't warn. But if the if (ret) bar (10); is removed, the > code has the same problem that on the error path p will be NULL, but it is > not going to be diagnosed. For -Wmaybe-nonnull we could e.g. look at if > the argument is a PHI that has NULL at any of the edges; but that doesn't > cover the above case, because p has just one def and so there will be no > PHIs. Yea, and so what if the warning changes if that statement is removed. That kind of change is inherent in any late running warning. But late warnings, in general, generate fewer false positives than early warnings. I don't see this as a reason to summarily reject Martin's work. This whole discussion highlights the primary reason why I stopped working on uninitialized warnings many years ago and focused strictly on the optimization side of the question. Everyone has a different view on what is acceptable and what is not for a false positive. Everyone has a different view on whether or not it is acceptable that a warning disappear when seemingly innocent changes are made to the source. Should we warn early and generate more false positives, or late, reducing false positives, but missing warnings in unreachable code. There is no single right answer and the debates are endless and frustrating as hell. Jeff
On 12/16/2016 10:27 AM, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 10:10:00AM -0700, Martin Sebor wrote: >>> No. The first call to sm_read_sector just doesn't exit. So it is warning >>> about dead code. >> >> If the code is dead then GCC should eliminate it. With it eliminated > > There is (especially with jump threading, but not limited to that, other > optimizations may result in that too), code that even very smart optimizing > compiler isn't able to prove that is dead. > >> the warning would be gone. The warning isn't smart and it doesn't >> try to be. It only works with what GCC gives it. In this case the >> dump shows that GCC thinks the code is reachable. If it isn't that >> would seem to highlight a missed optimization opportunity, not a need >> to make the warning smarter than the optimizer. > > No, it highlights that the warning is done in a wrong place where it suffers > from too many false positives. I don't inherently see this as generating "too many false positives". And as Martin says, the warning works with precisely what it is presented. I think the particular stumbling point is path isolation at some point as resulted in a NULL explicitly in calls at various places. That is a *GOOD* thing to detect and warn against as it represents cases that are often well hidden and often difficult for a human to analyze (based on my work with NULL pointer dereference warnings). > >>> None look like real bugs to me. >> >> They do to me. There are calls in gengtype.c to a function decorated >> with attribute nonnull (lbasename) that pass to it a pointer that's >> potentially null. For example below. get_input_file_name returns > > Most pointers passed to functions with nonnull attributes are, from the > compiler POV, potentially NULL. Usually the compiler just can't prove it > can't be non-NULL, it is an exception if it can. True. But what's happening here IIUC is that there is an explict NULL for those arguments in the IL, most likely due to path isolation. I would agree that a random pointer which we know nothing about could be NULL, but we shouldn't warn for it. That would generate too many false positives. What those guidelines do mean is that various transformations and optimizations may make warnings appear or disappear seemingly randomly. That's unfortunate, but inherent in this class of problems until we have a real static analyzer. jeff
On 12/19/2016 10:31 AM, Jeff Law wrote: > On 12/17/2016 02:55 PM, Martin Sebor wrote: >> On 12/17/2016 01:01 PM, Markus Trippelsdorf wrote: >> >> I agree that these warnings should probably not be issued, though >> it's interesting to see where they come from. The calls are in >> the code emitted by GCC, are reachable, and end up taking place >> with the right Ubsan runtime recovery options. It turns out that >> Ubsan transforms calls to nonnull functions into conditional >> branches testing the argument for null, like so: >> >> if (s == 0) >> __builtin___ubsan_handle_nonnull_arg(); >> n = strlen (s); >> >> and GCC then transforms those into >> >> if (s == 0) >> { >> __builtin___ubsan_handle_nonnull_arg(); >> n = strlen (NULL); >> } >> >> When the ubsan_handle_nonnull_arg function returns to the caller >> the call to strlen(NULL) is made. > So I'd like to see more complete dumps here. The -Wnonnull warning can be reproduced with this C test case and -fsantize=undefined: char* f (const char *s) { unsigned n = __builtin_strlen (s) + 1; char *d = __builtin_malloc (n); if (!d) return 0; __builtin_memcpy (d, s, n); return d; } The sanitizer emits the following code (I snipped the rest after the call to malloc): <bb 2> [0.00%]: if (s_8(D) == 0B) goto <bb 7>; [0.04%] else goto <bb 6>; [99.96%] <bb 7> [0.00%]: __builtin___ubsan_handle_nonnull_arg (&*.Lubsan_data0); <bb 6> [0.00%]: _1 = __builtin_strlen (s_8(D)); _2 = (unsigned int) _1; n_9 = _2 + 1; _3 = (long unsigned int) n_9; d_11 = __builtin_malloc (_3); ... This is then transformed by the third thread jumping pass into: <bb 2> [100.00%]: if (s_7(D) == 0B) goto <bb 3>; [0.04%] else goto <bb 8>; [99.96%] <bb 3> [0.04%]: __builtin___ubsan_handle_nonnull_arg (&*.Lubsan_data0); _24 = __builtin_strlen (0B); _25 = (unsigned int) _24; n_26 = _25 + 1; _27 = (long unsigned int) n_26; d_29 = __builtin_malloc (_27); if (d_29 == 0B) goto <bb 4>; [4.07%] else goto <bb 5>; [95.93%] <bb 4> [4.07%]: goto <bb 7>; [100.00%] <bb 5> [0.04%]: __builtin___ubsan_handle_nonnull_arg (&*.Lubsan_data2); <bb 6> [95.93%]: # _30 = PHI <_19(8), _27(5)> # d_31 = PHI <d_22(8), d_29(5)> __builtin_memcpy (d_31, s_7(D), _30); <bb 7> [100.00%]: # _4 = PHI <0B(4), d_31(6)> return _4; <bb 8> [99.96%]: _16 = __builtin_strlen (s_7(D)); _21 = (unsigned int) _16; n_20 = _21 + 1; _19 = (long unsigned int) n_20; d_22 = __builtin_malloc (_19); if (d_22 == 0B) goto <bb 4>; [4.07%] else goto <bb 6>; [95.93%] (If you'd like to see more context please let me know.) Martin
On Mon, Dec 19, 2016 at 10:52:13AM -0700, Jeff Law wrote: > > > > None look like real bugs to me. > > > > > > They do to me. There are calls in gengtype.c to a function decorated > > > with attribute nonnull (lbasename) that pass to it a pointer that's > > > potentially null. For example below. get_input_file_name returns > > > > Most pointers passed to functions with nonnull attributes are, from the > > compiler POV, potentially NULL. Usually the compiler just can't prove it > > can't be non-NULL, it is an exception if it can. > True. But what's happening here IIUC is that there is an explict NULL for > those arguments in the IL, most likely due to path isolation. > > I would agree that a random pointer which we know nothing about could be > NULL, but we shouldn't warn for it. That would generate too many false > positives. > > What those guidelines do mean is that various transformations and > optimizations may make warnings appear or disappear seemingly randomly. > That's unfortunate, but inherent in this class of problems until we have a > real static analyzer. From the testcases posted, there is a clear difference between "pointer is compared to NULL in the current function and soon after that is passed to a function which expects non-NULL", everything in one function, from a NULL pointer checked in inline function called from 3 other inline functions. If you test for a NULL pointer in the same function, the likelyhood that you actually do it because NULL could be passed in is much higher, over when somebody else wrote some other function that just happens to test for NULL somewhere. For path isolation it is the same thing, but IMHO not so for warnings. So, if we want to catch the first case, we shouldn't rely on path isolation to sometimes trigger if it is beneficial, but rather than have infrastructure which allows us to answer whether there is a high probability user expects a pointer might be NULL (or value might be 0 or whatever other constant), and use that in the warning for some -Wmaybe-* variant of selected warnings. We'd then warn regardless if path isolation is beneficial or not, but wouldn't warn if it is unlikely useful to warn (or there could be different levels between what we want to catch and what amount of false positives we allow). Jakub
On Mon, Dec 19, 2016 at 10:52:13AM -0700, Jeff Law wrote: > > No, it highlights that the warning is done in a wrong place where it suffers > > from too many false positives. > I don't inherently see this as generating "too many false positives". And as > Martin says, the warning works with precisely what it is presented. > > I think the particular stumbling point is path isolation at some point as > resulted in a NULL explicitly in calls at various places. That is a *GOOD* > thing to detect and warn against as it represents cases that are often well > hidden and often difficult for a human to analyze (based on my work with > NULL pointer dereference warnings). Please see e.g. PR78859 for just two recently reported issues (there are more from gathering what has been said on IRC etc., David said powerpc* bootstrap is still broken, ...). One has the non-NULL tests just as a weirdo programming style, not actually a sign that NULL will ever show there. So this one could be fixed in theory rather than adding hacks to assert it is non-NULL just remove all those NULL tests. The other is just too cryptic, there is not even locus printed for the strlen, so nobody can guess where it is coming from, guessing why will be hard even if location is provided. Jakub
On 12/19/2016 11:00 AM, Martin Sebor wrote: > On 12/19/2016 10:31 AM, Jeff Law wrote: >> On 12/17/2016 02:55 PM, Martin Sebor wrote: >>> On 12/17/2016 01:01 PM, Markus Trippelsdorf wrote: >>> >>> I agree that these warnings should probably not be issued, though >>> it's interesting to see where they come from. The calls are in >>> the code emitted by GCC, are reachable, and end up taking place >>> with the right Ubsan runtime recovery options. It turns out that >>> Ubsan transforms calls to nonnull functions into conditional >>> branches testing the argument for null, like so: >>> >>> if (s == 0) >>> __builtin___ubsan_handle_nonnull_arg(); >>> n = strlen (s); >>> >>> and GCC then transforms those into >>> >>> if (s == 0) >>> { >>> __builtin___ubsan_handle_nonnull_arg(); >>> n = strlen (NULL); >>> } >>> >>> When the ubsan_handle_nonnull_arg function returns to the caller >>> the call to strlen(NULL) is made. >> So I'd like to see more complete dumps here. > > The -Wnonnull warning can be reproduced with this C test case and > -fsantize=undefined: > > char* f (const char *s) > { > unsigned n = __builtin_strlen (s) + 1; > char *d = __builtin_malloc (n); > > if (!d) > return 0; > > __builtin_memcpy (d, s, n); > return d; > } > > The sanitizer emits the following code (I snipped the rest after > the call to malloc): [ snip.] THanks. So this is a clear case of jump threading doing exactly what it is supposed to do. We've duplicated code on the cold path to enable removal of a test on the hot path. That is *exactly* what we want. The fact that doing so creates a false positive is inherent in the transformation. The question is does this happen enough to warrant moving the warning early in the pipeline -- knowing that doing so will likely reduce this kind of false positive, but may also introduce missed warnings. --- details follow -- So the key thing to note is that sanitization wants to add the test s != NULL before the strlen call and the memcpy call. Looking at the .thread2 dump: ;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) ;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE) if (s_7(D) == 0B) goto <bb 3>; [0.0%] else goto <bb 4>; [100.0%] ;; succ: 4 [100.0%] (FALSE_VALUE,EXECUTABLE) ;; 3 [0.0%] (TRUE_VALUE,EXECUTABLE) ;; basic block 3, loop depth 0, count 0, freq 4 ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) ;; pred: 2 [0.0%] (TRUE_VALUE,EXECUTABLE) __builtin___ubsan_handle_nonnull_arg (&*.Lubsan_data0); ;; succ: 4 [100.0%] (FALLTHRU,EXECUTABLE) ;; basic block 4, loop depth 0, count 0, freq 10000, maybe hot ;; prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED) ;; pred: 2 [100.0%] (FALSE_VALUE,EXECUTABLE) ;; 3 [100.0%] (FALLTHRU,EXECUTABLE) _1 = __builtin_strlen (s_7(D)); _2 = (unsigned int) _1; n_8 = _2 + 1; _3 = (long unsigned int) n_8; d_10 = __builtin_malloc (_3); if (d_10 == 0B) goto <bb 8>; [4.1%] else goto <bb 5>; [95.9%] ;; succ: 8 [4.1%] (TRUE_VALUE,EXECUTABLE) ;; 5 [95.9%] (FALSE_VALUE,EXECUTABLE) ;; basic block 5, loop depth 0, count 0, freq 9593, maybe hot ;; prev block 4, next block 6, flags: (NEW, REACHABLE, VISITED) ;; pred: 4 [95.9%] (FALSE_VALUE,EXECUTABLE) if (s_7(D) == 0B) goto <bb 6>; [0.0%] else goto <bb 7>; [100.0%] ;; succ: 7 [100.0%] (FALSE_VALUE,EXECUTABLE) ;; 6 [0.0%] (TRUE_VALUE,EXECUTABLE) ;; basic block 6, loop depth 0, count 0, freq 4 ;; prev block 5, next block 7, flags: (NEW, REACHABLE, VISITED) ;; pred: 5 [0.0%] (TRUE_VALUE,EXECUTABLE) __builtin___ubsan_handle_nonnull_arg (&*.Lubsan_data2); ;; succ: 7 [100.0%] (FALLTHRU,EXECUTABLE) ;; basic block 7, loop depth 0, count 0, freq 9593, maybe hot ;; prev block 6, next block 8, flags: (NEW, REACHABLE, VISITED) ;; pred: 5 [100.0%] (FALSE_VALUE,EXECUTABLE) ;; 6 [100.0%] (FALLTHRU,EXECUTABLE) __builtin_memcpy (d_10, s_7(D), _3); ;; succ: 8 [100.0%] (FALLTHRU,EXECUTABLE) ;; basic block 8, loop depth 0, count 0, freq 10000, maybe hot ;; prev block 7, next block 1, flags: (NEW, REACHABLE, VISITED) ;; pred: 4 [4.1%] (TRUE_VALUE,EXECUTABLE) ;; 7 [100.0%] (FALLTHRU,EXECUTABLE) # _4 = PHI <0B(4), d_10(7)> return _4; ;; succ: EXIT [100.0%] } We can see the redundant test showing up in bb2 and bb5. Note the tests are on the hot path. Also note that 2->4->5 will result in a control transfer to 7 and 3->4->5 will result in a control transfer to 6. As seen in the .dom2 dump: Registering jump thread: (3, 4) incoming edge; (4, 5) joiner; (5, 6) nocopy; Registering jump thread: (2, 4) incoming edge; (4, 5) joiner; (5, 7) nocopy; The "joiner" note means that we don't know where BB4 will go. But if it goes to 5, then we will unconditionally transfer to 6 or 7 depending on how we got to BB4. We can't modify BB4 with the CFG in that state (again, remember, we don't know where it will go). To optimize this case we have to copy BB4 (call it BB4'). We redirect the edge 3->4 to 3->4'. That has the side effect of making BB4 only reachable from BB2. We've now isolated the two paths. BB4 happens to be the cold path in this instance, BB4' is the hot path. Showing just the first few blocks after some cleanups: ;; basic block 2, loop depth 0, count 0, freq 10000, maybe hot ;; prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED) ;; pred: ENTRY [100.0%] (FALLTHRU,EXECUTABLE) if (s_7(D) == 0B) goto <bb 3>; [0.0%] else goto <bb 4>; [100.0%] ;; succ: 4 [100.0%] (FALSE_VALUE,EXECUTABLE) ;; 3 [0.0%] (TRUE_VALUE,EXECUTABLE) ;; basic block 3, loop depth 0, count 0, freq 4 ;; prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED) ;; pred: 2 [0.0%] (TRUE_VALUE,EXECUTABLE) __builtin___ubsan_handle_nonnull_arg (&*.Lubsan_data0); ;; succ: 4 [100.0%] (FALLTHRU,EXECUTABLE) ;; basic block 4, loop depth 0, count 0, freq 10000, maybe hot ;; prev block 3, next block 5, flags: (NEW, REACHABLE, VISITED) ;; pred: 2 [100.0%] (FALSE_VALUE,EXECUTABLE) ;; 3 [100.0%] (FALLTHRU,EXECUTABLE) _1 = __builtin_strlen (s_7(D)); _2 = (unsigned int) _1; n_8 = _2 + 1; _3 = (long unsigned int) n_8; d_10 = __builtin_malloc (_3); if (d_10 == 0B) goto <bb 8>; [4.1%] else goto <bb 5>; [95.9%] ;; succ: 8 [4.1%] (TRUE_VALUE,EXECUTABLE) ;; 5 [95.9%] (FALSE_VALUE,EXECUTABLE) We'll end up merging BB3 and BB4 (cold path). And you'll easily see that the only value of s_7 going into that path is 0 which gets propagated to the __builtin_strlen call and we eventually trigger the warning. If you look at the full dump you'll see that we removed the second test of s_7 (which was on the hot path). Jeff
On 12/19/2016 11:30 AM, Jakub Jelinek wrote: > On Mon, Dec 19, 2016 at 10:52:13AM -0700, Jeff Law wrote: >>> No, it highlights that the warning is done in a wrong place where it suffers >>> from too many false positives. >> I don't inherently see this as generating "too many false positives". And as >> Martin says, the warning works with precisely what it is presented. >> >> I think the particular stumbling point is path isolation at some point as >> resulted in a NULL explicitly in calls at various places. That is a *GOOD* >> thing to detect and warn against as it represents cases that are often well >> hidden and often difficult for a human to analyze (based on my work with >> NULL pointer dereference warnings). > > Please see e.g. PR78859 for just two recently reported issues (there are more > from gathering what has been said on IRC etc., David said powerpc* bootstrap > is still broken, ...). > One has the non-NULL tests just as a weirdo programming style, not actually > a sign that NULL will ever show there. So this one could be fixed in theory rather > than adding hacks to assert it is non-NULL just remove all those NULL tests. > The other is just too cryptic, there is not even locus printed for the > strlen, so nobody can guess where it is coming from, guessing why will be > hard even if location is provided. But I don't see that as inherently blocking this patch. It's pointing out a bad API interface. It's no different than when I added teh NULL pointer dereference warnings a while ago -- we had the exact same kinds of problems. The question is how many of them are there. We *know* this kind of thing is going to happen. Again, at this point I don't see 78859 as inherently meaning Martin's patch should be reverted. Jeff
On 12/16/2016 09:46 AM, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 09:36:25AM -0700, Martin Sebor wrote: >>> It does for me with an allmodconf. At -O2 I get three warnings, and at >>> -O3 I get two additional warnings. Now these additional ones happen way >>> too deep into the pipeline to be reliable. (For a reduced testcase see: >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78817#c8) >> >> The warning looks quite justified in this case. The first call >> to sm_read_sector exits with the pointer being null and so the >> second call is diagnosed. That seems like a great example of >> when the warning is useful. > > No. The first call to sm_read_sector just doesn't exit. So it is warning > about dead code. Correct. It's a false positive. Effectively the loop is never going to terminate. > >> I don't claim it can't be improved but it seems pretty good as >> it is already. Among the 6 instances it's found in GCC three >> look like real bugs. > > None look like real bugs to me. But is the warning rate so high that we need to revert/reject the warning as implemented. That's my question. 6 across GCC doesn't sound bad across a multi-million line codebase. jeff
On Mon, Dec 19, 2016 at 11:46:24AM -0700, Jeff Law wrote: > But I don't see that as inherently blocking this patch. It's pointing out a > bad API interface. It's no different than when I added teh NULL pointer > dereference warnings a while ago -- we had the exact same kinds of problems. > > The question is how many of them are there. We *know* this kind of thing is > going to happen. Again, at this point I don't see 78859 as inherently > meaning Martin's patch should be reverted. profiledbootstrap is meant to be supported without --disable-werror, has been working that way for at least last 10 years. And so is normal bootstrap on powerpc* or hppa*. So broken bootstrap is a strong reason for having to do something. Richard expressed his dissatisfaction with the vec.h change: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78817#c17 By moving the warning earlier, we'll still warn for the most cases, but won't warn in the more convoluted cases. We can perhaps work on it further in GCC 8. If we keep it as is, I think most users will just -Wno-nonnull as soon as they run into some warning that will be hard to figure out what is going on. At that point they will not get warnings even for the obvious cases that we used to warn. Look at how the Linux kernel folks disable most of warnings even for smaller reasons. Jakub
On Mon, Dec 19, 2016 at 11:54:06AM -0700, Jeff Law wrote: > > > I don't claim it can't be improved but it seems pretty good as > > > it is already. Among the 6 instances it's found in GCC three > > > look like real bugs. > > > > None look like real bugs to me. > But is the warning rate so high that we need to revert/reject the warning as > implemented. That's my question. 6 across GCC doesn't sound bad across a > multi-million line codebase. It isn't 6 across GCC, it is 6 across a single target and single set of compiler options. Other targets and other options have different sets, there is some overlap, but only partial. Jakub
On Mon, Dec 19, 2016 at 07:58:54PM +0100, Jakub Jelinek wrote: > On Mon, Dec 19, 2016 at 11:54:06AM -0700, Jeff Law wrote: > > > > I don't claim it can't be improved but it seems pretty good as > > > > it is already. Among the 6 instances it's found in GCC three > > > > look like real bugs. > > > > > > None look like real bugs to me. > > But is the warning rate so high that we need to revert/reject the warning as > > implemented. That's my question. 6 across GCC doesn't sound bad across a > > multi-million line codebase. > > It isn't 6 across GCC, it is 6 across a single target and single set of > compiler options. Other targets and other options have different sets, > there is some overlap, but only partial. Unrelated to where the warning is issued, it might be a good idea to use %K to emit it with inlining stack, otherwise figuring out why it warns will be harder than needed. Jakub
On 12/19/2016 11:56 AM, Jakub Jelinek wrote: > On Mon, Dec 19, 2016 at 11:46:24AM -0700, Jeff Law wrote: >> But I don't see that as inherently blocking this patch. It's pointing out a >> bad API interface. It's no different than when I added teh NULL pointer >> dereference warnings a while ago -- we had the exact same kinds of problems. >> >> The question is how many of them are there. We *know* this kind of thing is >> going to happen. Again, at this point I don't see 78859 as inherently >> meaning Martin's patch should be reverted. > > profiledbootstrap is meant to be supported without --disable-werror, has > been working that way for at least last 10 years. But we also have to twiddle things to deal with profiledbootstrap selecting different jump threads to realize and as a result we get different Wuninitialized warnings. This happens all the time And so is normal > bootstrap on powerpc* or hppa*. Agreed, but we need to investigate those at a level deep enough to understand the real problem. So broken bootstrap is a > strong reason for having to do something. Certainly we have to do something. But that doesn't mean flat out reversion and rejection of the patch. It means careful analysis of the costs & benefits. Jumping to reversion & rejection because it triggers some warnings in cases we don't like is too hasty. > By moving the warning earlier, we'll still warn for the most cases, but > won't warn in the more convoluted cases. We can perhaps work on it further > in GCC 8. If we keep it as is, I think most users will just -Wno-nonnull > as soon as they run into some warning that will be hard to figure out what > is going on. At that point they will not get warnings even for the obvious > cases that we used to warn. Look at how the Linux kernel folks disable most > of warnings even for smaller reasons. But again, the user case is no different than other warnings that are sensitive to optimizations. My sense is that we should revert and table until gcc-8 where we can evaluate the space more thoroughly. Jeff
On 12/19/2016 12:12 PM, Jakub Jelinek wrote: > On Mon, Dec 19, 2016 at 07:58:54PM +0100, Jakub Jelinek wrote: >> On Mon, Dec 19, 2016 at 11:54:06AM -0700, Jeff Law wrote: >>>>> I don't claim it can't be improved but it seems pretty good as >>>>> it is already. Among the 6 instances it's found in GCC three >>>>> look like real bugs. >>>> >>>> None look like real bugs to me. >>> But is the warning rate so high that we need to revert/reject the warning as >>> implemented. That's my question. 6 across GCC doesn't sound bad across a >>> multi-million line codebase. >> >> It isn't 6 across GCC, it is 6 across a single target and single set of >> compiler options. Other targets and other options have different sets, >> there is some overlap, but only partial. > > Unrelated to where the warning is issued, it might be a good idea to use > %K to emit it with inlining stack, otherwise figuring out why it warns > will be harder than needed. I would think that would apply to any warning triggered once we've started optimizing the code. Don't you? jeff
On Mon, Dec 19, 2016 at 12:50:00PM -0700, Jeff Law wrote: > > Unrelated to where the warning is issued, it might be a good idea to use > > %K to emit it with inlining stack, otherwise figuring out why it warns > > will be harder than needed. > I would think that would apply to any warning triggered once we've started > optimizing the code. Don't you? Probably just any warning post-einline, yes. Note %K takes a tree, which we have in the expansion (the CALL_EXPR), but for gimple either we need to build some random tree with the location_t of gimple_location (stmt), or add another modifier that takes a location_t and otherwise does the similar thing as %K. Jakub
On 12/16/2016 10:10 AM, Martin Sebor wrote: > On 12/16/2016 09:46 AM, Jakub Jelinek wrote: >> On Fri, Dec 16, 2016 at 09:36:25AM -0700, Martin Sebor wrote: >>>> It does for me with an allmodconf. At -O2 I get three warnings, and at >>>> -O3 I get two additional warnings. Now these additional ones happen way >>>> too deep into the pipeline to be reliable. (For a reduced testcase see: >>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78817#c8) >>> >>> The warning looks quite justified in this case. The first call >>> to sm_read_sector exits with the pointer being null and so the >>> second call is diagnosed. That seems like a great example of >>> when the warning is useful. >> >> No. The first call to sm_read_sector just doesn't exit. So it is >> warning >> about dead code. > > If the code is dead then GCC should eliminate it. With it eliminated > the warning would be gone. The warning isn't smart and it doesn't > try to be. It only works with what GCC gives it. In this case the > dump shows that GCC thinks the code is reachable. If it isn't that > would seem to highlight a missed optimization opportunity, not a need > to make the warning smarter than the optimizer. It's almost always the case that a false positive warning of this nature represents a missed optimization somewhere. However, detecting the missed optimization can be extremely difficult or borderline impossible. They often arise out of unfortunate API practices (our vec implementation and NULL pointer dereferences show many great examples). And in other cases we may detect the potential for optimizing away the path, but to do so simply isn't profitable. Many of these opportunities are path specific. To be able to take advantage of the path specific properties you have to isolate the path (ie, create a duplicate that you can munge). The cost of duplication often exceeds the value in removing the redundancy/dead code. I often wonder if we should look at some of the schemes out there that allow marking of infeasible paths through the CFG. Then we could ignore those paths through the CFG (you essentially build def-use pairs that are ignored). It sounds good in theory, but I don't know if it's ever been tried in practice. > > They do to me. There are calls in gengtype.c to a function decorated > with attribute nonnull (lbasename) that pass to it a pointer that's > potentially null. For example below. get_input_file_name returns > null when inpf is null. There are other calls to get_input_file_name > in the file that check its return value (e.g., get_file_basename) > and so those that don't suggest bugs. If there is no way for the > get_input_file_name function to return null then that suggests > the function should be declared returns_nonnull (and the return > NULL statement from it removed), and all those callers that check > for null simplified. In any case, at a minimum the warning points > out an inconsistency that suggests a possible improvement. That, > in my view, is one of the things that make warnings valuable even > if they don't identify the smoking gun. I think this touches on another (gcc-8) issue. Namely using the IPA propagation engine to propagate more of these attributes so that we're not always mucking around trying to annotate things internally, but instead let the compiler do much of the heavy lifting. jeff
>> By moving the warning earlier, we'll still warn for the most cases, but >> won't warn in the more convoluted cases. We can perhaps work on it >> further >> in GCC 8. If we keep it as is, I think most users will just -Wno-nonnull >> as soon as they run into some warning that will be hard to figure out >> what >> is going on. At that point they will not get warnings even for the >> obvious >> cases that we used to warn. Look at how the Linux kernel folks >> disable most >> of warnings even for smaller reasons. > But again, the user case is no different than other warnings that are > sensitive to optimizations. > > My sense is that we should revert and table until gcc-8 where we can > evaluate the space more thoroughly. I think that would be unfortunate. We have a number of alternatives that seem much preferable. I have a trivial patch that avoids the sanitizer warnings by disabling the late -Wnonnull warning when -fsanitize=undefined is specified. A simple fix for the GCC warnings discovered by profiledbootstrap-O3 and verified on powerpc64 and x86_63 was posted for review last week. Jakub's patch avoids those warnings and obviates any GCC changes (IIUC). There is also the option of introducing -Wnonnull=2 that warns late and not including it in -Wall or -Wextra. All three of this have been tested. All of these seem safe to me and give interested users the ability to experiment with the warning and give us feedback. With alternative (1) users have the option of turning -Wnonnull off. Since the early warning detects just a trivial subset of these problems (and is still prone to false positives) they would be giving up little by doing that. Martin
On 12/19/2016 01:09 PM, Martin Sebor wrote: >>> By moving the warning earlier, we'll still warn for the most cases, but >>> won't warn in the more convoluted cases. We can perhaps work on it >>> further >>> in GCC 8. If we keep it as is, I think most users will just >>> -Wno-nonnull >>> as soon as they run into some warning that will be hard to figure out >>> what >>> is going on. At that point they will not get warnings even for the >>> obvious >>> cases that we used to warn. Look at how the Linux kernel folks >>> disable most >>> of warnings even for smaller reasons. >> But again, the user case is no different than other warnings that are >> sensitive to optimizations. >> >> My sense is that we should revert and table until gcc-8 where we can >> evaluate the space more thoroughly. > > I think that would be unfortunate. We have a number of alternatives > that seem much preferable. Definitely unfortunate, but I don't think we have agreement on the key issue, namely where the warning belongs. > > I have a trivial patch that avoids the sanitizer warnings by disabling > the late -Wnonnull warning when -fsanitize=undefined is specified. > A simple fix for the GCC warnings discovered by profiledbootstrap-O3 > and verified on powerpc64 and x86_63 was posted for review last week. The former seems like a hack. I can't recall which patch the latter was (was it the one that just ripped out the "can't happen code"? > > Jakub's patch avoids those warnings and obviates any GCC changes (IIUC). Yes, but they suffer from missing warnings that are exposed by transformations later in the pipeline. > > There is also the option of introducing -Wnonnull=2 that warns late > and not including it in -Wall or -Wextra. All three of this have > been tested. Understood, but I'm not keep to start adding more levels of warning and code to support them until we have reached some kind of agreement. And it's my sense that we need more time to hammer out that kind of agreement. Jeff
On 12/16/2016 09:41 AM, Jakub Jelinek wrote: > On Fri, Dec 16, 2016 at 11:08:00AM +0100, Jakub Jelinek wrote: >> Here is an untested proof of concept for: >> 1) keeping the warning in the FEs no matter what optimization level is on, >> just making sure TREE_NO_WARNING is set on the CALL_EXPR if we've warned >> 2) moving the rest of the warning shortly post IPA, when we have performed >> inlining already and some constant propagation afterwards, but where >> hopefully the IL still isn't too much different from the original source >> 3) as the nonnull attribute is a type property, it warns about the function >> type of the call, doesn't require a fndecl >> The tree-ssa-ccp.c location is just randomly chosen, the pass could go >> into its own file, or some other file. And I think e.g. the -Walloc-zero >> warning should move there as well. >> >> If you think warning later can be still useful to some users at the expense >> of higher false positive rate, we could have -Wmaybe-nonnull warning that >> would guard that and set the gimple no warning flag when we warn in the >> pass. >> >> If needed, there is always the option on the table to turn >> TREE_NO_WARNING/gimple_no_warning_p into a bit that says on the side hash >> table contains bitmap of disabled warnings for the expression or statement. >> IMHO we want to do that in any case, just not sure if it is urgent to do for >> GCC 7. > > Now successfully bootstrapped/regtested on x86_64-linux and i686-linux, so I > wrote ChangeLog for it as well: > > 2016-12-16 Jakub Jelinek <jakub@redhat.com> > > PR bootstrap/78817 > * tree-pass.h (make_pass_post_ipa_warn): Declare. > * builtins.c (validate_arglist): Adjust get_nonnull_args call. > Check for NULL pointer argument to nonnull arg here. > (validate_arg): Revert 2016-12-14 changes. > * calls.h (get_nonnull_args): Remove declaration. > * tree-ssa-ccp.c: Include diagnostic-core.h. > (pass_data_post_ipa_warn): New variable. > (pass_post_ipa_warn): New class. > (pass_post_ipa_warn::execute): New method. > (make_pass_post_ipa_warn): New function. > * tree.h (get_nonnull_args): Declare. > * tree.c (get_nonnull_args): New function. > * calls.c (maybe_warn_null_arg): Removed. > (maybe_warn_null_arg): Removed. > (initialize_argument_information): Revert 2016-12-14 changes. > * passes.def: Add pass_post_ipa_warn after first ccp after IPA. > c-family/ > * c-common.c (struct nonnull_arg_ctx): New type. > (check_function_nonnull): Return bool instead of void. Use > nonnull_arg_ctx as context rather than just location_t. > (check_nonnull_arg): Adjust for the new context type, set > warned_p to true if a warning has been diagnosed. > (check_function_arguments): Return bool instead of void. > * c-common.h (check_function_arguments): Adjust prototype. > c/ > * c-typeck.c (build_function_call_vec): If check_function_arguments > returns true, set TREE_NO_WARNING on CALL_EXPR. > cp/ > * typeck.c (cp_build_function_call_vec): If check_function_arguments > returns true, set TREE_NO_WARNING on CALL_EXPR. > * call.c (build_over_call): Likewise. So I spoke with Martin yesterday and have been convinced that we ought to go forward now rather than waiting for gcc-8. Essentially the argument is that Jakub's patch is a significant improvement over where the warnings were in prior GCC releases, even if they don't go as far as Martin's work. We can (and should) evaluate whether or not to push things further in gcc-8. So with that in mind... It looks like you could avoid a lot of work in pass_post_ipa_warn::execute by checking if warnings were asked for outside the main loop. Presumably you wrote this with the check inside the loop with the expectation that other warnings might move into this routine, right? Also in pass_post_ipa_warn::execute, the BITMAP_FREE call is technically in a correct position, but it might be more maintainable long term if the allocation/deallocation occur at the same nesting level. OK as-is or with the BITMAP_FREE call moved to the same scoping level as get_nonnull_args. Jeff
On Wed, Dec 21, 2016 at 02:47:49PM -0700, Jeff Law wrote: > It looks like you could avoid a lot of work in pass_post_ipa_warn::execute > by checking if warnings were asked for outside the main loop. Presumably > you wrote this with the check inside the loop with the expectation that > other warnings might move into this routine, right? I had it in mind for the -Walloc-zero warning, yes. And the gate checks whether the warnings are requested: + virtual bool gate (function *) { return warn_nonnull != 0; } If we add further warnings to this pass, they would be added to the main loop and gate. > Also in pass_post_ipa_warn::execute, the BITMAP_FREE call is technically in > a correct position, but it might be more maintainable long term if the > allocation/deallocation occur at the same nesting level. The only case where it makes a difference is where the bitmap is NULL and there is nothing to free (and that is the common case). > OK as-is or with the BITMAP_FREE call moved to the same scoping level as > get_nonnull_args. Thanks. Jakub
On 12/21/2016 03:11 PM, Jakub Jelinek wrote: > On Wed, Dec 21, 2016 at 02:47:49PM -0700, Jeff Law wrote: >> It looks like you could avoid a lot of work in pass_post_ipa_warn::execute >> by checking if warnings were asked for outside the main loop. Presumably >> you wrote this with the check inside the loop with the expectation that >> other warnings might move into this routine, right? > > I had it in mind for the -Walloc-zero warning, yes. > And the gate checks whether the warnings are requested: > + virtual bool gate (function *) { return warn_nonnull != 0; } > If we add further warnings to this pass, they would be added to the main > loop and gate. Ah, missed that in the gate. > >> Also in pass_post_ipa_warn::execute, the BITMAP_FREE call is technically in >> a correct position, but it might be more maintainable long term if the >> allocation/deallocation occur at the same nesting level. > > The only case where it makes a difference is where the bitmap is NULL and > there is nothing to free (and that is the common case). Yes, I know. It's more a style issue -- having the allocation/deallocation at the same scoping level is simply easier for humans to process and thus makes it much less likely for someone to muck it up later (particularly if that routine grows). Jeff
On Wed, Dec 21, 2016 at 03:13:29PM -0700, Jeff Law wrote: > > > Also in pass_post_ipa_warn::execute, the BITMAP_FREE call is technically in > > > a correct position, but it might be more maintainable long term if the > > > allocation/deallocation occur at the same nesting level. > > > > The only case where it makes a difference is where the bitmap is NULL and > > there is nothing to free (and that is the common case). > Yes, I know. It's more a style issue -- having the allocation/deallocation > at the same scoping level is simply easier for humans to process and thus > makes it much less likely for someone to muck it up later (particularly if > that routine grows). But in this case the allocation is just in a different routine, and is only conditional, and the deallocation is conditional as well. Jakub
PR bootstrap/78817 - stage2 bootstrap failure in vec.h:1613:5: error: argument 1 null where non-null expected after r243661 gcc/ChangeLog: PR bootstrap/78817 * calls.c (maybe_warn_null_arg): Add a missing '='. * gengtype-parse.c (type): Guard against dereferncing a null pointer. (get_file_realbasename): Avoid passing a null pointer to a function expecting non-null. * tree-vect-data-refs.c (vect_permute_store_chain): Assert pointer is non-null. (vect_permute_load_chain): Same. * vec.h (vec<T, va_heap, vl_ptr>::safe_grow_cleared): Assert a pointer is non-null. diff --git a/gcc/calls.c b/gcc/calls.c index 8466427..463f99c 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -1573,7 +1573,7 @@ maybe_warn_null_arg (tree fndecl, tree exp, tree arg, ++argpos; - location_t exploc EXPR_LOCATION (exp); + location_t exploc = EXPR_LOCATION (exp); if (warning_at (exploc, OPT_Wnonnull, "argument %u null where non-null expected", argpos)) diff --git a/gcc/gengtype-parse.c b/gcc/gengtype-parse.c index 954ac2a..b11da84 100644 --- a/gcc/gengtype-parse.c +++ b/gcc/gengtype-parse.c @@ -946,15 +946,18 @@ type (options_p *optsp, bool nested) inheritance specifications. We require single-inheritance from a non-template type. */ advance (); - const char *basename = require (ID); - /* This may be either an access specifier, or the base name. */ - if (0 == strcmp (basename, "public") - || 0 == strcmp (basename, "protected") - || 0 == strcmp (basename, "private")) - basename = require (ID); - base_class = find_structure (basename, TYPE_STRUCT); - if (!base_class) - parse_error ("unrecognized base class: %s", basename); + if (const char *basename = require (ID)) + { + /* This may be either an access specifier, or the base + name. */ + if (0 == strcmp (basename, "public") + || 0 == strcmp (basename, "protected") + || 0 == strcmp (basename, "private")) + basename = require (ID); + base_class = find_structure (basename, TYPE_STRUCT); + if (!base_class) + parse_error ("unrecognized base class: %s", basename); + } require_without_advance ('{'); } else diff --git a/gcc/gengtype.c b/gcc/gengtype.c index dcc2ff5..c63bd8e 100644 --- a/gcc/gengtype.c +++ b/gcc/gengtype.c @@ -1747,7 +1747,10 @@ open_base_files (void) static const char * get_file_realbasename (const input_file *inpf) { - return lbasename (get_input_file_name (inpf)); + if (const char *fname = get_input_file_name (inpf)) + return lbasename (fname); + + return NULL; } /* For INPF a filename, return the relative path to INPF from @@ -1756,12 +1759,13 @@ get_file_realbasename (const input_file *inpf) const char * get_file_srcdir_relative_path (const input_file *inpf) { - const char *f = get_input_file_name (inpf); - if (strlen (f) > srcdir_len - && IS_DIR_SEPARATOR (f[srcdir_len]) - && strncmp (f, srcdir, srcdir_len) == 0) - return f + srcdir_len + 1; - else + if (const char *f = get_input_file_name (inpf)) + { + if (strlen (f) > srcdir_len + && IS_DIR_SEPARATOR (f[srcdir_len]) + && strncmp (f, srcdir, srcdir_len) == 0) + return f + srcdir_len + 1; + } return NULL; } diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c index 1b9c3b3..967fa44 100644 --- a/gcc/tree-vect-data-refs.c +++ b/gcc/tree-vect-data-refs.c @@ -4954,7 +4954,8 @@ vect_permute_store_chain (vec<tree> dr_chain, } else { - /* If length is not equal to 3 then only power of 2 is supported. */ + /* If length is not equal to 3 then only positive powers of 2 are + supported. */ gcc_assert (pow2p_hwi (length)); for (i = 0, n = nelt / 2; i < n; i++) @@ -4994,6 +4995,11 @@ vect_permute_store_chain (vec<tree> dr_chain, vect_finish_stmt_generation (stmt, perm_stmt, gsi); (*result_chain)[2*j+1] = low; } + + /* The assert below is strictly redundant because RESULT_CHAIN + has a size equal to LENGTH which is asserted to be positive + above. Unfortunately, GCC doesn't see that. */ + gcc_assert (result_chain->address ()); memcpy (dr_chain.address (), result_chain->address (), length * sizeof (tree)); } @@ -5557,6 +5563,11 @@ vect_permute_load_chain (vec<tree> dr_chain, vect_finish_stmt_generation (stmt, perm_stmt, gsi); (*result_chain)[j/2+length/2] = data_ref; } + + /* The assert below is strictly redundant because RESULT_CHAIN + has a size equal to LENGTH which is asserted to be positive + above. Unfortunately, GCC doesn't see that. */ + gcc_assert (result_chain->address ()); memcpy (dr_chain.address (), result_chain->address (), length * sizeof (tree)); } diff --git a/gcc/vec.h b/gcc/vec.h index aa93411..b6b54ef 100644 --- a/gcc/vec.h +++ b/gcc/vec.h @@ -1607,10 +1608,16 @@ inline void vec<T, va_heap, vl_ptr>::safe_grow_cleared (unsigned len MEM_STAT_DECL) { unsigned oldlen = length (); - size_t sz = sizeof (T) * (len - oldlen); - safe_grow (len PASS_MEM_STAT); - if (sz != 0) - memset (&(address ()[oldlen]), 0, sz); + gcc_checking_assert (oldlen <= len); + + if (size_t sz = sizeof (T) * (len - oldlen)) + { + safe_grow (len PASS_MEM_STAT); + + T *p = address (); + gcc_assert (p != NULL); + memset (p + oldlen, 0, sz); + } }