Message ID | 1515548724-31869-3-git-send-email-yamada.masahiro@socionext.com |
---|---|
State | New |
Headers | show |
Series | assert() is almost used in the same way as BUG_ON(), except: | expand |
Bin, Marek, Would you take a look at xHCI code? Lots of xHCI code invokes BUG_ON()/BUG() when the hardware access fails, then halts the system completely. This is not a software bug. 2018-01-10 10:45 GMT+09:00 Masahiro Yamada <yamada.masahiro@socionext.com>: > xhci_wait_for_event() is supposed to return a pointer to union xhci_trb, > but it does not return anything at the end of the function. > > This relies on that the end of the function is unreachable due to BUG(). > > We are planning to make BUG() no-op for platforms with strong image size > constraint. Doing so would cause compiler warning: > > drivers/usb/host/xhci-ring.c:475:1: warning: control reaches end of non-void function [-Wreturn-type] > } > ^ > > So, this function must return something. From the error message just > above, ERR_PTR(-ETIMEDOUT) seems a good choice. > > The use of BUG() looks suspicious here in the first place; no response > from hardware is not a bug. It should be treated as a normal error. > So, this function must return an error pointer instead of BUG(), then > the caller must handle it properly. > > I am not fixing the code because this is not the only place that stops > the system. Just one failure of xHCI halts the system, here and there. > > I left a comment block, hoping somebody will take a look. > > Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> > --- > > Changes in v3: > - newly added > > Changes in v2: None > > drivers/usb/host/xhci-ring.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > index 579e670..d780367 100644 > --- a/drivers/usb/host/xhci-ring.c > +++ b/drivers/usb/host/xhci-ring.c > @@ -471,7 +471,14 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected) > return NULL; > > printf("XHCI timeout on event type %d... cannot recover.\n", expected); > + > + /* > + * CHECK: > + * Is this software bug? Is this a good reason to halt the system? > + */ > BUG(); > + > + return ERR_PTR(-ETIMEDOUT); > } > > /* > -- > 2.7.4 > > _______________________________________________ > U-Boot mailing list > U-Boot@lists.denx.de > https://lists.denx.de/listinfo/u-boot
On 01/28/2018 11:18 AM, Masahiro Yamada wrote: > Bin, Marek, Hi, > Would you take a look at xHCI code? I was kinda hoping Bin would, oh well ... > Lots of xHCI code invokes BUG_ON()/BUG() > when the hardware access fails, > then halts the system completely. > This is not a software bug. > > > 2018-01-10 10:45 GMT+09:00 Masahiro Yamada <yamada.masahiro@socionext.com>: >> xhci_wait_for_event() is supposed to return a pointer to union xhci_trb, >> but it does not return anything at the end of the function. >> >> This relies on that the end of the function is unreachable due to BUG(). >> >> We are planning to make BUG() no-op for platforms with strong image size >> constraint. But BUG() should hang the system because it means an unrecoverable issue happened. Changing it to nop would cause a lot of weird misbehavior, so that is IMO a bad idea. Just change it to hang(), that should be present even on size-constrained systems. >> Doing so would cause compiler warning: >> >> drivers/usb/host/xhci-ring.c:475:1: warning: control reaches end of non-void function [-Wreturn-type] >> } >> ^ >> >> So, this function must return something. From the error message just >> above, ERR_PTR(-ETIMEDOUT) seems a good choice. This is an imported code from Linux, so you might want to check what they do/did there. >> The use of BUG() looks suspicious here in the first place; no response >> from hardware is not a bug. It should be treated as a normal error. >> So, this function must return an error pointer instead of BUG(), then >> the caller must handle it properly. >> >> I am not fixing the code because this is not the only place that stops >> the system. Just one failure of xHCI halts the system, here and there. >> >> I left a comment block, hoping somebody will take a look. >> >> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> >> --- >> >> Changes in v3: >> - newly added >> >> Changes in v2: None >> >> drivers/usb/host/xhci-ring.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c >> index 579e670..d780367 100644 >> --- a/drivers/usb/host/xhci-ring.c >> +++ b/drivers/usb/host/xhci-ring.c >> @@ -471,7 +471,14 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected) >> return NULL; >> >> printf("XHCI timeout on event type %d... cannot recover.\n", expected); >> + >> + /* >> + * CHECK: >> + * Is this software bug? Is this a good reason to halt the system? >> + */ >> BUG(); >> + >> + return ERR_PTR(-ETIMEDOUT); >> } >> >> /* >> -- >> 2.7.4 >> >> _______________________________________________ >> U-Boot mailing list >> U-Boot@lists.denx.de >> https://lists.denx.de/listinfo/u-boot > > >
Hi Marek, 2018-01-28 21:43 GMT+09:00 Marek Vasut <marex@denx.de>: > On 01/28/2018 11:18 AM, Masahiro Yamada wrote: >> Bin, Marek, > > Hi, > >> Would you take a look at xHCI code? > > I was kinda hoping Bin would, oh well ... > >> Lots of xHCI code invokes BUG_ON()/BUG() >> when the hardware access fails, >> then halts the system completely. >> This is not a software bug. >> >> >> 2018-01-10 10:45 GMT+09:00 Masahiro Yamada <yamada.masahiro@socionext.com>: >>> xhci_wait_for_event() is supposed to return a pointer to union xhci_trb, >>> but it does not return anything at the end of the function. >>> >>> This relies on that the end of the function is unreachable due to BUG(). >>> >>> We are planning to make BUG() no-op for platforms with strong image size >>> constraint. > > But BUG() should hang the system because it means an unrecoverable issue > happened. Changing it to nop would cause a lot of weird misbehavior, so > that is IMO a bad idea. Just change it to hang(), that should be present > even on size-constrained systems. In my opinion, BUG_ON(!x) and assert(x) are equivalent. Both check software bugs that should never happen (but useful for debugging). When releasing software, we can turn them into no-op. Actually, assert() works like that in user-space programs. (assert() is no-op if NDEBUG is defined) >>> Doing so would cause compiler warning: >>> >>> drivers/usb/host/xhci-ring.c:475:1: warning: control reaches end of non-void function [-Wreturn-type] >>> } >>> ^ >>> >>> So, this function must return something. From the error message just >>> above, ERR_PTR(-ETIMEDOUT) seems a good choice. > > This is an imported code from Linux, so you might want to check what > they do/did there. > I grepped xhci_wait_for_event in the kernel source, but did not get any hit.
On 01/28/2018 03:30 PM, Masahiro Yamada wrote: > Hi Marek, > > 2018-01-28 21:43 GMT+09:00 Marek Vasut <marex@denx.de>: >> On 01/28/2018 11:18 AM, Masahiro Yamada wrote: >>> Bin, Marek, >> >> Hi, >> >>> Would you take a look at xHCI code? >> >> I was kinda hoping Bin would, oh well ... >> >>> Lots of xHCI code invokes BUG_ON()/BUG() >>> when the hardware access fails, >>> then halts the system completely. >>> This is not a software bug. >>> >>> >>> 2018-01-10 10:45 GMT+09:00 Masahiro Yamada <yamada.masahiro@socionext.com>: >>>> xhci_wait_for_event() is supposed to return a pointer to union xhci_trb, >>>> but it does not return anything at the end of the function. >>>> >>>> This relies on that the end of the function is unreachable due to BUG(). >>>> >>>> We are planning to make BUG() no-op for platforms with strong image size >>>> constraint. >> >> But BUG() should hang the system because it means an unrecoverable issue >> happened. Changing it to nop would cause a lot of weird misbehavior, so >> that is IMO a bad idea. Just change it to hang(), that should be present >> even on size-constrained systems. > > > In my opinion, BUG_ON(!x) and assert(x) are equivalent. > > Both check software bugs that should never happen > (but useful for debugging). > > > When releasing software, we can turn them into no-op. > > Actually, assert() works like that in user-space programs. > (assert() is no-op if NDEBUG is defined) Except in kernel, BUG() means something really bad happened and is not optimized away, ever.
Hi Masahiro, On Sun, Jan 28, 2018 at 6:18 PM, Masahiro Yamada <yamada.masahiro@socionext.com> wrote: > Bin, Marek, > > Would you take a look at xHCI code? > Sorry I missed this before. > Lots of xHCI code invokes BUG_ON()/BUG() > when the hardware access fails, > then halts the system completely. > This is not a software bug. Given xHCI codes were originally from Linux where BUG_ON()/BUG() was once there but now disappears, I agree let's do it to clear the compiler warnings for now. Something needs to be done in the U-Boot xHCI codes to handle the error path properly. Regards, Bin
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 579e670..d780367 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -471,7 +471,14 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected) return NULL; printf("XHCI timeout on event type %d... cannot recover.\n", expected); + + /* + * CHECK: + * Is this software bug? Is this a good reason to halt the system? + */ BUG(); + + return ERR_PTR(-ETIMEDOUT); } /*
xhci_wait_for_event() is supposed to return a pointer to union xhci_trb, but it does not return anything at the end of the function. This relies on that the end of the function is unreachable due to BUG(). We are planning to make BUG() no-op for platforms with strong image size constraint. Doing so would cause compiler warning: drivers/usb/host/xhci-ring.c:475:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ So, this function must return something. From the error message just above, ERR_PTR(-ETIMEDOUT) seems a good choice. The use of BUG() looks suspicious here in the first place; no response from hardware is not a bug. It should be treated as a normal error. So, this function must return an error pointer instead of BUG(), then the caller must handle it properly. I am not fixing the code because this is not the only place that stops the system. Just one failure of xHCI halts the system, here and there. I left a comment block, hoping somebody will take a look. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> --- Changes in v3: - newly added Changes in v2: None drivers/usb/host/xhci-ring.c | 7 +++++++ 1 file changed, 7 insertions(+)