Message ID | 20250328104930.2179123-1-fisaksen@baylibre.com |
---|---|
State | Superseded |
Headers | show |
Series | [v2] usb: dwc3: gadget: check that event count does not exceed event buffer length | expand |
On 4/1/2025 5:38 PM, Frode Isaksen wrote: > On 4/1/25 7:43 AM, Krishna Kurapati wrote: >> >> >> On 3/28/2025 4:14 PM, Frode Isaksen wrote: >>> From: Frode Isaksen <frode@meta.com> >>> >>> The event count is read from register DWC3_GEVNTCOUNT. >>> There is a check for the count being zero, but not for exceeding the >>> event buffer length. >>> Check that event count does not exceed event buffer length, >>> avoiding an out-of-bounds access when memcpy'ing the event. >>> Crash log: >>> Unable to handle kernel paging request at virtual address >>> ffffffc0129be000 >>> pc : __memcpy+0x114/0x180 >>> lr : dwc3_check_event_buf+0xec/0x348 >>> x3 : 0000000000000030 x2 : 000000000000dfc4 >>> x1 : ffffffc0129be000 x0 : ffffff87aad60080 >>> Call trace: >>> __memcpy+0x114/0x180 >>> dwc3_interrupt+0x24/0x34 >>> >>> Signed-off-by: Frode Isaksen <frode@meta.com> >>> Fixes: ebbb2d59398f ("usb: dwc3: gadget: use evt->cache for >>> processing events") >>> Cc: stable@vger.kernel.org >>> --- >>> v1 -> v2: Added Fixes and Cc tag. >>> >>> This bug was discovered, tested and fixed (no more crashes seen) on >>> Meta Quest 3 device. >>> Also tested on T.I. AM62x board. >>> >>> drivers/usb/dwc3/gadget.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>> index 63fef4a1a498..548e112167f3 100644 >>> --- a/drivers/usb/dwc3/gadget.c >>> +++ b/drivers/usb/dwc3/gadget.c >>> @@ -4564,7 +4564,7 @@ static irqreturn_t dwc3_check_event_buf(struct >>> dwc3_event_buffer *evt) >>> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); >>> count &= DWC3_GEVNTCOUNT_MASK; >>> - if (!count) >>> + if (!count || count > evt->length) >>> return IRQ_NONE; >>> evt->count = count; >> >> >> I did see this issue previously ([1] on 5.10) on SAR2130 (upstreamed >> recently). Can you help check if the issue is same on your end if you >> can reproduce it easily. Thinh also provided some debug pointers to >> check suspecting it to be a HW issue. > > Seems to be exactly the same issue, and your fix looks OK as well. I'm > happy to abandon my patch and let yo provide the fix. > NAK. I tried to skip copying data beyond 4K which is not the right approach. Thinh was tending more towards your line of code changes. So your code looks fine, but an error log indicating the presence of this issue might be helpful. > Note that I am not able to reproduce this locally and it happens very > seldom. > It was very hard to reproduce this issue. Only two instances reported on SAR2130 on my end. > Where can I find the upstream'ed version ? > The upstreamed version I was referring to was that SAR2130 DT is present on open-source. Regards, Krishna, >> >> As per the comments from Thinh, he suggested to add a error log as >> well when this happens [2]. >> >> [1]: >> https://lore.kernel.org/all/20230521100330.22478-1-quic_kriskura@quicinc.com/ >> >> [2]: >> https://lore.kernel.org/all/20230525001822.ane3zcyyifj2kuwx@synopsys.com/ >> >> Regards, >> Krishna, > >
Hi Frode, On Tue, Apr 01, 2025, Krishna Kurapati wrote: > > > On 4/1/2025 5:38 PM, Frode Isaksen wrote: > > On 4/1/25 7:43 AM, Krishna Kurapati wrote: > > > > > > > > > On 3/28/2025 4:14 PM, Frode Isaksen wrote: > > > > From: Frode Isaksen <frode@meta.com> > > > > > > > > The event count is read from register DWC3_GEVNTCOUNT. > > > > There is a check for the count being zero, but not for exceeding the > > > > event buffer length. > > > > Check that event count does not exceed event buffer length, > > > > avoiding an out-of-bounds access when memcpy'ing the event. > > > > Crash log: > > > > Unable to handle kernel paging request at virtual address > > > > ffffffc0129be000 > > > > pc : __memcpy+0x114/0x180 > > > > lr : dwc3_check_event_buf+0xec/0x348 > > > > x3 : 0000000000000030 x2 : 000000000000dfc4 > > > > x1 : ffffffc0129be000 x0 : ffffff87aad60080 > > > > Call trace: > > > > __memcpy+0x114/0x180 > > > > dwc3_interrupt+0x24/0x34 > > > > > > > > Signed-off-by: Frode Isaksen <frode@meta.com> > > > > Fixes: ebbb2d59398f ("usb: dwc3: gadget: use evt->cache for > > > > processing events") > > > > Cc: stable@vger.kernel.org > > > > --- > > > > v1 -> v2: Added Fixes and Cc tag. > > > > > > > > This bug was discovered, tested and fixed (no more crashes seen) > > > > on Meta Quest 3 device. > > > > Also tested on T.I. AM62x board. > > > > > > > > drivers/usb/dwc3/gadget.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > > > > index 63fef4a1a498..548e112167f3 100644 > > > > --- a/drivers/usb/dwc3/gadget.c > > > > +++ b/drivers/usb/dwc3/gadget.c > > > > @@ -4564,7 +4564,7 @@ static irqreturn_t > > > > dwc3_check_event_buf(struct dwc3_event_buffer *evt) > > > > count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); > > > > count &= DWC3_GEVNTCOUNT_MASK; > > > > - if (!count) > > > > + if (!count || count > evt->length) > > > > return IRQ_NONE; > > > > evt->count = count; > > > > > > > > > I did see this issue previously ([1] on 5.10) on SAR2130 (upstreamed > > > recently). Can you help check if the issue is same on your end if > > > you can reproduce it easily. Thinh also provided some debug pointers > > > to check suspecting it to be a HW issue. > > > > Seems to be exactly the same issue, and your fix looks OK as well. I'm > > happy to abandon my patch and let yo provide the fix. > > > > NAK. I tried to skip copying data beyond 4K which is not the right approach. > Thinh was tending more towards your line of code changes. So your code looks > fine, but an error log indicating the presence of this issue might be > helpful. > > > Note that I am not able to reproduce this locally and it happens very > > seldom. > > > > It was very hard to reproduce this issue. Only two instances reported on > SAR2130 on my end. > I still wonder what's current behavior of the HW to properly respond here. If the device is dead, register read often returns all Fs, which may be the case you're seeing here. If so, we should properly prevent the driver from accessing the device and properly teardown the driver. If this is a momentary bleep/lost of power in the device, perhaps your change here is sufficient and the driver can continue to access the device. With the difficulty of reproducing this issue, can you confirm that the device still operates properly after this change? Thanks, Thinh
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 63fef4a1a498..548e112167f3 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -4564,7 +4564,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); count &= DWC3_GEVNTCOUNT_MASK; - if (!count) + if (!count || count > evt->length) return IRQ_NONE; evt->count = count;