Message ID | 20210830123704.221494-2-verdre@v0yd.nl |
---|---|
State | New |
Headers | show |
Series | mwifiex: Work around firmware bugs on 88W8897 chip | expand |
On Wed, Sep 1, 2021 at 5:02 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > On 8/30/21 2:49 PM, Andy Shevchenko wrote: > > On Mon, Aug 30, 2021 at 3:38 PM Jonas Dreßler <verdre@v0yd.nl> wrote: ... > > Thanks for all this work! > > > > Nevertheless, do we have any commits that may be a good candidate to > > be in the Fixes tag here? > I don't think there's any commit we could point to, given that the bug > is probably somewhere in the firmware code. Here (in the commit message)... > >> + /* Do a read-back, which makes the write non-posted, > ensuring the > >> + * completion before returning. > > > >> + * The firmware of the 88W8897 card is buggy and this avoids > crashes. > > > > Any firmware version reference? Would be nice to have just for the > > sake of record. > > Pretty sure the crash is present in every firmware that has been > released, I've tried most of them. FTR, the current firmware version is > 15.68.19.p21. ...and here it would be nice to state this version, so in the future we will have a right landmark. > >> + */ -- With Best Regards, Andy Shevchenko
On Wednesday 01 September 2021 16:01:54 Jonas Dreßler wrote: > On 8/30/21 2:49 PM, Andy Shevchenko wrote: > > On Mon, Aug 30, 2021 at 3:38 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > >> > >> On the 88W8897 card it's very important the TX ring write pointer is > >> updated correctly to its new value before setting the TX ready > >> interrupt, otherwise the firmware appears to crash (probably because > >> it's trying to DMA-read from the wrong place). > >> > >> Since PCI uses "posted writes" when writing to a register, it's not > >> guaranteed that a write will happen immediately. That means the pointer > >> might be outdated when setting the TX ready interrupt, leading to > >> firmware crashes especially when ASPM L1 and L1 substates are enabled > >> (because of the higher link latency, the write will probably take > >> longer). > >> > >> So fix those firmware crashes by always forcing non-posted writes. We do > >> that by simply reading back the register after writing it, just as a lot > >> of other drivers do. > >> > >> There are two reproducers that are fixed with this patch: > >> > >> 1) During rx/tx traffic and with ASPM L1 substates enabled (the enabled > >> substates are platform dependent), the firmware crashes and eventually a > >> command timeout appears in the logs. That crash is fixed by using a > >> non-posted write in mwifiex_pcie_send_data(). > >> > >> 2) When sending lots of commands to the card, waking it up from sleep in > >> very quick intervals, the firmware eventually crashes. That crash > >> appears to be fixed by some other non-posted write included here. > > > > Thanks for all this work! > > > > Nevertheless, do we have any commits that may be a good candidate to > > be in the Fixes tag here? > > > > I don't think there's any commit we could point to, given that the bug is > probably somewhere in the firmware code. Then please add Cc: stable@vger.kernel.org tag into commit message. Such bugfix is a good candidate for backporting into stable releases. > >> Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
On 01.09.2021 17:51, Pali Rohár wrote: > On Wednesday 01 September 2021 16:01:54 Jonas Dreßler wrote: >> On 8/30/21 2:49 PM, Andy Shevchenko wrote: >>> On Mon, Aug 30, 2021 at 3:38 PM Jonas Dreßler <verdre@v0yd.nl> wrote: >>>> >>>> On the 88W8897 card it's very important the TX ring write pointer is >>>> updated correctly to its new value before setting the TX ready >>>> interrupt, otherwise the firmware appears to crash (probably because >>>> it's trying to DMA-read from the wrong place). >>>> This sounds somehow like the typical case where you write DMA descriptors and then ring the doorbell. This normally requires a dma_wmb(). Maybe something like that is missing here? Reading back all register writes may cause a certain performance impact, and if it can be avoided we should try to avoid it. >>>> Since PCI uses "posted writes" when writing to a register, it's not >>>> guaranteed that a write will happen immediately. That means the pointer >>>> might be outdated when setting the TX ready interrupt, leading to >>>> firmware crashes especially when ASPM L1 and L1 substates are enabled >>>> (because of the higher link latency, the write will probably take >>>> longer). >>>> >>>> So fix those firmware crashes by always forcing non-posted writes. We do >>>> that by simply reading back the register after writing it, just as a lot >>>> of other drivers do. >>>> >>>> There are two reproducers that are fixed with this patch: >>>> >>>> 1) During rx/tx traffic and with ASPM L1 substates enabled (the enabled >>>> substates are platform dependent), the firmware crashes and eventually a >>>> command timeout appears in the logs. That crash is fixed by using a >>>> non-posted write in mwifiex_pcie_send_data(). >>>> >>>> 2) When sending lots of commands to the card, waking it up from sleep in >>>> very quick intervals, the firmware eventually crashes. That crash >>>> appears to be fixed by some other non-posted write included here. >>> >>> Thanks for all this work! >>> >>> Nevertheless, do we have any commits that may be a good candidate to >>> be in the Fixes tag here? >>> >> >> I don't think there's any commit we could point to, given that the bug is >> probably somewhere in the firmware code. > > Then please add Cc: stable@vger.kernel.org tag into commit message. Such > bugfix is a good candidate for backporting into stable releases. > >>>> Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
On Wed, 2021-09-01 at 18:51 +0200, Heiner Kallweit wrote: > On 01.09.2021 17:51, Pali Rohár wrote: > > On Wednesday 01 September 2021 16:01:54 Jonas Dreßler wrote: > > > On 8/30/21 2:49 PM, Andy Shevchenko wrote: > > > > On Mon, Aug 30, 2021 at 3:38 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > > > > > > > > > > On the 88W8897 card it's very important the TX ring write pointer is > > > > > updated correctly to its new value before setting the TX ready > > > > > interrupt, otherwise the firmware appears to crash (probably because > > > > > it's trying to DMA-read from the wrong place). > > > > > > > This sounds somehow like the typical case where you write DMA descriptors > and then ring the doorbell. This normally requires a dma_wmb(). > Maybe something like that is missing here? But it looks like this "TX ring write pointer" is actually the register? However, I would agree that doing it in mwifiex_write_reg() is possibly too big a hammer - could be done only for reg->tx_wrptr, not all the registers? Actually, can two writes actually cross on PCI? johannes
On Mon, Aug 30, 2021 at 5:37 AM Jonas Dreßler <verdre@v0yd.nl> wrote: > > On the 88W8897 card it's very important the TX ring write pointer is > updated correctly to its new value before setting the TX ready > interrupt, otherwise the firmware appears to crash (probably because > it's trying to DMA-read from the wrong place). > > Since PCI uses "posted writes" when writing to a register, it's not > guaranteed that a write will happen immediately. That means the pointer > might be outdated when setting the TX ready interrupt, leading to > firmware crashes especially when ASPM L1 and L1 substates are enabled > (because of the higher link latency, the write will probably take > longer). > > So fix those firmware crashes by always forcing non-posted writes. We do > that by simply reading back the register after writing it, just as a lot > of other drivers do. > > There are two reproducers that are fixed with this patch: > > 1) During rx/tx traffic and with ASPM L1 substates enabled (the enabled > substates are platform dependent), the firmware crashes and eventually a > command timeout appears in the logs. That crash is fixed by using a > non-posted write in mwifiex_pcie_send_data(). > > 2) When sending lots of commands to the card, waking it up from sleep in > very quick intervals, the firmware eventually crashes. That crash > appears to be fixed by some other non-posted write included here. > > Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> This might be good for many cases, but please read this commit: https://git.kernel.org/linus/062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 mwifiex: pcie: use posted write to wake up firmware It's very much intentional that this is a posted write in some cases. Without ensuring this doesn't regress, NAK from me. Brian
On Wed, Sep 1, 2021 at 11:25 PM Brian Norris <briannorris@chromium.org> wrote: > On Mon, Aug 30, 2021 at 5:37 AM Jonas Dreßler <verdre@v0yd.nl> wrote: ... > This might be good for many cases, but please read this commit: > > https://git.kernel.org/linus/062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 > mwifiex: pcie: use posted write to wake up firmware > > It's very much intentional that this is a posted write in some cases. > > Without ensuring this doesn't regress, NAK from me. Can you ensure that from Chrome / Google perspective, please?
On Wed, Sep 1, 2021 at 1:41 PM Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > > On Wed, Sep 1, 2021 at 11:25 PM Brian Norris <briannorris@chromium.org> wrote: > > On Mon, Aug 30, 2021 at 5:37 AM Jonas Dreßler <verdre@v0yd.nl> wrote: > > ... > > > This might be good for many cases, but please read this commit: > > > > https://git.kernel.org/linus/062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 > > mwifiex: pcie: use posted write to wake up firmware > > > > It's very much intentional that this is a posted write in some cases. > > > > Without ensuring this doesn't regress, NAK from me. > > Can you ensure that from Chrome / Google perspective, please? I mean, the context link should be pretty obvious. But sure: with this patch: (a) all my least-favorite audio issues to debug return (audio is especially latency-sensitive on the Rockchip RK3399 systems I have) (b) latency spikes to ~6ms: # trace-cmd record -p function_graph -O funcgraph-abstime -l mwifiex_pm_wakeup_card # trace-cmd report kworker/u13:0-199 [003] 348.987306: funcgraph_entry: # 6219.500 us | mwifiex_pm_wakeup_card(); kworker/u13:0-199 [003] 349.316312: funcgraph_entry: # 6267.625 us | mwifiex_pm_wakeup_card(); kworker/u13:3-4057 [001] 352.238530: funcgraph_entry: # 6184.250 us | mwifiex_pm_wakeup_card(); kworker/u13:0-199 [002] 356.626366: funcgraph_entry: # 6553.166 us | mwifiex_pm_wakeup_card(); kworker/u13:3-4057 [002] 356.709389: funcgraph_entry: # 6212.500 us | mwifiex_pm_wakeup_card(); kworker/u13:3-4057 [002] 356.847215: funcgraph_entry: # 6230.292 us | mwifiex_pm_wakeup_card(); kworker/u13:3-4057 [000] 356.897576: funcgraph_entry: # 6451.667 us | mwifiex_pm_wakeup_card(); kworker/u13:0-199 [004] 357.175025: funcgraph_entry: # 6204.042 us | mwifiex_pm_wakeup_card(); whereas it used to look more like: kworker/u13:1-173 [005] 212.230542: funcgraph_entry: 7.000 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [005] 213.886063: funcgraph_entry: 9.334 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [002] 214.473273: funcgraph_entry: + 11.375 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [005] 214.530705: funcgraph_entry: 5.542 us | mwifiex_pm_wakeup_card(); kworker/u13:1-173 [002] 215.050168: funcgraph_entry: + 13.125 us | mwifiex_pm_wakeup_card(); kworker/u13:1-173 [002] 215.106492: funcgraph_entry: + 11.959 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [005] 215.484807: funcgraph_entry: 8.459 us | mwifiex_pm_wakeup_card(); kworker/u13:1-173 [003] 215.515238: funcgraph_entry: + 15.166 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [001] 217.175691: funcgraph_entry: + 11.083 us | mwifiex_pm_wakeup_card(); I think it would be enough to only modify mwifiex_pm_wakeup_card() to retain the posted version, but I'm not sure how that fits in with the rest of the author's work. I suppose I can give a later version a run if needed. Brian
Apologies for the brain-dead mailer. I forget that I should only reply via web when I _want_ text wrapping: On Wed, Sep 01, 2021 at 02:04:04PM -0700, Brian Norris wrote: > (b) latency spikes to ~6ms: > # trace-cmd record -p function_graph -O funcgraph-abstime -l > mwifiex_pm_wakeup_card > # trace-cmd report > kworker/u13:0-199 [003] 348.987306: funcgraph_entry: # > 6219.500 us | mwifiex_pm_wakeup_card(); > kworker/u13:0-199 [003] 349.316312: funcgraph_entry: # > 6267.625 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-4057 [001] 352.238530: funcgraph_entry: # > 6184.250 us | mwifiex_pm_wakeup_card(); > kworker/u13:0-199 [002] 356.626366: funcgraph_entry: # > 6553.166 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-4057 [002] 356.709389: funcgraph_entry: # > 6212.500 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-4057 [002] 356.847215: funcgraph_entry: # > 6230.292 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-4057 [000] 356.897576: funcgraph_entry: # > 6451.667 us | mwifiex_pm_wakeup_card(); > kworker/u13:0-199 [004] 357.175025: funcgraph_entry: # > 6204.042 us | mwifiex_pm_wakeup_card(); > > whereas it used to look more like: > > kworker/u13:1-173 [005] 212.230542: funcgraph_entry: > 7.000 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [005] 213.886063: funcgraph_entry: > 9.334 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [002] 214.473273: funcgraph_entry: + > 11.375 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [005] 214.530705: funcgraph_entry: > 5.542 us | mwifiex_pm_wakeup_card(); > kworker/u13:1-173 [002] 215.050168: funcgraph_entry: + > 13.125 us | mwifiex_pm_wakeup_card(); > kworker/u13:1-173 [002] 215.106492: funcgraph_entry: + > 11.959 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [005] 215.484807: funcgraph_entry: > 8.459 us | mwifiex_pm_wakeup_card(); > kworker/u13:1-173 [003] 215.515238: funcgraph_entry: + > 15.166 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [001] 217.175691: funcgraph_entry: + > 11.083 us | mwifiex_pm_wakeup_card(); That should read: # trace-cmd record -p function_graph -O funcgraph-abstime -l mwifiex_pm_wakeup_card # trace-cmd report kworker/u13:0-199 [003] 348.987306: funcgraph_entry: # 6219.500 us | mwifiex_pm_wakeup_card(); kworker/u13:0-199 [003] 349.316312: funcgraph_entry: # 6267.625 us | mwifiex_pm_wakeup_card(); kworker/u13:3-4057 [001] 352.238530: funcgraph_entry: # 6184.250 us | mwifiex_pm_wakeup_card(); kworker/u13:0-199 [002] 356.626366: funcgraph_entry: # 6553.166 us | mwifiex_pm_wakeup_card(); kworker/u13:3-4057 [002] 356.709389: funcgraph_entry: # 6212.500 us | mwifiex_pm_wakeup_card(); kworker/u13:3-4057 [002] 356.847215: funcgraph_entry: # 6230.292 us | mwifiex_pm_wakeup_card(); kworker/u13:3-4057 [000] 356.897576: funcgraph_entry: # 6451.667 us | mwifiex_pm_wakeup_card(); kworker/u13:0-199 [004] 357.175025: funcgraph_entry: # 6204.042 us | mwifiex_pm_wakeup_card(); vs. kworker/u13:1-173 [005] 212.230542: funcgraph_entry: 7.000 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [005] 213.886063: funcgraph_entry: 9.334 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [002] 214.473273: funcgraph_entry: + 11.375 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [005] 214.530705: funcgraph_entry: 5.542 us | mwifiex_pm_wakeup_card(); kworker/u13:1-173 [002] 215.050168: funcgraph_entry: + 13.125 us | mwifiex_pm_wakeup_card(); kworker/u13:1-173 [002] 215.106492: funcgraph_entry: + 11.959 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [005] 215.484807: funcgraph_entry: 8.459 us | mwifiex_pm_wakeup_card(); kworker/u13:1-173 [003] 215.515238: funcgraph_entry: + 15.166 us | mwifiex_pm_wakeup_card(); kworker/u13:3-1768 [001] 217.175691: funcgraph_entry: + 11.083 us | mwifiex_pm_wakeup_card(); Brian
On Wed, Sep 01, 2021 at 07:07:58PM +0200, Johannes Berg wrote: > On Wed, 2021-09-01 at 18:51 +0200, Heiner Kallweit wrote: > > On 01.09.2021 17:51, Pali Rohár wrote: > > > On Wednesday 01 September 2021 16:01:54 Jonas Dreßler wrote: > > > > On 8/30/21 2:49 PM, Andy Shevchenko wrote: > > > > > On Mon, Aug 30, 2021 at 3:38 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > > > > > > > > > > > > On the 88W8897 card it's very important the TX ring write pointer is > > > > > > updated correctly to its new value before setting the TX ready > > > > > > interrupt, otherwise the firmware appears to crash (probably because > > > > > > it's trying to DMA-read from the wrong place). > > > > > > > > > > This sounds somehow like the typical case where you write DMA descriptors > > and then ring the doorbell. This normally requires a dma_wmb(). > > Maybe something like that is missing here? > > But it looks like this "TX ring write pointer" is actually the register? > > However, I would agree that doing it in mwifiex_write_reg() is possibly > too big a hammer - could be done only for reg->tx_wrptr, not all the > registers? > > Actually, can two writes actually cross on PCI? Per PCIe r5.0, sec 2.4.1, A2a A Posted Request must not pass another Posted Request unless A2b applies. A2b A Posted Request with RO Set is permitted to pass another Posted Request. A Posted Request with IDO Set is permitted to pass another Posted Request if the two Requester IDs are different or if both Requests contain a PASID TLP Prefix and the two PASID values are different. A few drivers enable RO (Relaxed Ordering) for their devices, which means the *device* is permitted to set the RO bit in transactions it initiates. BUt IIUC we're talking about MMIO writes initiated by a CPU, and they won't have the RO bit set unless the Root Port has Relaxed Ordering enabled, and Linux generally does not enable that. So A2a should apply, and writes should be ordered on PCI. There are a few wrinkles that I worry about: d1e714db8129 ("mtip32xx: Fix ERO and NoSnoop values in PCIe upstream on AMD systems") [1] turns off RO for some AMD Root Ports, which makes me think BIOS might be enabling RO in these Root Ports. c56d4450eb68 ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum") [2] turns off RO for all Root Ports leading to Chelsio T5 devices, which again makes me think there's firmware that enables RO in Root Ports. Follow-up [3]. 77ffc1465cec ("tegra: add PCI Express support") [4] (see tegra_pcie_relax_enable()) enables RO for Tegra Root Ports due to some hardware issue. I don't whether these Root Ports every actually *set* RO in the PCIe transactions they generate. Follow-up [5]. These concern me because I don't think we have a way for drivers to specify whether their writes should use strong ordering or relaxed ordering, and I think they depend on strong ordering. If Root Ports have RO enabled, I think we are at risk, so I suspect Linux should actively *disable* RO for Root Ports. [1] https://git.kernel.org/linus/d1e714db8129 [2] https://git.kernel.org/linus/c56d4450eb68 [3] https://lore.kernel.org/r/20210901222353.GA251391@bjorn-Precision-5520 [4] https://git.kernel.org/linus/77ffc1465cec [5] https://lore.kernel.org/r/20210901204045.GA236987@bjorn-Precision-5520
[+cc Keith] On Wed, Sep 01, 2021 at 05:41:56PM -0500, Bjorn Helgaas wrote: > On Wed, Sep 01, 2021 at 07:07:58PM +0200, Johannes Berg wrote: > > On Wed, 2021-09-01 at 18:51 +0200, Heiner Kallweit wrote: > > > On 01.09.2021 17:51, Pali Rohár wrote: > > > > On Wednesday 01 September 2021 16:01:54 Jonas Dreßler wrote: > > > > > On 8/30/21 2:49 PM, Andy Shevchenko wrote: > > > > > > On Mon, Aug 30, 2021 at 3:38 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > > > > > > > > > > > > > > On the 88W8897 card it's very important the TX ring write pointer is > > > > > > > updated correctly to its new value before setting the TX ready > > > > > > > interrupt, otherwise the firmware appears to crash (probably because > > > > > > > it's trying to DMA-read from the wrong place). > > > > > > > > > > > > > This sounds somehow like the typical case where you write DMA descriptors > > > and then ring the doorbell. This normally requires a dma_wmb(). > > > Maybe something like that is missing here? > > > > But it looks like this "TX ring write pointer" is actually the register? > > > > However, I would agree that doing it in mwifiex_write_reg() is possibly > > too big a hammer - could be done only for reg->tx_wrptr, not all the > > registers? > > > > Actually, can two writes actually cross on PCI? > > Per PCIe r5.0, sec 2.4.1, > > A2a A Posted Request must not pass another Posted Request unless A2b > applies. > > A2b A Posted Request with RO Set is permitted to pass another > Posted Request. A Posted Request with IDO Set is permitted to > pass another Posted Request if the two Requester IDs are > different or if both Requests contain a PASID TLP Prefix and > the two PASID values are different. > > A few drivers enable RO (Relaxed Ordering) for their devices, which > means the *device* is permitted to set the RO bit in transactions it > initiates. > > BUt IIUC we're talking about MMIO writes initiated by a CPU, and they > won't have the RO bit set unless the Root Port has Relaxed Ordering > enabled, and Linux generally does not enable that. So A2a should > apply, and writes should be ordered on PCI. > > There are a few wrinkles that I worry about: > > d1e714db8129 ("mtip32xx: Fix ERO and NoSnoop values in PCIe upstream > on AMD systems") [1] turns off RO for some AMD Root Ports, which > makes me think BIOS might be enabling RO in these Root Ports. > > c56d4450eb68 ("PCI: Turn off Request Attributes to avoid Chelsio T5 > Completion erratum") [2] turns off RO for all Root Ports leading to > Chelsio T5 devices, which again makes me think there's firmware that > enables RO in Root Ports. Follow-up [3]. > > 77ffc1465cec ("tegra: add PCI Express support") [4] (see > tegra_pcie_relax_enable()) enables RO for Tegra Root Ports due to > some hardware issue. I don't whether these Root Ports every > actually *set* RO in the PCIe transactions they generate. Follow-up > [5]. > > These concern me because I don't think we have a way for drivers to > specify whether their writes should use strong ordering or relaxed > ordering, and I think they depend on strong ordering. Sorry, I was completely wrong about this part. Keith Busch reminded me that writel_relaxed() is an obvious way for drivers to specify that they don't need strong ordering. I apologize for the distraction. > If Root Ports have RO enabled, I think we are at risk, so I suspect > Linux should actively *disable* RO for Root Ports. > > [1] https://git.kernel.org/linus/d1e714db8129 > [2] https://git.kernel.org/linus/c56d4450eb68 > [3] https://lore.kernel.org/r/20210901222353.GA251391@bjorn-Precision-5520 > [4] https://git.kernel.org/linus/77ffc1465cec > [5] https://lore.kernel.org/r/20210901204045.GA236987@bjorn-Precision-5520
On 9/1/21 11:07 PM, Brian Norris wrote: > Apologies for the brain-dead mailer. I forget that I should only reply > via web when I _want_ text wrapping: > > On Wed, Sep 01, 2021 at 02:04:04PM -0700, Brian Norris wrote: >> (b) latency spikes to ~6ms: >> # trace-cmd record -p function_graph -O funcgraph-abstime -l >> mwifiex_pm_wakeup_card >> # trace-cmd report >> kworker/u13:0-199 [003] 348.987306: funcgraph_entry: # >> 6219.500 us | mwifiex_pm_wakeup_card(); >> kworker/u13:0-199 [003] 349.316312: funcgraph_entry: # >> 6267.625 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-4057 [001] 352.238530: funcgraph_entry: # >> 6184.250 us | mwifiex_pm_wakeup_card(); >> kworker/u13:0-199 [002] 356.626366: funcgraph_entry: # >> 6553.166 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-4057 [002] 356.709389: funcgraph_entry: # >> 6212.500 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-4057 [002] 356.847215: funcgraph_entry: # >> 6230.292 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-4057 [000] 356.897576: funcgraph_entry: # >> 6451.667 us | mwifiex_pm_wakeup_card(); >> kworker/u13:0-199 [004] 357.175025: funcgraph_entry: # >> 6204.042 us | mwifiex_pm_wakeup_card(); >> >> whereas it used to look more like: >> >> kworker/u13:1-173 [005] 212.230542: funcgraph_entry: >> 7.000 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-1768 [005] 213.886063: funcgraph_entry: >> 9.334 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-1768 [002] 214.473273: funcgraph_entry: + >> 11.375 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-1768 [005] 214.530705: funcgraph_entry: >> 5.542 us | mwifiex_pm_wakeup_card(); >> kworker/u13:1-173 [002] 215.050168: funcgraph_entry: + >> 13.125 us | mwifiex_pm_wakeup_card(); >> kworker/u13:1-173 [002] 215.106492: funcgraph_entry: + >> 11.959 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-1768 [005] 215.484807: funcgraph_entry: >> 8.459 us | mwifiex_pm_wakeup_card(); >> kworker/u13:1-173 [003] 215.515238: funcgraph_entry: + >> 15.166 us | mwifiex_pm_wakeup_card(); >> kworker/u13:3-1768 [001] 217.175691: funcgraph_entry: + >> 11.083 us | mwifiex_pm_wakeup_card(); > > That should read: > > # trace-cmd record -p function_graph -O funcgraph-abstime -l mwifiex_pm_wakeup_card > # trace-cmd report > kworker/u13:0-199 [003] 348.987306: funcgraph_entry: # 6219.500 us | mwifiex_pm_wakeup_card(); > kworker/u13:0-199 [003] 349.316312: funcgraph_entry: # 6267.625 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-4057 [001] 352.238530: funcgraph_entry: # 6184.250 us | mwifiex_pm_wakeup_card(); > kworker/u13:0-199 [002] 356.626366: funcgraph_entry: # 6553.166 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-4057 [002] 356.709389: funcgraph_entry: # 6212.500 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-4057 [002] 356.847215: funcgraph_entry: # 6230.292 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-4057 [000] 356.897576: funcgraph_entry: # 6451.667 us | mwifiex_pm_wakeup_card(); > kworker/u13:0-199 [004] 357.175025: funcgraph_entry: # 6204.042 us | mwifiex_pm_wakeup_card(); > > vs. > > kworker/u13:1-173 [005] 212.230542: funcgraph_entry: 7.000 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [005] 213.886063: funcgraph_entry: 9.334 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [002] 214.473273: funcgraph_entry: + 11.375 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [005] 214.530705: funcgraph_entry: 5.542 us | mwifiex_pm_wakeup_card(); > kworker/u13:1-173 [002] 215.050168: funcgraph_entry: + 13.125 us | mwifiex_pm_wakeup_card(); > kworker/u13:1-173 [002] 215.106492: funcgraph_entry: + 11.959 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [005] 215.484807: funcgraph_entry: 8.459 us | mwifiex_pm_wakeup_card(); > kworker/u13:1-173 [003] 215.515238: funcgraph_entry: + 15.166 us | mwifiex_pm_wakeup_card(); > kworker/u13:3-1768 [001] 217.175691: funcgraph_entry: + 11.083 us | mwifiex_pm_wakeup_card(); > > Brian > Thanks for the pointer to that commit Brian, it turns out this is actually the change that causes the "Firmware wakeup failed" issues that I'm trying to fix with the second patch here. Also my approach is a lot messier than just reverting 062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 and also appears to be blocking even longer... Does anyone have an idea what could be the reason for the posted write not going through, or could that also be a potential firmware bug in the chip? Jonas
On Sat, Sep 18, 2021 at 12:37 AM Jonas Dreßler <verdre@v0yd.nl> wrote: > Thanks for the pointer to that commit Brian, it turns out this is > actually the change that causes the "Firmware wakeup failed" issues that > I'm trying to fix with the second patch here. Huh. That's interesting, although I guess it makes some sense given your theory of "dropped writes". FWIW, this strategy (post a single write, then wait for wakeup) is the same used by some other chips/drivers too (e.g., ath10k/pci), although in those cases card wakeup is much much faster. But if the bus was dropping writes somehow, those strategies would fail too. > Also my approach is a lot messier than just reverting > 062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 and also appears to be blocking > even longer... For the record, in case you're talking about my data ("blocking even longer"): I was only testing patch 1. Patch 2 isn't really relevant to my particular systems (Rockchip RK3399 + Marvell 8997/PCIe), because (a) I'm pretty sure my system isn't "dropping" any reads or writes (b) all my delay is in the read-back; the Rockchip PCIe bus is waiting indefinitely for the card to wake up, instead of timing out and reporting all-1's like many x86 systems appear to do (I've tested this). So, the 6ms delay is entirely sitting in the ioread32(), not a delay loop. I haven't yet tried your version 2 (which avoids the blocking read to wake up; good!), but it sounds like in theory it could solve your problem while avoiding 6ms delays for me. I intend to test your v2 this week. > Does anyone have an idea what could be the reason for the posted write > not going through, or could that also be a potential firmware bug in the > chip? I have no clue about that. That does sound downright horrible, but so are many things when dealing with this family of hardware/firmware. I'm not sure how to prove out whether this is a host bus problem, or an endpoint/firmware problem, other than perhaps trying the same module/firmware on another system, if that's possible. Anyway, to reiterate: I'm not fundamentally opposed to v2 (pending a test run here), even if it is a bit ugly and perhaps not 100% understood. Brian
On 9/22/21 2:50 PM, Jonas Dreßler wrote: > On 9/20/21 7:48 PM, Brian Norris wrote: >> On Sat, Sep 18, 2021 at 12:37 AM Jonas Dreßler <verdre@v0yd.nl> wrote: >>> Thanks for the pointer to that commit Brian, it turns out this is >>> actually the change that causes the "Firmware wakeup failed" issues that >>> I'm trying to fix with the second patch here. >> >> Huh. That's interesting, although I guess it makes some sense given >> your theory of "dropped writes". FWIW, this strategy (post a single >> write, then wait for wakeup) is the same used by some other >> chips/drivers too (e.g., ath10k/pci), although in those cases card >> wakeup is much much faster. But if the bus was dropping writes >> somehow, those strategies would fail too. >> >>> Also my approach is a lot messier than just reverting >>> 062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 and also appears to be blocking >>> even longer... >> >> For the record, in case you're talking about my data ("blocking even >> longer"): I was only testing patch 1. Patch 2 isn't really relevant to >> my particular systems (Rockchip RK3399 + Marvell 8997/PCIe), because >> (a) I'm pretty sure my system isn't "dropping" any reads or writes >> (b) all my delay is in the read-back; the Rockchip PCIe bus is waiting >> indefinitely for the card to wake up, instead of timing out and >> reporting all-1's like many x86 systems appear to do (I've tested >> this). >> >> So, the 6ms delay is entirely sitting in the ioread32(), not a delay >> loop. >> >> I haven't yet tried your version 2 (which avoids the blocking read to >> wake up; good!), but it sounds like in theory it could solve your >> problem while avoiding 6ms delays for me. I intend to test your v2 >> this week. >> > > With "blocking even longer" I meant that (on my system) the delay-loop > blocks even longer than waking up the card via mwifiex_read_reg() (both > are in the orders of milliseconds). And given that in certain cases the > card wakeup (or a write getting through to the card, I have no idea) can > take extremely long, I'd feel more confident going with the > mwifiex_read_reg() method to wake up the card. > > Anyway, you know what's even weirder with all this: I've been testing > the first commit of patch v2 (so just the single read-back instead of > the big hammer) together with 062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 > reverted for a good week now and haven't seen any wakeup failure yet. > Otoh I'm fairly sure the big hammer with reading back every write wasn't > enough to fix the wakeup failures, otherwise I wouldn't even have > started working on the second commit. > > So that would mean there's a difference between writing and then reading > back vs only reading to wake up the card: Only the latter fixes the > wakeup failures. > >>> Does anyone have an idea what could be the reason for the posted write >>> not going through, or could that also be a potential firmware bug in the >>> chip? >> >> I have no clue about that. That does sound downright horrible, but so >> are many things when dealing with this family of hardware/firmware. >> I'm not sure how to prove out whether this is a host bus problem, or >> an endpoint/firmware problem, other than perhaps trying the same >> module/firmware on another system, if that's possible. >> >> Anyway, to reiterate: I'm not fundamentally opposed to v2 (pending a >> test run here), even if it is a bit ugly and perhaps not 100% >> understood. >> > > I'm not 100% sure about all this yet, I think I'm gonna try to confirm > my older findings once again now and then we'll see. FTR, would you be > fine with using the mwifiex_read_reg() method to wake up the card and > somehow quirking your system to use write_reg()? > >> Brian >> > Okay, so I finally managed to find my exact reproducer for the bug again: 1) Make sure wifi powersaving is enabled (iw dev wlp1s0 set power_save on) 2) Connect to any wifi network (makes firmware go into wifi powersaving mode, not deep sleep) 3) Make sure bluetooth is turned off (to ensure the firmware actually enters powersave mode and doesn't keep the radio active doing bluetooth stuff) 4) To confirm that wifi powersaving is entered ping a device on the LAN, pings should be a few ms higher than without powersaving 5) Run "while true; do iwconfig; sleep 0.0001; done", this wakes and suspends the firmware extremely often 6) Wait until things explode, for me it consistently takes <5 minutes Using this reproducer I was able to clear things up a bit: - There still are wakeup failures when using (only) mwifiex_read_reg() to wake the card, so there's no weird difference between waking up using read vs write+read-back - Just calling mwifiex_write_reg() once and then blocking until the card wakes up using my delay-loop doesn't fix the issue, it's actually writing multiple times that fixes the issue These observations sound a lot like writes (and even reads) are actually being dropped, don't they?
On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > > On 9/22/21 2:50 PM, Jonas Dreßler wrote: > > On 9/20/21 7:48 PM, Brian Norris wrote: > >> On Sat, Sep 18, 2021 at 12:37 AM Jonas Dreßler <verdre@v0yd.nl> wrote: > >>> Thanks for the pointer to that commit Brian, it turns out this is > >>> actually the change that causes the "Firmware wakeup failed" issues that > >>> I'm trying to fix with the second patch here. > >> > >> Huh. That's interesting, although I guess it makes some sense given > >> your theory of "dropped writes". FWIW, this strategy (post a single > >> write, then wait for wakeup) is the same used by some other > >> chips/drivers too (e.g., ath10k/pci), although in those cases card > >> wakeup is much much faster. But if the bus was dropping writes > >> somehow, those strategies would fail too. > >> > >>> Also my approach is a lot messier than just reverting > >>> 062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 and also appears to be blocking > >>> even longer... > >> > >> For the record, in case you're talking about my data ("blocking even > >> longer"): I was only testing patch 1. Patch 2 isn't really relevant to > >> my particular systems (Rockchip RK3399 + Marvell 8997/PCIe), because > >> (a) I'm pretty sure my system isn't "dropping" any reads or writes > >> (b) all my delay is in the read-back; the Rockchip PCIe bus is waiting > >> indefinitely for the card to wake up, instead of timing out and > >> reporting all-1's like many x86 systems appear to do (I've tested > >> this). > >> > >> So, the 6ms delay is entirely sitting in the ioread32(), not a delay > >> loop. > >> > >> I haven't yet tried your version 2 (which avoids the blocking read to > >> wake up; good!), but it sounds like in theory it could solve your > >> problem while avoiding 6ms delays for me. I intend to test your v2 > >> this week. > >> > > > > With "blocking even longer" I meant that (on my system) the delay-loop > > blocks even longer than waking up the card via mwifiex_read_reg() (both > > are in the orders of milliseconds). And given that in certain cases the > > card wakeup (or a write getting through to the card, I have no idea) can > > take extremely long, I'd feel more confident going with the > > mwifiex_read_reg() method to wake up the card. > > > > Anyway, you know what's even weirder with all this: I've been testing > > the first commit of patch v2 (so just the single read-back instead of > > the big hammer) together with 062e008a6e83e7c4da7df0a9c6aefdbc849e2bb3 > > reverted for a good week now and haven't seen any wakeup failure yet. > > Otoh I'm fairly sure the big hammer with reading back every write wasn't > > enough to fix the wakeup failures, otherwise I wouldn't even have > > started working on the second commit. > > > > So that would mean there's a difference between writing and then reading > > back vs only reading to wake up the card: Only the latter fixes the > > wakeup failures. > > > >>> Does anyone have an idea what could be the reason for the posted write > >>> not going through, or could that also be a potential firmware bug in the > >>> chip? > >> > >> I have no clue about that. That does sound downright horrible, but so > >> are many things when dealing with this family of hardware/firmware. > >> I'm not sure how to prove out whether this is a host bus problem, or > >> an endpoint/firmware problem, other than perhaps trying the same > >> module/firmware on another system, if that's possible. > >> > >> Anyway, to reiterate: I'm not fundamentally opposed to v2 (pending a > >> test run here), even if it is a bit ugly and perhaps not 100% > >> understood. > >> > > > > I'm not 100% sure about all this yet, I think I'm gonna try to confirm > > my older findings once again now and then we'll see. FTR, would you be > > fine with using the mwifiex_read_reg() method to wake up the card and > > somehow quirking your system to use write_reg()? > > > >> Brian > >> > > > > Okay, so I finally managed to find my exact reproducer for the bug again: > > 1) Make sure wifi powersaving is enabled (iw dev wlp1s0 set power_save on) > 2) Connect to any wifi network (makes firmware go into wifi powersaving > mode, not deep sleep) > 3) Make sure bluetooth is turned off (to ensure the firmware actually > enters powersave mode and doesn't keep the radio active doing bluetooth > stuff) > 4) To confirm that wifi powersaving is entered ping a device on the LAN, > pings should be a few ms higher than without powersaving > 5) Run "while true; do iwconfig; sleep 0.0001; done", this wakes and > suspends the firmware extremely often > 6) Wait until things explode, for me it consistently takes <5 minutes > > Using this reproducer I was able to clear things up a bit: > > - There still are wakeup failures when using (only) mwifiex_read_reg() > to wake the card, so there's no weird difference between waking up using > read vs write+read-back > > - Just calling mwifiex_write_reg() once and then blocking until the card > wakes up using my delay-loop doesn't fix the issue, it's actually > writing multiple times that fixes the issue > > These observations sound a lot like writes (and even reads) are actually > being dropped, don't they? -- With Best Regards, Andy Shevchenko
On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > On 9/22/21 2:50 PM, Jonas Dreßler wrote: ... > - Just calling mwifiex_write_reg() once and then blocking until the card > wakes up using my delay-loop doesn't fix the issue, it's actually > writing multiple times that fixes the issue > > These observations sound a lot like writes (and even reads) are actually > being dropped, don't they? It sounds like you're writing into a not ready (fully powered on) device. To check this, try to put a busy loop for reading and check the value till it gets 0. Something like unsigned int count = 1000; do { if (mwifiex_read_reg(...) == 0) break; } while (--count); -- With Best Regards, Andy Shevchenko
On Thursday 23 September 2021 22:41:30 Andy Shevchenko wrote: > On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > > On 9/22/21 2:50 PM, Jonas Dreßler wrote: > > ... > > > - Just calling mwifiex_write_reg() once and then blocking until the card > > wakes up using my delay-loop doesn't fix the issue, it's actually > > writing multiple times that fixes the issue > > > > These observations sound a lot like writes (and even reads) are actually > > being dropped, don't they? > > It sounds like you're writing into a not ready (fully powered on) device. This reminds me a discussion with Bjorn about CRS response returned after firmware crash / reset when device is not ready yet: https://lore.kernel.org/linux-pci/20210922164803.GA203171@bhelgaas/ Could not be this similar issue? You could check it via reading PCI_VENDOR_ID register from config space. And if it is not valid value then card is not really ready yet. > To check this, try to put a busy loop for reading and check the value > till it gets 0. > > Something like > > unsigned int count = 1000; > > do { > if (mwifiex_read_reg(...) == 0) > break; > } while (--count); > > > -- > With Best Regards, > Andy Shevchenko
On Thursday 30 September 2021 17:38:43 Jonas Dreßler wrote: > On 9/23/21 10:22 PM, Pali Rohár wrote: > > On Thursday 23 September 2021 22:41:30 Andy Shevchenko wrote: > > > On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > > > > On 9/22/21 2:50 PM, Jonas Dreßler wrote: > > > > > > ... > > > > > > > - Just calling mwifiex_write_reg() once and then blocking until the card > > > > wakes up using my delay-loop doesn't fix the issue, it's actually > > > > writing multiple times that fixes the issue > > > > > > > > These observations sound a lot like writes (and even reads) are actually > > > > being dropped, don't they? > > > > > > It sounds like you're writing into a not ready (fully powered on) device. > > > > This reminds me a discussion with Bjorn about CRS response returned > > after firmware crash / reset when device is not ready yet: > > https://lore.kernel.org/linux-pci/20210922164803.GA203171@bhelgaas/ > > > > Could not be this similar issue? You could check it via reading > > PCI_VENDOR_ID register from config space. And if it is not valid value > > then card is not really ready yet. > > > > > To check this, try to put a busy loop for reading and check the value > > > till it gets 0. > > > > > > Something like > > > > > > unsigned int count = 1000; > > > > > > do { > > > if (mwifiex_read_reg(...) == 0) > > > break; > > > } while (--count); > > > > > > > > > -- > > > With Best Regards, > > > Andy Shevchenko > > I've tried both reading PCI_VENDOR_ID and the firmware status using a busy > loop now, but sadly none of them worked. It looks like the card always > replies with the correct values even though it sometimes won't wake up after > that. > > I do have one new observation though, although I've no clue what could be > happening here: When reading PCI_VENDOR_ID 1000 times to wakeup we can > "predict" the wakeup failure because exactly one (usually around the 20th) > of those 1000 reads will fail. What does "fail" means here? > Maybe the firmware actually tries to wake up, > encounters an error somewhere in its wakeup routines and then goes down a > special failure code path. That code path keeps the cards CPU so busy that > at some point a PCI_VENDOR_ID request times out? > > Or well, maybe the card actually wakes up fine, but we don't receive the > interrupt on our end, so many possibilities...
On 9/30/21 5:42 PM, Pali Rohár wrote: > On Thursday 30 September 2021 17:38:43 Jonas Dreßler wrote: >> On 9/23/21 10:22 PM, Pali Rohár wrote: >>> On Thursday 23 September 2021 22:41:30 Andy Shevchenko wrote: >>>> On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@v0yd.nl> wrote: >>>>> On 9/22/21 2:50 PM, Jonas Dreßler wrote: >>>> >>>> ... >>>> >>>>> - Just calling mwifiex_write_reg() once and then blocking until the card >>>>> wakes up using my delay-loop doesn't fix the issue, it's actually >>>>> writing multiple times that fixes the issue >>>>> >>>>> These observations sound a lot like writes (and even reads) are actually >>>>> being dropped, don't they? >>>> >>>> It sounds like you're writing into a not ready (fully powered on) device. >>> >>> This reminds me a discussion with Bjorn about CRS response returned >>> after firmware crash / reset when device is not ready yet: >>> https://lore.kernel.org/linux-pci/20210922164803.GA203171@bhelgaas/ >>> >>> Could not be this similar issue? You could check it via reading >>> PCI_VENDOR_ID register from config space. And if it is not valid value >>> then card is not really ready yet. >>> >>>> To check this, try to put a busy loop for reading and check the value >>>> till it gets 0. >>>> >>>> Something like >>>> >>>> unsigned int count = 1000; >>>> >>>> do { >>>> if (mwifiex_read_reg(...) == 0) >>>> break; >>>> } while (--count); >>>> >>>> >>>> -- >>>> With Best Regards, >>>> Andy Shevchenko >> >> I've tried both reading PCI_VENDOR_ID and the firmware status using a busy >> loop now, but sadly none of them worked. It looks like the card always >> replies with the correct values even though it sometimes won't wake up after >> that. >> >> I do have one new observation though, although I've no clue what could be >> happening here: When reading PCI_VENDOR_ID 1000 times to wakeup we can >> "predict" the wakeup failure because exactly one (usually around the 20th) >> of those 1000 reads will fail. > > What does "fail" means here? ioread32() returns all ones, that's interpreted as failure by mwifiex_read_reg(). > >> Maybe the firmware actually tries to wake up, >> encounters an error somewhere in its wakeup routines and then goes down a >> special failure code path. That code path keeps the cards CPU so busy that >> at some point a PCI_VENDOR_ID request times out? >> >> Or well, maybe the card actually wakes up fine, but we don't receive the >> interrupt on our end, so many possibilities...
On 9/30/21 6:19 PM, Pali Rohár wrote: > On Thursday 30 September 2021 18:14:04 Jonas Dreßler wrote: >> On 9/30/21 5:42 PM, Pali Rohár wrote: >>> On Thursday 30 September 2021 17:38:43 Jonas Dreßler wrote: >>>> On 9/23/21 10:22 PM, Pali Rohár wrote: >>>>> On Thursday 23 September 2021 22:41:30 Andy Shevchenko wrote: >>>>>> On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@v0yd.nl> wrote: >>>>>>> On 9/22/21 2:50 PM, Jonas Dreßler wrote: >>>>>> >>>>>> ... >>>>>> >>>>>>> - Just calling mwifiex_write_reg() once and then blocking until the card >>>>>>> wakes up using my delay-loop doesn't fix the issue, it's actually >>>>>>> writing multiple times that fixes the issue >>>>>>> >>>>>>> These observations sound a lot like writes (and even reads) are actually >>>>>>> being dropped, don't they? >>>>>> >>>>>> It sounds like you're writing into a not ready (fully powered on) device. >>>>> >>>>> This reminds me a discussion with Bjorn about CRS response returned >>>>> after firmware crash / reset when device is not ready yet: >>>>> https://lore.kernel.org/linux-pci/20210922164803.GA203171@bhelgaas/ >>>>> >>>>> Could not be this similar issue? You could check it via reading >>>>> PCI_VENDOR_ID register from config space. And if it is not valid value >>>>> then card is not really ready yet. >>>>> >>>>>> To check this, try to put a busy loop for reading and check the value >>>>>> till it gets 0. >>>>>> >>>>>> Something like >>>>>> >>>>>> unsigned int count = 1000; >>>>>> >>>>>> do { >>>>>> if (mwifiex_read_reg(...) == 0) >>>>>> break; >>>>>> } while (--count); >>>>>> >>>>>> >>>>>> -- >>>>>> With Best Regards, >>>>>> Andy Shevchenko >>>> >>>> I've tried both reading PCI_VENDOR_ID and the firmware status using a busy >>>> loop now, but sadly none of them worked. It looks like the card always >>>> replies with the correct values even though it sometimes won't wake up after >>>> that. >>>> >>>> I do have one new observation though, although I've no clue what could be >>>> happening here: When reading PCI_VENDOR_ID 1000 times to wakeup we can >>>> "predict" the wakeup failure because exactly one (usually around the 20th) >>>> of those 1000 reads will fail. >>> >>> What does "fail" means here? >> >> ioread32() returns all ones, that's interpreted as failure by >> mwifiex_read_reg(). > > Ok. And can you check if PCI Bridge above this card has enabled CRSSVE > bit (CRSVisible in RootCtl+RootCap in lspci output)? To determinate if > Bridge could convert CRS response to all-ones as failed transaction. > Seems like that bit is disabled: > RootCap: CRSVisible- > RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible- >>> >>>> Maybe the firmware actually tries to wake up, >>>> encounters an error somewhere in its wakeup routines and then goes down a >>>> special failure code path. That code path keeps the cards CPU so busy that >>>> at some point a PCI_VENDOR_ID request times out? >>>> >>>> Or well, maybe the card actually wakes up fine, but we don't receive the >>>> interrupt on our end, so many possibilities...
On Thursday 30 September 2021 18:22:42 Jonas Dreßler wrote: > On 9/30/21 6:19 PM, Pali Rohár wrote: > > On Thursday 30 September 2021 18:14:04 Jonas Dreßler wrote: > > > On 9/30/21 5:42 PM, Pali Rohár wrote: > > > > On Thursday 30 September 2021 17:38:43 Jonas Dreßler wrote: > > > > > On 9/23/21 10:22 PM, Pali Rohár wrote: > > > > > > On Thursday 23 September 2021 22:41:30 Andy Shevchenko wrote: > > > > > > > On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@v0yd.nl> wrote: > > > > > > > > On 9/22/21 2:50 PM, Jonas Dreßler wrote: > > > > > > > > > > > > > > ... > > > > > > > > > > > > > > > - Just calling mwifiex_write_reg() once and then blocking until the card > > > > > > > > wakes up using my delay-loop doesn't fix the issue, it's actually > > > > > > > > writing multiple times that fixes the issue > > > > > > > > > > > > > > > > These observations sound a lot like writes (and even reads) are actually > > > > > > > > being dropped, don't they? > > > > > > > > > > > > > > It sounds like you're writing into a not ready (fully powered on) device. > > > > > > > > > > > > This reminds me a discussion with Bjorn about CRS response returned > > > > > > after firmware crash / reset when device is not ready yet: > > > > > > https://lore.kernel.org/linux-pci/20210922164803.GA203171@bhelgaas/ > > > > > > > > > > > > Could not be this similar issue? You could check it via reading > > > > > > PCI_VENDOR_ID register from config space. And if it is not valid value > > > > > > then card is not really ready yet. > > > > > > > > > > > > > To check this, try to put a busy loop for reading and check the value > > > > > > > till it gets 0. > > > > > > > > > > > > > > Something like > > > > > > > > > > > > > > unsigned int count = 1000; > > > > > > > > > > > > > > do { > > > > > > > if (mwifiex_read_reg(...) == 0) > > > > > > > break; > > > > > > > } while (--count); > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > With Best Regards, > > > > > > > Andy Shevchenko > > > > > > > > > > I've tried both reading PCI_VENDOR_ID and the firmware status using a busy > > > > > loop now, but sadly none of them worked. It looks like the card always > > > > > replies with the correct values even though it sometimes won't wake up after > > > > > that. > > > > > > > > > > I do have one new observation though, although I've no clue what could be > > > > > happening here: When reading PCI_VENDOR_ID 1000 times to wakeup we can > > > > > "predict" the wakeup failure because exactly one (usually around the 20th) > > > > > of those 1000 reads will fail. > > > > > > > > What does "fail" means here? > > > > > > ioread32() returns all ones, that's interpreted as failure by > > > mwifiex_read_reg(). > > > > Ok. And can you check if PCI Bridge above this card has enabled CRSSVE > > bit (CRSVisible in RootCtl+RootCap in lspci output)? To determinate if > > Bridge could convert CRS response to all-ones as failed transaction. > > > > Seems like that bit is disabled: > > RootCap: CRSVisible- > > RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible- So it means that CRSSVE is unsupported by upper bridge. In case card returns CRS response to system (via bridge) that is not ready yet, bridge re-issue read request, and after some failures it returns to system all-ones to indicate failed transaction. But all-ones can be returned also by bridge when card does not send any response. So from this test we do not know what happened. It would be nice to know it, but such test requires to connect this card into system which supports CRSSVE, in which case CRS response it passed directly to OS as value 0xffff0001. Look at the link above where I discussed with Bjorn about buggy wifi cards which resets internally, for more details. But in this setup when CRSSVE is not supported, I think there is no other option than just adding sleep prior accessing card... For debugging such issues I got the only advice to use PCIe analyzer and look at what is really happening on the bus. But required equipment for this debugging is not cheap... > > > > > > > > > Maybe the firmware actually tries to wake up, > > > > > encounters an error somewhere in its wakeup routines and then goes down a > > > > > special failure code path. That code path keeps the cards CPU so busy that > > > > > at some point a PCI_VENDOR_ID request times out? > > > > > > > > > > Or well, maybe the card actually wakes up fine, but we don't receive the > > > > > interrupt on our end, so many possibilities...
diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c index c6ccce426b49..bfd6e135ed99 100644 --- a/drivers/net/wireless/marvell/mwifiex/pcie.c +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c @@ -237,6 +237,12 @@ static int mwifiex_write_reg(struct mwifiex_adapter *adapter, int reg, u32 data) iowrite32(data, card->pci_mmap1 + reg); + /* Do a read-back, which makes the write non-posted, ensuring the + * completion before returning. + * The firmware of the 88W8897 card is buggy and this avoids crashes. + */ + ioread32(card->pci_mmap1 + reg); + return 0; }
On the 88W8897 card it's very important the TX ring write pointer is updated correctly to its new value before setting the TX ready interrupt, otherwise the firmware appears to crash (probably because it's trying to DMA-read from the wrong place). Since PCI uses "posted writes" when writing to a register, it's not guaranteed that a write will happen immediately. That means the pointer might be outdated when setting the TX ready interrupt, leading to firmware crashes especially when ASPM L1 and L1 substates are enabled (because of the higher link latency, the write will probably take longer). So fix those firmware crashes by always forcing non-posted writes. We do that by simply reading back the register after writing it, just as a lot of other drivers do. There are two reproducers that are fixed with this patch: 1) During rx/tx traffic and with ASPM L1 substates enabled (the enabled substates are platform dependent), the firmware crashes and eventually a command timeout appears in the logs. That crash is fixed by using a non-posted write in mwifiex_pcie_send_data(). 2) When sending lots of commands to the card, waking it up from sleep in very quick intervals, the firmware eventually crashes. That crash appears to be fixed by some other non-posted write included here. Signed-off-by: Jonas Dreßler <verdre@v0yd.nl> --- drivers/net/wireless/marvell/mwifiex/pcie.c | 6 ++++++ 1 file changed, 6 insertions(+)