Message ID | 20221128-spi-mt65xx-v1-0-509266830665@chromium.org |
---|---|
State | Accepted |
Commit | c6f7874687f7027d7c4b2f53ff6e4d22850f915d |
Headers | show |
Series | spi: mediatek: Enable irq when pdata is ready | expand |
Hi Ricardo, Hi Mark, On Mon, Nov 28, 2022 at 05:28:51PM +0000, Mark Brown wrote: > On Mon, 28 Nov 2022 12:00:01 +0100, Ricardo Ribalda wrote: > > If the device does not come straight from reset, we might receive an IRQ > > before we are ready to handle it. > > > > Fixes: > > > > [ 0.832328] Unable to handle kernel read from unreadable memory at virtual address 0000000000000010 > > [ 1.040343] Call trace: > > [ 1.040347] mtk_spi_can_dma+0xc/0x40 > > ... > > [ 1.262265] start_kernel+0x338/0x42c > > > > [...] > > Applied to > > https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next > > Thanks! > > [1/1] spi: mediatek: Enable irq when pdata is ready > commit: c6f7874687f7027d7c4b2f53ff6e4d22850f915d > Testing next-20221220 on the BananaPi BPi-R3 MT7986 board it turns out that this commit is *causing* a kernel oops and strack trace output actually quite similar to what is described in the commit message: ... [ 2.074496] mtk-spi 1100a000.spi: spi-mem transfer timeout [ 2.080064] spi-nor: probe of spi0.0 failed with error -110 [ 2.085830] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 [ 2.094624] Mem abort info: [ 2.097418] ESR = 0x0000000096000005 [ 2.101162] EC = 0x25: DABT (current EL), IL = 32 bits [ 2.106466] SET = 0, FnV = 0 [ 2.109521] EA = 0, S1PTW = 0 [ 2.111511] mtk_soc_eth 15100000.ethernet eth0: mediatek frame engine at 0xffffffc00e280000, irq 121 [ 2.112653] FSC = 0x05: level 1 translation fault [ 2.112659] Data abort info: [ 2.121921] mtk_soc_eth 15100000.ethernet: generated random MAC address ba:0e:79:b6:0a:83 [ 2.126628] ISV = 0, ISS = 0x00000005 [ 2.126631] CM = 0, WnR = 0 [ 2.126635] [0000000000000010] user address but active_mm is swapper [ 2.126724] Internal error: Oops: 0000000096000005 [#1] SMP [ 2.130647] mtk_soc_eth 15100000.ethernet eth1: mediatek frame engine at 0xffffffc00e280000, irq 121 [ 2.137739] Modules linked in: [ 2.137748] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-next-20221220+ #0 [ 2.137753] Hardware name: Bananapi BPI-R3 (DT) [ 2.137756] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 2.142036] i2c_dev: i2c /dev entries driver [ 2.144530] pc : mtk_spi_can_dma+0x0/0x2c [ 2.144541] lr : mtk_spi_interrupt+0x60/0x350 [ 2.144545] sp : ffffffc008003ec0 [ 2.144547] x29: ffffffc008003ec0 x28: ffffffc008eb67c8 x27: 0000000000012400 [ 2.152967] NET: Registered PF_INET6 protocol family [ 2.156435] x26: 0000000000000400 x25: 0000000000000076 x24: ffffff80072dac00 [ 2.156443] x23: 0000000000000000 x22: ffffffc008d33008 x21: 0000000000000000 [ 2.156450] x20: ffffff8007b67000 x19: ffffff8007b67880 x18: 0000000000000002 [ 2.156457] x17: ffffffc076e23000 [ 2.166540] Segment Routing with IPv6 [ 2.168599] x16: ffffffc008004000 x15: 0000000000004000 [ 2.168605] x14: 0000000000000000 x13: 00000000000001dd [ 2.175768] In-situ OAM (IOAM) with IPv6 [ 2.180229] x12: 00000000000001dd [ 2.187232] NET: Registered PF_PACKET protocol family [ 2.191417] x11: 0000000000000040 x10: 0000000000000000 [ 2.195594] 8021q: 802.1Q VLAN Support v1.8 [ 2.199745] x9 : ffffffc076e23000 [ 2.216379] pstore: Using crash dump compression: deflate [ 2.222225] x8 : 00000000f04bc9a3 x7 : 0000000000000000 x6 : 0000000000000002 [ 2.222233] x5 : 0000000000000003 x4 : 0000000000000001 [ 2.245403] mtk-pcie-gen3 11280000.pcie: host bridge /soc/pcie@11280000 ranges: [ 2.248702] x3 : ffffffc008672704 [ 2.248705] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff8007b67000 [ 2.253927] mtk-pcie-gen3 11280000.pcie: Parsing ranges property... [ 2.257818] [ 2.257820] Call trace: [ 2.257822] mtk_spi_can_dma+0x0/0x2c [ 2.257828] __handle_irq_event_percpu+0xa8/0x340 [ 2.257835] handle_irq_event+0x48/0xd0 [ 2.257840] handle_fasteoi_irq+0xa0/0x1dc [ 2.261241] mtk-pcie-gen3 11280000.pcie: MEM 0x0020000000..0x002fffffff -> 0x0020000000 [ 2.266255] generic_handle_domain_irq+0x28/0x40 [ 2.266260] gic_handle_irq+0x4c/0x124 [ 2.357698] call_on_irq_stack+0x2c/0x40 [ 2.361612] do_interrupt_handler+0x78/0x84 [ 2.365783] el1_interrupt+0x30/0x4c [ 2.369351] el1h_64_irq_handler+0x14/0x20 [ 2.373434] el1h_64_irq+0x64/0x68 [ 2.376825] arch_cpu_idle+0x14/0x20 [ 2.380389] default_idle_call+0x68/0xe0 [ 2.384302] do_idle+0xe8/0x140 [ 2.387434] cpu_startup_entry+0x24/0x30 [ 2.391345] rest_init+0x120/0x1a0 [ 2.394735] arch_post_acpi_subsys_init+0x0/0x8 [ 2.399254] start_kernel+0x68c/0x6ac [ 2.402905] __primary_switched+0xb4/0xbc [ 2.406909] Code: 17ffffd3 12003c23 12103c24 17ffffe7 (b9401041) [ 2.412987] ---[ end trace 0000000000000000 ]--- [ 2.419401] pstore: backend (ramoops) writing error (-28) [ 2.420889] mtk-pcie-gen3 11280000.pcie: set MEM trans window[0]: cpu_addr = 0x20000000, pci_addr = 0x20000000, size = 0x10000000 [ 2.424785] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 2.424788] SMP: stopping secondary CPUs [ 2.424794] Kernel Offset: disabled [ 2.424796] CPU features: 0x00000,00000008,0000400b [ 2.424799] Memory Limit: none [ 2.460293] Rebooting in 1 seconds.. Reverting the above commit fixes the issue and results in normal boot: [daniel@box linux.git]$ git revert c6f7874687f70 Auto-merging drivers/spi/spi-mt65xx.c hint: Waiting for your editor to close the file... [mt7986-wip a1aafac27c46d] Revert "spi: mediatek: Enable irq when pdata is ready" 1 file changed, 5 insertions(+), 7 deletions(-) ... [ 1.045904] spi-nor spi0.0: w25q256 (32768 Kbytes) [ 1.063603] 6 fixed-partitions partitions found on MTD device spi0.0 [ 1.070099] OF: Bad cell count for /soc/spi@1100a000/flash@0/partitions [ 1.076799] OF: Bad cell count for /soc/spi@1100a000/flash@0/partitions [ 1.084180] Creating 6 MTD partitions on "spi0.0": [ 1.089011] 0x000000c00000-0x000002000000 : "fit" [ 1.096514] 0x000000180000-0x000000c00000 : "recovery" [ 1.103853] 0x000000100000-0x000000180000 : "fip" [ 1.110685] 0x000000080000-0x000000100000 : "reserved2" [ 1.118762] 0x000000040000-0x000000080000 : "u-boot-env" [ 1.126593] 0x000000000000-0x000000040000 : "bl2" ... Best regards Daniel
[Note: this mail contains only information for Linux kernel regression tracking. Mails like these contain '#forregzbot' in the subject to make then easy to spot and filter out. The author also tried to remove most or all individuals from the list of recipients to spare them the hassle.] Hi, this is your Linux kernel regression tracker. On 24.12.22 20:02, Daniel Golle wrote: >> [1/1] spi: mediatek: Enable irq when pdata is ready >> commit: c6f7874687f7027d7c4b2f53ff6e4d22850f915d >> > > Testing next-20221220 on the BananaPi BPi-R3 MT7986 board it turns > out that this commit is *causing* a kernel oops and strack trace output > actually quite similar to what is described in the commit message: Thanks for the report. To be sure below issue doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot: #regzbot ^introduced c6f7874687f7027d7c4b2f53ff6e4d228 #regzbot title spi: mediatek: oops due to race #regzbot monitor: https://lore.kernel.org/all/Y6hZA19rWuDkGJa2@makrotopia.org/ #regzbot fix: spi: mediatek: Enable irq before the spi registration #regzbot ignore-activity Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat) P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.
diff --git a/drivers/spi/spi-mt65xx.c b/drivers/spi/spi-mt65xx.c index d6aff909fc36..6de8360e5c2a 100644 --- a/drivers/spi/spi-mt65xx.c +++ b/drivers/spi/spi-mt65xx.c @@ -1192,11 +1192,6 @@ static int mtk_spi_probe(struct platform_device *pdev) else dma_set_max_seg_size(dev, SZ_256K); - ret = devm_request_irq(dev, irq, mtk_spi_interrupt, - IRQF_TRIGGER_NONE, dev_name(dev), master); - if (ret) - return dev_err_probe(dev, ret, "failed to register irq\n"); - mdata->parent_clk = devm_clk_get(dev, "parent-clk"); if (IS_ERR(mdata->parent_clk)) return dev_err_probe(dev, PTR_ERR(mdata->parent_clk), @@ -1266,6 +1261,13 @@ static int mtk_spi_probe(struct platform_device *pdev) return dev_err_probe(dev, ret, "failed to register master\n"); } + ret = devm_request_irq(dev, irq, mtk_spi_interrupt, + IRQF_TRIGGER_NONE, dev_name(dev), master); + if (ret) { + pm_runtime_disable(dev); + return dev_err_probe(dev, ret, "failed to register irq\n"); + } + return 0; }
If the device does not come straight from reset, we might receive an IRQ before we are ready to handle it. Fixes: [ 0.832328] Unable to handle kernel read from unreadable memory at virtual address 0000000000000010 [ 1.040343] Call trace: [ 1.040347] mtk_spi_can_dma+0xc/0x40 ... [ 1.262265] start_kernel+0x338/0x42c Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> --- To: Mark Brown <broonie@kernel.org> To: Matthias Brugger <matthias.bgg@gmail.com> Cc: linux-spi@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mediatek@lists.infradead.org Cc: linux-kernel@vger.kernel.org --- drivers/spi/spi-mt65xx.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) --- base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4 change-id: 20221128-spi-mt65xx-b92c9ca96ffa Best regards,