Message ID | 20230407200551.12660-1-michael.christie@oracle.com |
---|---|
Series | Use block pr_ops in LIO |
[Sorry for adding you in CC]

While running the LTP controllers test suite on this patch set, applied on top of
next-20230406, the following kernel panic was noticed on qemu-i386.

crash log:
---------
pids 1 TINFO: timeout per run is 0h 25m 0s
pids 1 TINFO: test starts with cgroup version 1
pids 1 TINFO: Running testcase 6 with 10 processes
pids 1 TINFO: set a limit that is smaller than current number of pids
<6>[ 782.211806] cgroup: fork rejected by pids controller in /ltp/test-2724
pids 1 TPASS: fork failed as expected
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2760 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2761 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2762 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2763 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2764 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2765 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2766 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2767 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2768 Killed pids_task2
/opt/ltp/testcases/bin/tst_test.sh: line 150: 2769 Killed pids_task2
<4>[ 782.594441] int3: 0000 [#1] PREEMPT SMP
<4>[ 782.594783] CPU: 1 PID: 2724 Comm: pids.sh Not tainted 6.3.0-rc5-next-20230406 #1
<4>[ 782.594915] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
<4>[ 782.595168] EIP: get_page_from_freelist+0x157/0xccc
<4>[ 782.595745] Code: 48 04 8d 43 08 85 c9 0f 85 fe 07 00 00 3b 73 0c 0f 82 f5 07 00 00 89 45 c8 8b 45 c8 8b 00 89 45 e0 85 c0 0f 84 fc 07 00 00 3e <8d> 74 26 00 8b 45 cc 80 78 14 00 0f 84 90 05 00 00 8b 45 e0 8b 58
<4>[ 782.595850] EAX: 00000000 EBX: c7ab5ce4 ECX: 00000800 EDX: 00000000
<4>[ 782.595889] ESI: 00400dc0 EDI: 00000000 EBP: c7ab5c78 ESP: c7ab5c04
<4>[ 782.595928] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000297
<4>[ 782.596017] CR0: 80050033 CR2: 086d1d54 CR3: 06eaa000 CR4: 000006d0
<4>[ 782.596188] Call Trace:
<4>[ 782.596503] ? exc_int3+0x10/0x130
<4>[ 782.596630] ? __irq_exit_rcu+0x15/0xcc
<4>[ 782.596655] ? sysvec_call_function+0x3c/0x3c
<4>[ 782.596677] ? __irq_exit_rcu+0x15/0xcc
<4>[ 782.596694] ? irqentry_exit+0x26/0x58
<4>[ 782.596733] __alloc_pages+0x156/0xf50
<4>[ 782.596783] ? handle_exception+0x133/0x133
<4>[ 782.596817] ? __rcu_read_unlock+0x1e/0x30
<4>[ 782.596843] ? sysvec_call_function+0x3c/0x3c
<4>[ 782.596880] ? __irq_exit_rcu+0x15/0xcc
<4>[ 782.596903] pte_alloc_one+0x23/0x88
<4>[ 782.596928] __pte_alloc+0x21/0xb0
<4>[ 782.596981] ? trace_hardirqs_on+0x2c/0x88
<4>[ 782.597002] copy_page_range+0x67d/0xb40
<4>[ 782.597049] ? mas_wr_modify+0x10e/0x364
<4>[ 782.597069] ? mod_objcg_state+0x99/0x378
<4>[ 782.597103] ? mas_wr_store_entry.isra.0+0x10c/0x534
<4>[ 782.597127] ? mas_store+0x45/0xb0
<4>[ 782.597160] copy_process+0x1de6/0x1f9c
<4>[ 782.597179] ? lockref_get_not_dead+0x2c/0x38
<4>[ 782.597246] kernel_clone+0xc1/0x3dc
<4>[ 782.597279] __ia32_sys_clone+0x71/0x8c
<4>[ 782.597320] __do_fast_syscall_32+0x4c/0xb8
<4>[ 782.597340] do_fast_syscall_32+0x32/0x74
<4>[ 782.597361] do_SYSENTER_32+0x15/0x24
<4>[ 782.597381] entry_SYSENTER_32+0x98/0xf1
<4>[ 782.597484] EIP: 0xb7f3e579
<4>[ 782.597838] Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
<4>[ 782.597856] EAX: ffffffda EBX: 01200011 ECX: 00000000 EDX: 00000000
<4>[ 782.597876] ESI: 00000000 EDI: b7f398e8 EBP: b7f01e3c ESP: bfa4599c
<4>[ 782.597889] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000286
<4>[ 782.598167] Modules linked in:
<4>[ 782.616852] ---[ end trace 0000000000000000 ]---
<4>[ 782.616998] EIP: get_page_from_freelist+0x157/0xccc
<4>[ 782.617213] Code: 48 04 8d 43 08 85 c9 0f 85 fe 07 00 00 3b 73 0c 0f 82 f5 07 00 00 89 45 c8 8b 45 c8 8b 00 89 45 e0 85 c0 0f 84 fc 07 00 00 3e <8d> 74 26 00 8b 45 cc 80 78 14 00 0f 84 90 05 00 00 8b 45 e0 8b 58
<4>[ 782.617234] EAX: 00000000 EBX: c7ab5ce4 ECX: 00000800 EDX: 00000000
<4>[ 782.617247] ESI: 00400dc0 EDI: 00000000 EBP: c7ab5c78 ESP: c7ab5c04
<4>[ 782.617260] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000297
<4>[ 782.617276] CR0: 80050033 CR2: 086d1d54 CR3: 06eaa000 CR4: 000006d0
<0>[ 782.617472] Kernel panic - not syncing: Fatal exception in interrupt
<0>[ 782.619115] Kernel Offset: disabled

Crash log details,
https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230404140835_25166-1-sergei_shtepa_veeam_com/testrun/16171777/suite/log-parser-test/test/check-kernel-panic/details/
https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230404140835_25166-1-sergei_shtepa_veeam_com/testrun/16171777/suite/log-parser-test/test/check-kernel-panic/log

Build artifacts links,
https://storage.tuxsuite.com/public/linaro/anders/builds/2OGpRflnb14kT3aJdPte4NdjKoT/
vmlinux: https://storage.tuxsuite.com/public/linaro/anders/builds/2OGpRflnb14kT3aJdPte4NdjKoT/vmlinux.xz
System.map: https://storage.tuxsuite.com/public/linaro/anders/builds/2OGpRflnb14kT3aJdPte4NdjKoT/System.map

steps to reproduce:

# To install tuxrun on your system globally:
# sudo pip3 install -U tuxrun==0.41.0
#
# See https://tuxrun.org/ for complete documentation.
tuxrun \
  --runtime podman \
  --device qemu-i386 \
  --kernel https://storage.tuxsuite.com/public/linaro/anders/builds/2OGpRflnb14kT3aJdPte4NdjKoT/bzImage \
  --modules https://storage.tuxsuite.com/public/linaro/anders/builds/2OGpRflnb14kT3aJdPte4NdjKoT/modules.tar.xz \
  --rootfs https://storage.tuxsuite.com/public/linaro/lkft/oebuilds/2OFRZUbhWDZYvEcYrKKj1AJ618K/images/intel-core2-32/lkft-tux-image-intel-core2-32-20230410193126.rootfs.ext4.gz \
  --parameters SKIPFILE=skipfile-lkft.yaml \
  --parameters SHARD_NUMBER=10 \
  --parameters SHARD_INDEX=10 \
  --image docker.io/lavasoftware/lava-dispatcher:2023.01.0020.gc1598238f \
  --tests ltp-controllers \
  --timeouts boot=15 ltp-controllers=80

--
Linaro LKFT
https://lkft.linaro.org
On Wed, 12 Apr 2023 at 15:06, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>
> [Sorry for adding you in CC]
>
> While running the LTP controllers test suite on this patch set, applied on top of
> next-20230406, the following kernel panic was noticed on qemu-i386.

Also noticed on qemu-x86_64.

Crash log link,
------------------
- https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230404140835_25166-1-sergei_shtepa_veeam_com/testrun/16171908/suite/log-parser-test/test/check-kernel-panic/log
- https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230404140835_25166-1-sergei_shtepa_veeam_com/testrun/16171908/suite/log-parser-test/tests/

lore link,
https://lore.kernel.org/linux-block/20230407200551.12660-1-michael.christie@oracle.com/

--
Linaro LKFT
https://lkft.linaro.org
[Sorry for adding you in CC]

While running the LTP controllers test suite on this patch set, applied on top of
next-20230406, the following kernel panic was noticed on qemu-x86_64.

Lore link:
https://lore.kernel.org/linux-block/20230407200551.12660-1-michael.christie@oracle.com/

cpuset_inherit 31 TPASS: mem_exclusive: Inherited information is right!
cpuset_inherit 33 TPASS: mem_hardwall: Inherited information is right!
<4>[ 1234.875309] int3: 0000 [#1] PREEMPT SMP PTI
<4>[ 1234.875748] CPU: 1 PID: 32990 Comm: umount Not tainted 6.3.0-rc5-next-20230406 #1
<4>[ 1234.875946] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
<4>[ 1234.876062] RIP: 0010:__alloc_pages+0xdf/0x310
<4>[ 1234.876666] Code: 4c 89 45 b0 83 3d b8 1c cb 01 00 0f 85 a2 01 00 00 44 89 f0 c1 e8 03 83 e0 03 89 75 a0 89 45 c0 be 01 00 00 00 4c 89 45 98 0f <1f> 44 00 00 4d 89 c5 44 89 f0 44 89 75 a4 41 f7 c6 00 04 00 00 74
<4>[ 1234.876885] RSP: 0000:ffff8b9305f47c10 EFLAGS: 00000202
<4>[ 1234.877418] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
<4>[ 1234.877433] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000140cca
<4>[ 1234.877446] RBP: ffff8b9305f47c78 R08: 0000000000000000 R09: 0000000000000000
<4>[ 1234.877460] R10: 0000000000100cca R11: 0000000000000000 R12: 0000000000000003
<4>[ 1234.877472] R13: ffff88d28b753b00 R14: 0000000000140cca R15: ffff88d2ffffb300
<4>[ 1234.877518] FS: 0000000000000000(0000) GS:ffff88d2fbd00000(0000) knlGS:0000000000000000
<4>[ 1234.877547] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 1234.877568] CR2: 0000000000408ee0 CR3: 0000000111432000 CR4: 00000000000006e0
<4>[ 1234.877662] Call Trace:
<4>[ 1234.877828] <TASK>
<4>[ 1234.877954] __folio_alloc+0x1e/0x50
<4>[ 1234.878051] vma_alloc_folio+0x4af/0x520
<4>[ 1234.878066] ? _raw_spin_unlock+0x1a/0x40
<4>[ 1234.878079] ? do_wp_page+0x164/0xdd0
<4>[ 1234.878093] ? trace_preempt_on+0x1e/0x80
<4>[ 1234.878105] ? preempt_count_sub+0x63/0x80
<4>[ 1234.878118] do_wp_page+0x3cb/0xdd0
<4>[ 1234.878129] ? handle_mm_fault+0x739/0x19e0
<4>[ 1234.878142] ? _raw_spin_lock+0x23/0x50
<4>[ 1234.878159] handle_mm_fault+0x770/0x19e0
<4>[ 1234.878170] ? up_write+0x52/0xe0
<4>[ 1234.878196] do_user_addr_fault+0x4d4/0x6c0
<4>[ 1234.878210] ? trace_hardirqs_off_finish+0x38/0x90
<4>[ 1234.878222] exc_page_fault+0x80/0x1d0
<4>[ 1234.878234] asm_exc_page_fault+0x2b/0x30
<4>[ 1234.878291] RIP: 0033:0x7f3f910e65f6
<4>[ 1234.878542] Code: 2f 0a 00 00 e8 5b 23 ff ff 48 89 85 60 fd ff ff 48 8b 85 88 fd ff ff 48 8b 80 e8 00 00 00 48 85 c0 74 0b 48 8b 9d 68 fd ff ff <48> 89 58 08 48 8b 85 68 fd ff ff c7 40 18 01 00 00 00 e8 43 2b fe
<4>[ 1234.878552] RSP: 002b:00007fff6f8dce50 EFLAGS: 00000206
<4>[ 1234.878563] RAX: 0000000000408ed8 RBX: 00007f3f910fe0d8 RCX: 00007f3f910c6270
<4>[ 1234.878569] RDX: 0000000000000000 RSI: 00007f3f910fe2a0 RDI: 00007fff6f8dcf10
<4>[ 1234.878575] RBP: 00007fff6f8dd110 R08: 0000000000000000 R09: 0000000000000007
<4>[ 1234.878581] R10: 0000000000000000 R11: 0000000000000008 R12: 0000000000000000
<4>[ 1234.878585] R13: 00007f3f910fe2a0 R14: 00007fff6f8dced0 R15: 0000000000000000
<4>[ 1234.878636] </TASK>
<4>[ 1234.878693] Modules linked in:
<4>[ 1234.894886] ---[ end trace 0000000000000000 ]---
<4>[ 1234.895879] RIP: 0010:__alloc_pages+0xdf/0x310
<4>[ 1234.895921] Code: 4c 89 45 b0 83 3d b8 1c cb 01 00 0f 85 a2 01 00 00 44 89 f0 c1 e8 03 83 e0 03 89 75 a0 89 45 c0 be 01 00 00 00 4c 89 45 98 0f <1f> 44 00 00 4d 89 c5 44 89 f0 44 89 75 a4 41 f7 c6 00 04 00 00 74
<4>[ 1234.895936] RSP: 0000:ffff8b9305f47c10 EFLAGS: 00000202
<4>[ 1234.895956] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
<4>[ 1234.895966] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000140cca
<4>[ 1234.895977] RBP: ffff8b9305f47c78 R08: 0000000000000000 R09: 0000000000000000
<4>[ 1234.895985] R10: 0000000000100cca R11: 0000000000000000 R12: 0000000000000003
<4>[ 1234.895993] R13: ffff88d28b753b00 R14: 0000000000140cca R15: ffff88d2ffffb300
<4>[ 1234.896003] FS: 0000000000000000(0000) GS:ffff88d2fbd00000(0000) knlGS:0000000000000000
<4>[ 1234.896016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 1234.896026] CR2: 0000000000408ee0 CR3: 0000000111432000 CR4: 00000000000006e0
<0>[ 1234.896379] Kernel panic - not syncing: Fatal exception in interrupt
<0>[ 1234.900046] Kernel Offset: 0x8800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Links:
https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230407200551_12660-1-michael_christie_oracle_com/testrun/16172098/suite/log-parser-test/test/check-kernel-panic/log
https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230407200551_12660-1-michael_christie_oracle_com/testrun/16172061/suite/log-parser-test/test/check-kernel-panic/log
https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230407200551_12660-1-michael_christie_oracle_com/testrun/16172098/suite/log-parser-test/tests/
https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230407200551_12660-1-michael_christie_oracle_com/testrun/16172061/suite/log-parser-test/tests/

Steps to reproduce:
-------------------

# To install tuxrun on your system globally:
# sudo pip3 install -U tuxrun==0.41.0
#
# See https://tuxrun.org/ for complete documentation.
tuxrun \
  --runtime podman \
  --device qemu-x86_64 \
  --kernel https://storage.tuxsuite.com/public/linaro/anders/builds/2OGpdVeBBG4Gj6aACnSdSGva2LN/bzImage \
  --modules https://storage.tuxsuite.com/public/linaro/anders/builds/2OGpdVeBBG4Gj6aACnSdSGva2LN/modules.tar.xz \
  --rootfs https://storage.tuxsuite.com/public/linaro/lkft/oebuilds/2OFRZUnV2Q9jOFdE3gH3Gq2v692/images/intel-corei7-64/lkft-tux-image-intel-corei7-64-20230410193144.rootfs.ext4.gz \
  --parameters SKIPFILE=skipfile-lkft.yaml \
  --parameters SHARD_NUMBER=10 \
  --parameters SHARD_INDEX=10 \
  --image docker.io/lavasoftware/lava-dispatcher:2023.01.0020.gc1598238f \
  --tests ltp-controllers \
  --timeouts boot=15 ltp-controllers=80

--
Linaro LKFT
https://lkft.linaro.org
On 4/12/23 5:25 AM, Naresh Kamboju wrote:
> On Wed, 12 Apr 2023 at 15:06, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>>
>> [Sorry for adding you in CC]
>>
>> While running the LTP controllers test suite on this patch set, applied on top of
>> next-20230406, the following kernel panic was noticed on qemu-i386.
>
> Also noticed on qemu-x86_64.
>
> Crash log link,
> ------------------
> - https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230404140835_25166-1-sergei_shtepa_veeam_com/testrun/16171908/suite/log-parser-test/test/check-kernel-panic/log
> - https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230404140835_25166-1-sergei_shtepa_veeam_com/testrun/16171908/suite/log-parser-test/tests/

Can you point me to the original report? I don't think my patches are the cause of
the failure, or if they are, there is a crazy bug.

I think you pointed me to the wrong link above, because it looks like that's
for a different patchset. Or did I misunderstand the testing, and that link has my
patches included?

I did see my patches tested:

https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/

but they seem to fail in similar places as other failures that day, and the
failures don't seem related to my patches. It doesn't look like you are doing
anything nvme or block pr ioctl related; the tests are just failing on forks and OOM.
It looks like you are booting from a scsi device, but I only touched the scsi
layer's code for persistent reservations and the tests don't seem to be
using that code.

>
> lore link,
> https://lore.kernel.org/linux-block/20230407200551.12660-1-michael.christie@oracle.com/
>
>
> --
> Linaro LKFT
> https://lkft.linaro.org
On Wed, 12 Apr 2023 at 23:59, Mike Christie <michael.christie@oracle.com> wrote:
>
> On 4/12/23 5:25 AM, Naresh Kamboju wrote:
> > On Wed, 12 Apr 2023 at 15:06, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> >>
> >> [Sorry for adding you in CC]
> >>
> >> While running the LTP controllers test suite on this patch set, applied on top of
> >> next-20230406, the following kernel panic was noticed on qemu-i386.
> >
> > Also noticed on qemu-x86_64.
> >
> > Crash log link,
> > ------------------
> > - https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230404140835_25166-1-sergei_shtepa_veeam_com/testrun/16171908/suite/log-parser-test/test/check-kernel-panic/log
> > - https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-block_20230404140835_25166-1-sergei_shtepa_veeam_com/testrun/16171908/suite/log-parser-test/tests/
>
> Can you point me to the original report? I don't think my patches are the cause of
> the failure, or if they are, there is a crazy bug.
>
> I think you pointed me to the wrong link above, because it looks like that's
> for a different patchset. Or did I misunderstand the testing, and that link has my
> patches included?
>
> I did see my patches tested:
>
> https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/
>
> but they seem to fail in similar places as other failures that day, and the
> failures don't seem related to my patches. It doesn't look like you are doing
> anything nvme or block pr ioctl related; the tests are just failing on forks and OOM.
> It looks like you are booting from a scsi device, but I only touched the scsi
> layer's code for persistent reservations and the tests don't seem to be
> using that code.

Thanks for the analysis of these reports.
Since the series was tested on top of Linux next-20230406, I will re-validate
the base and get back to you with my findings.

- Naresh

> >
> > lore link,
> > https://lore.kernel.org/linux-block/20230407200551.12660-1-michael.christie@oracle.com/
> >
> >
> > --
> > Linaro LKFT
> > https://lkft.linaro.org
>
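When re-validating the base, it may help to compare the faulting function and the reliable call-trace frames across reports. Both panics above trap in the page allocator (get_page_from_freelist on i386, __alloc_pages on x86_64), reached from fork and page-fault paths, not from scsi, nvme, or pr code. The following is only a rough sketch under assumptions: the function name summarize_oops and both regular expressions are illustrative, written against the x86 oops layout shown in this thread, and are not part of LTP, tuxrun, or any LKFT tooling.

```python
import re

# Faulting-instruction line of an x86 oops, e.g.
#   "EIP: get_page_from_freelist+0x157/0xccc"
#   "RIP: 0010:__alloc_pages+0xdf/0x310"
# User-space addresses such as "RIP: 0033:0x7f3f910e65f6" do not match.
FAULT_RE = re.compile(
    r"\b[ER]IP: (?:[0-9a-f]{4}:)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Reliable call-trace frames: a printk prefix followed directly by
# symbol+offset/size. Frames marked "? " (unreliable) are skipped.
FRAME_RE = re.compile(
    r"<\d>\[\s*\d+\.\d+\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(log: str) -> dict:
    """Return the first faulting symbol and the reliable frames of an oops."""
    fault = FAULT_RE.search(log)
    return {
        "fault": fault.group(1) if fault else None,
        "frames": FRAME_RE.findall(log),
    }


if __name__ == "__main__":
    # Excerpt of the qemu-i386 log from this thread.
    sample = """\
<4>[ 782.595168] EIP: get_page_from_freelist+0x157/0xccc
<4>[ 782.596188] Call Trace:
<4>[ 782.596503] ? exc_int3+0x10/0x130
<4>[ 782.596733] __alloc_pages+0x156/0xf50
<4>[ 782.596903] pte_alloc_one+0x23/0x88
<4>[ 782.597246] kernel_clone+0xc1/0x3dc
"""
    print(summarize_oops(sample))
```

Running the same function over a log from the unpatched next-20230406 base would show whether the same symbols appear without the series applied.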