Message ID | 20161212163518.GA20728@quack2.suse.cz |
---|---|
State | New |
On Mon, Dec 12, 2016 at 05:35:18PM +0100, Jan Kara wrote:
> On Mon 12-12-16 18:13:21, kernel test robot wrote:
> > FYI, we noticed the following commit:
> >
> > commit: e2ae766c1b030271b5099b25674e2131d1d1e8c1 ("ext4: convert DAX faults to iomap infrastructure")
> > https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
> >
> > in testcase: nvml
> > with following parameters:
> >
> > 	group: vmem
> > 	test: pmem
> > 	nr_pmem: 1
> > 	fs: ext4
> > 	mount_option: dax
> >
> > on test machine: 64 threads Intel(R) Xeon(R) CPU E5-4650 0 @ 2.70GHz with 64G memory
> >
> > caused below changes:
> >
> > +------------------------------------------------+------------+------------+
> > |                                                | 96f8ba3dd6 | e2ae766c1b |
> > +------------------------------------------------+------------+------------+
> > | boot_successes                                 | 2          | 2          |
> > | boot_failures                                  | 2          | 2          |
> > | BUG:kernel_hang_in_test_stage                  | 2          |            |
> > | WARNING:at_fs/sysfs/dir.c:#sysfs_warn_dup      | 0          | 2          |
> > | calltrace:parport_pc_init                      | 0          | 2          |
> > | calltrace:SyS_finit_module                     | 0          | 2          |
> > | WARNING:at_lib/kobject.c:#kobject_add_internal | 0          | 2          |
> > +------------------------------------------------+------------+------------+
> >
> > user :notice: [ 325.592182] vmem_aligned_alloc/TEST1: SETUP (check/pmem/debug)
> >
> > user :notice: [ 325.603973] vmem_aligned_alloc/TEST1: START: vmem_aligned_alloc
> > kern :err : [ 325.608906] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:51
> > kern :err : [ 325.608908] in_atomic(): 1, irqs_disabled(): 0, pid: 24813, name: vmem_aligned_al
> > kern :warn : [ 325.608914] CPU: 44 PID: 24813 Comm: vmem_aligned_al Tainted: G O 4.9.0-rc4-00045-ge2ae766 #1
> > kern :warn : [ 325.608916] Hardware name: Intel Corporation LH Pass/S4600LH...., BIOS SE5C600.86B.99.02.1047.032320122259 03/23/2012
> > kern :warn : [ 325.608922] ffffc9002c1f7be0
> > kern :warn : [ 325.608923] ffffffff81466af9
> > kern :warn : [ 325.608924] ffff880fea2425c0
>
> I think this is actually a bug introduced by Ross' PMD support. Attached
> patch should fix it. Ross, can you check it please?

Yep, that patch looks good. You can add:

Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>

And I turned on CONFIG_DEBUG_ATOMIC_SLEEP in my test config. :)

Is this problem likely to happen in other file systems? Should I take
this patch through the ext4 tree, or would it be better via some other
git tree?

- Ted

On Mon, Dec 12, 2016 at 05:37:36PM -0500, Theodore Ts'o wrote:
> Is this problem likely to happen in other file systems? Should I take
> this patch through the ext4 tree, or would it be better via some other
> git tree?
>
> - Ted

The problem is in the generic DAX code and affects ext4 and xfs equally (ext2
doesn't support PMDs).

On Mon, Dec 12, 2016 at 03:48:51PM -0700, Ross Zwisler wrote:
> On Mon, Dec 12, 2016 at 05:37:36PM -0500, Theodore Ts'o wrote:
> > Is this problem likely to happen in other file systems? Should I take
> > this patch through the ext4 tree, or would it be better via some other
> > git tree?
> >
> > - Ted
>
> The problem is in the generic DAX code and affects ext4 and xfs equally (ext2
> doesn't support PMDs).

Any preferences about how to send this patch to Linus? This issue is
the only thing that was causing me to hold off on sending a pull
request to Linus....

- Ted

On Mon, Dec 12, 2016 at 06:00:20PM -0500, Theodore Ts'o wrote:
> On Mon, Dec 12, 2016 at 03:48:51PM -0700, Ross Zwisler wrote:
> > On Mon, Dec 12, 2016 at 05:37:36PM -0500, Theodore Ts'o wrote:
> > > Is this problem likely to happen in other file systems? Should I take
> > > this patch through the ext4 tree, or would it be better via some other
> > > git tree?
> > >
> > > - Ted
> >
> > The problem is in the generic DAX code and affects ext4 and xfs equally (ext2
> > doesn't support PMDs).
>
> Any preferences about how to send this patch to Linus? This issue is
> the only thing that was causing me to hold off on sending a pull
> request to Linus....

Personally I'm happy to have you send it. Thanks!

Jan Kara <jack@suse.cz> writes:

> On Mon 12-12-16 18:13:21, kernel test robot wrote:
>> FYI, we noticed the following commit:
>>
>> commit: e2ae766c1b030271b5099b25674e2131d1d1e8c1 ("ext4: convert DAX faults to iomap infrastructure")
>> https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
>>
>> in testcase: nvml
>> with following parameters:
>>
>> 	group: vmem
>> 	test: pmem
>> 	nr_pmem: 1
>> 	fs: ext4
>> 	mount_option: dax
>>
>> on test machine: 64 threads Intel(R) Xeon(R) CPU E5-4650 0 @ 2.70GHz with 64G memory
>>
>> caused below changes:
>>
>> +------------------------------------------------+------------+------------+
>> |                                                | 96f8ba3dd6 | e2ae766c1b |
>> +------------------------------------------------+------------+------------+
>> | boot_successes                                 | 2          | 2          |
>> | boot_failures                                  | 2          | 2          |
>> | BUG:kernel_hang_in_test_stage                  | 2          |            |
>> | WARNING:at_fs/sysfs/dir.c:#sysfs_warn_dup      | 0          | 2          |
>> | calltrace:parport_pc_init                      | 0          | 2          |
>> | calltrace:SyS_finit_module                     | 0          | 2          |
>> | WARNING:at_lib/kobject.c:#kobject_add_internal | 0          | 2          |
>> +------------------------------------------------+------------+------------+
>>
>> user :notice: [ 325.592182] vmem_aligned_alloc/TEST1: SETUP (check/pmem/debug)
>>
>> user :notice: [ 325.603973] vmem_aligned_alloc/TEST1: START: vmem_aligned_alloc
>> kern :err : [ 325.608906] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:51
>> kern :err : [ 325.608908] in_atomic(): 1, irqs_disabled(): 0, pid: 24813, name: vmem_aligned_al
>> kern :warn : [ 325.608914] CPU: 44 PID: 24813 Comm: vmem_aligned_al Tainted: G O 4.9.0-rc4-00045-ge2ae766 #1
>> kern :warn : [ 325.608916] Hardware name: Intel Corporation LH Pass/S4600LH...., BIOS SE5C600.86B.99.02.1047.032320122259 03/23/2012
>> kern :warn : [ 325.608922] ffffc9002c1f7be0
>> kern :warn : [ 325.608923] ffffffff81466af9
>> kern :warn : [ 325.608924] ffff880fea2425c0
>
> I think this is actually a bug introduced by Ross' PMD support. Attached
> patch should fix it. Ross, can you check it please?

Hi, Jan

Could you provide a git tree commit for me to test it? If you want it
to be tested by 0day.

Best Regards,
Huang, Ying

> Honza

On Mon 12-12-16 18:00:20, Ted Tso wrote:
> On Mon, Dec 12, 2016 at 03:48:51PM -0700, Ross Zwisler wrote:
> > On Mon, Dec 12, 2016 at 05:37:36PM -0500, Theodore Ts'o wrote:
> > > Is this problem likely to happen in other file systems? Should I take
> > > this patch through the ext4 tree, or would it be better via some other
> > > git tree?
> > >
> > > - Ted
> >
> > The problem is in the generic DAX code and affects ext4 and xfs equally (ext2
> > doesn't support PMDs).
>
> Any preferences about how to send this patch to Linus? This issue is
> the only thing that was causing me to hold off on sending a pull
> request to Linus....

Yeah, just take it unless Dave already did.

Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

On Tue 13-12-16 09:27:51, Huang, Ying wrote:
> Jan Kara <jack@suse.cz> writes:
>
> > On Mon 12-12-16 18:13:21, kernel test robot wrote:
> >> FYI, we noticed the following commit:
> >>
> >> commit: e2ae766c1b030271b5099b25674e2131d1d1e8c1 ("ext4: convert DAX faults to iomap infrastructure")
> >> https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
> >>
> >> in testcase: nvml
> >> with following parameters:
> >>
> >> 	group: vmem
> >> 	test: pmem
> >> 	nr_pmem: 1
> >> 	fs: ext4
> >> 	mount_option: dax
> >>
> >> on test machine: 64 threads Intel(R) Xeon(R) CPU E5-4650 0 @ 2.70GHz with 64G memory
> >>
> >> caused below changes:
> >>
> >> +------------------------------------------------+------------+------------+
> >> |                                                | 96f8ba3dd6 | e2ae766c1b |
> >> +------------------------------------------------+------------+------------+
> >> | boot_successes                                 | 2          | 2          |
> >> | boot_failures                                  | 2          | 2          |
> >> | BUG:kernel_hang_in_test_stage                  | 2          |            |
> >> | WARNING:at_fs/sysfs/dir.c:#sysfs_warn_dup      | 0          | 2          |
> >> | calltrace:parport_pc_init                      | 0          | 2          |
> >> | calltrace:SyS_finit_module                     | 0          | 2          |
> >> | WARNING:at_lib/kobject.c:#kobject_add_internal | 0          | 2          |
> >> +------------------------------------------------+------------+------------+
> >>
> >> user :notice: [ 325.592182] vmem_aligned_alloc/TEST1: SETUP (check/pmem/debug)
> >>
> >> user :notice: [ 325.603973] vmem_aligned_alloc/TEST1: START: vmem_aligned_alloc
> >> kern :err : [ 325.608906] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:51
> >> kern :err : [ 325.608908] in_atomic(): 1, irqs_disabled(): 0, pid: 24813, name: vmem_aligned_al
> >> kern :warn : [ 325.608914] CPU: 44 PID: 24813 Comm: vmem_aligned_al Tainted: G O 4.9.0-rc4-00045-ge2ae766 #1
> >> kern :warn : [ 325.608916] Hardware name: Intel Corporation LH Pass/S4600LH...., BIOS SE5C600.86B.99.02.1047.032320122259 03/23/2012
> >> kern :warn : [ 325.608922] ffffc9002c1f7be0
> >> kern :warn : [ 325.608923] ffffffff81466af9
> >> kern :warn : [ 325.608924] ffff880fea2425c0
> >
> > I think this is actually a bug introduced by Ross' PMD support. Attached
> > patch should fix it. Ross, can you check it please?
>
> Hi, Jan
>
> Could you provide a git tree commit for me to test it? If you want it
> to be tested by 0day.

Thanks for the offer! I've pushed out the latest version of my DAX patches
including the above fix to

git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git dax

Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

Jan Kara <jack@suse.cz> writes:

> On Tue 13-12-16 09:27:51, Huang, Ying wrote:
>> Jan Kara <jack@suse.cz> writes:
>>
>> > On Mon 12-12-16 18:13:21, kernel test robot wrote:
>> >> FYI, we noticed the following commit:
>> >>
>> >> commit: e2ae766c1b030271b5099b25674e2131d1d1e8c1 ("ext4: convert DAX faults to iomap infrastructure")
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
>> >>
>> >> in testcase: nvml
>> >> with following parameters:
>> >>
>> >> 	group: vmem
>> >> 	test: pmem
>> >> 	nr_pmem: 1
>> >> 	fs: ext4
>> >> 	mount_option: dax
>> >>
>> >> on test machine: 64 threads Intel(R) Xeon(R) CPU E5-4650 0 @ 2.70GHz with 64G memory
>> >>
>> >> caused below changes:
>> >>
>> >> +------------------------------------------------+------------+------------+
>> >> |                                                | 96f8ba3dd6 | e2ae766c1b |
>> >> +------------------------------------------------+------------+------------+
>> >> | boot_successes                                 | 2          | 2          |
>> >> | boot_failures                                  | 2          | 2          |
>> >> | BUG:kernel_hang_in_test_stage                  | 2          |            |
>> >> | WARNING:at_fs/sysfs/dir.c:#sysfs_warn_dup      | 0          | 2          |
>> >> | calltrace:parport_pc_init                      | 0          | 2          |
>> >> | calltrace:SyS_finit_module                     | 0          | 2          |
>> >> | WARNING:at_lib/kobject.c:#kobject_add_internal | 0          | 2          |
>> >> +------------------------------------------------+------------+------------+
>> >>
>> >> user :notice: [ 325.592182] vmem_aligned_alloc/TEST1: SETUP (check/pmem/debug)
>> >>
>> >> user :notice: [ 325.603973] vmem_aligned_alloc/TEST1: START: vmem_aligned_alloc
>> >> kern :err : [ 325.608906] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:51
>> >> kern :err : [ 325.608908] in_atomic(): 1, irqs_disabled(): 0, pid: 24813, name: vmem_aligned_al
>> >> kern :warn : [ 325.608914] CPU: 44 PID: 24813 Comm: vmem_aligned_al Tainted: G O 4.9.0-rc4-00045-ge2ae766 #1
>> >> kern :warn : [ 325.608916] Hardware name: Intel Corporation LH Pass/S4600LH...., BIOS SE5C600.86B.99.02.1047.032320122259 03/23/2012
>> >> kern :warn : [ 325.608922] ffffc9002c1f7be0
>> >> kern :warn : [ 325.608923] ffffffff81466af9
>> >> kern :warn : [ 325.608924] ffff880fea2425c0
>> >
>> > I think this is actually a bug introduced by Ross' PMD support. Attached
>> > patch should fix it. Ross, can you check it please?
>>
>> Hi, Jan
>>
>> Could you provide a git tree commit for me to test it? If you want it
>> to be tested by 0day.
>
> Thanks for the offer! I've pushed out the latest version of my DAX patches
> including the above fix to
>
> git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git dax

The test results for the head and its parent are as follows:

=========================================================================================
compiler/fs/group/kconfig/mount_option/nr_pmem/rootfs/tbox_group/test/testcase:
  gcc-6/ext4/vmem/x86_64-rhel-7.2/dax/1/debian-x86_64-2016-08-31.cgz/lkp-sbx04/pmem/nvml

commit:
  c6207e9b753f466bc2e41455dc7611869d439d4e
  4393b9bdd5043e550f1bcaf7f4e9c413d0088425

c6207e9b753f466b 4393b9bdd5043e550f1bcaf7f4
---------------- --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
          3:3         -100%            :3     dmesg.BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c
          3:3         -100%            :3     kmsg.in_atomic():#,irqs_disabled():#,pid:#,name:vmem_aligned_al
          3:3         -100%            :3     kmsg.in_atomic():#,irqs_disabled():#,pid:#,name:vmem_multiple_p

The bug has been fixed. Feel free to add,

Tested-by: "Huang, Ying" <ying.huang@intel.com>

Best Regards,
Huang, Ying

From c3d67dc7543abc03161f6cf357039ad9e56783ca Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 12 Dec 2016 16:32:23 +0100
Subject: [PATCH] dax: Fix sleep in atomic contex in grab_mapping_entry()

Commit 7b5b8c9c4ac9 "dax: add struct iomap based DAX PMD support" has
introduced unmapping of page tables if huge page needs to be split in
grab_mapping_entry(). However the unmapping happens after
radix_tree_preload() call which disables preemption and thus
unmap_mapping_range() tries to acquire i_mmap_lock in atomic context
which is a bug. Fix the problem by moving unmapping before
radix_tree_preload() call.

Fixes: 7b5b8c9c4ac9716fe9d77ec56ae5d962192ba030
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/dax.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 51b03e91d3e2..5c74f60d0a50 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -351,14 +351,6 @@ static void *grab_mapping_entry(struct address_space *mapping, pgoff_t index,
 		}
 
 		spin_unlock_irq(&mapping->tree_lock);
-		err = radix_tree_preload(
-				mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM);
-		if (err) {
-			if (pmd_downgrade)
-				put_locked_mapping_entry(mapping, index, entry);
-			return ERR_PTR(err);
-		}
-
 		/*
 		 * Besides huge zero pages the only other thing that gets
 		 * downgraded are empty entries which don't need to be
@@ -368,6 +360,13 @@ static void *grab_mapping_entry(struct address_space *mapping, pgoff_t index,
 			unmap_mapping_range(mapping,
 				(index << PAGE_SHIFT) & PMD_MASK, PMD_SIZE, 0);
 
+		err = radix_tree_preload(
+				mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM);
+		if (err) {
+			if (pmd_downgrade)
+				put_locked_mapping_entry(mapping, index, entry);
+			return ERR_PTR(err);
+		}
 		spin_lock_irq(&mapping->tree_lock);
 
 		if (pmd_downgrade) {
-- 
2.10.2
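
For readers who want to see the ordering problem in isolation, here is a minimal userspace sketch of what the commit message describes. It is not kernel code: every name in it (preempt_count, atomic_sleep_bugs, model_might_sleep(), model_preload(), model_preload_end(), model_unmap_mapping_range(), grab_entry_buggy(), grab_entry_fixed()) is a hypothetical stand-in. It assumes only what the patch states: the preload step leaves preemption disabled, and the unmap step takes a lock that may sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; these are NOT the kernel's real symbols. */
int preempt_count;      /* > 0 stands for "atomic context" */
int atomic_sleep_bugs;  /* times a sleeping call ran while atomic */

/* Stand-in for the check that CONFIG_DEBUG_ATOMIC_SLEEP performs. */
static void model_might_sleep(const char *who)
{
	if (preempt_count > 0) {
		atomic_sleep_bugs++;
		printf("BUG: sleeping function called from invalid context (%s)\n",
		       who);
	}
}

/* Models radix_tree_preload(): on success, preemption stays disabled. */
static int model_preload(void)
{
	preempt_count++;
	return 0;
}

/* Models the matching "preload end" step that re-enables preemption. */
static void model_preload_end(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Models unmap_mapping_range(): takes a lock that may sleep. */
static void model_unmap_mapping_range(void)
{
	model_might_sleep("model_unmap_mapping_range");
}

/* Ordering before the patch: preload first, then unmap. */
int grab_entry_buggy(bool pmd_downgrade)
{
	if (model_preload())
		return -1;
	if (pmd_downgrade)
		model_unmap_mapping_range();	/* runs in atomic context */
	model_preload_end();
	return 0;
}

/* Ordering after the patch: unmap first, then preload. */
int grab_entry_fixed(bool pmd_downgrade)
{
	if (pmd_downgrade)
		model_unmap_mapping_range();	/* still allowed to sleep */
	if (model_preload())
		return -1;
	model_preload_end();
	return 0;
}
```

In this model, grab_entry_buggy(true) trips the check once, while grab_entry_fixed(true) and either variant called with false do not. That mirrors why the report only fires on the PMD downgrade path.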