From patchwork Thu Sep 9 01:10:02 2021
X-Patchwork-Submitter: Andrew Morton <akpm@linux-foundation.org>
X-Patchwork-Id: 508754
Date: Wed, 08 Sep 2021 18:10:02 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, hch@lst.de, jgg@nvidia.com,
 linux-mm@kvack.org, lizhijian@cn.fujitsu.com, mm-commits@vger.kernel.org,
 stable@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 1/8] mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled
Message-ID: <20210909011002.YtOxlcd0s%akpm@linux-foundation.org>
In-Reply-To: <20210908180859.d523d4bb4ad8eec11c61500d@linux-foundation.org>
X-Mailing-List: stable@vger.kernel.org

From: Li Zhijian <lizhijian@cn.fujitsu.com>
Subject: mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled

We noticed that one rpma example, which uses the ODP feature to do RDMA
WRITE between fsdax files, has been failing[1] since commit 36f30e486d.
After digging into the code, we found that hmm_vma_handle_pte() still
returns -EFAULT even though all of its requested flags have been
fulfilled.  That is because a DAX page is marked
(_PAGE_SPECIAL | _PAGE_DEVMAP) by pte_mkdevmap(), so the pte_special()
check rejects it.

[1]: https://github.com/pmem/rpma/issues/1142

Link: https://lkml.kernel.org/r/20210830094232.203029-1-lizhijian@cn.fujitsu.com
Fixes: 405506274922 ("mm/hmm: add missing call to hmm_pte_need_fault in HMM_PFN_SPECIAL handling")
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/hmm.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/mm/hmm.c~mm-hmm-bypass-devmap-pte-when-all-pfn-requested-flags-are-fulfilled
+++ a/mm/hmm.c
@@ -295,10 +295,13 @@ static int hmm_vma_handle_pte(struct mm_
 		goto fault;
 
 	/*
+	 * Bypass devmap pte such as DAX page when all pfn requested
+	 * flags(pfn_req_flags) are fulfilled.
 	 * Since each architecture defines a struct page for the zero page, just
 	 * fall through and treat it like a normal page.
 	 */
-	if (pte_special(pte) && !is_zero_pfn(pte_pfn(pte))) {
+	if (pte_special(pte) && !pte_devmap(pte) &&
+	    !is_zero_pfn(pte_pfn(pte))) {
 		if (hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0)) {
 			pte_unmap(ptep);
 			return -EFAULT;
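
[Editorial illustration: the flag interaction above can be modelled in a
few lines of standalone C.  This is a toy sketch, not the kernel's pte
API: PTE_SPECIAL/PTE_DEVMAP and both helper functions are stand-ins, and
the zero-page exemption is elided.]

	#include <stdbool.h>
	#include <stdio.h>

	/* A DAX pte carries both bits, as set by pte_mkdevmap(). */
	#define PTE_SPECIAL (1u << 0)
	#define PTE_DEVMAP  (1u << 1)

	static bool faults_before_fix(unsigned int pte_flags)
	{
		/* old check: every special pte took the fault path */
		return pte_flags & PTE_SPECIAL;
	}

	static bool faults_after_fix(unsigned int pte_flags)
	{
		/* new check: devmap ptes (e.g. DAX) are exempted */
		return (pte_flags & PTE_SPECIAL) && !(pte_flags & PTE_DEVMAP);
	}

	int main(void)
	{
		unsigned int dax_pte = PTE_SPECIAL | PTE_DEVMAP;

		printf("before: %d  after: %d\n",
		       faults_before_fix(dax_pte), faults_after_fix(dax_pte));
		return 0;	/* prints "before: 1  after: 0" */
	}
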
From patchwork Thu Sep 9 01:10:05 2021
X-Patchwork-Submitter: Andrew Morton <akpm@linux-foundation.org>
X-Patchwork-Id: 509193
Date: Wed, 08 Sep 2021 18:10:05 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, linux-mm@kvack.org, liuzixian4@huawei.com,
 mike.kravetz@oracle.com, mm-commits@vger.kernel.org,
 naoya.horiguchi@nec.com, stable@vger.kernel.org,
 torvalds@linux-foundation.org
Subject: [patch 2/8] mm/hugetlb: initialize hugetlb_usage in mm_init
Message-ID: <20210909011005.6UTH8UzKk%akpm@linux-foundation.org>
In-Reply-To: <20210908180859.d523d4bb4ad8eec11c61500d@linux-foundation.org>
X-Mailing-List: stable@vger.kernel.org

From: Liu Zixian <liuzixian4@huawei.com>
Subject: mm/hugetlb: initialize hugetlb_usage in mm_init

After fork, the child process gets an incorrect (doubled)
hugetlb_usage.  If a process uses five 2MB hugetlb pages in an
anonymous mapping,

	HugetlbPages:	   10240 kB

and then forks, the child shows

	HugetlbPages:	   20480 kB

The amount is doubled because hugetlb_usage is copied from the parent
and then increased again when the page tables are copied from parent to
child, leaving the child with twice the actual usage.  Fix this by
adding hugetlb_count_init() in mm_init().
Link: https://lkml.kernel.org/r/20210826071742.877-1-liuzixian4@huawei.com
Fixes: 5d317b2b6536 ("mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status")
Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/hugetlb.h |    9 +++++++++
 kernel/fork.c           |    1 +
 2 files changed, 10 insertions(+)

--- a/include/linux/hugetlb.h~mm-hugetlb-initialize-hugetlb_usage-in-mm_init
+++ a/include/linux/hugetlb.h
@@ -858,6 +858,11 @@ static inline spinlock_t *huge_pte_lockp
 
 void hugetlb_report_usage(struct seq_file *m, struct mm_struct *mm);
 
+static inline void hugetlb_count_init(struct mm_struct *mm)
+{
+	atomic_long_set(&mm->hugetlb_usage, 0);
+}
+
 static inline void hugetlb_count_add(long l, struct mm_struct *mm)
 {
 	atomic_long_add(l, &mm->hugetlb_usage);
@@ -1042,6 +1047,10 @@ static inline spinlock_t *huge_pte_lockp
 	return &mm->page_table_lock;
 }
 
+static inline void hugetlb_count_init(struct mm_struct *mm)
+{
+}
+
 static inline void hugetlb_report_usage(struct seq_file *f,
 					struct mm_struct *m)
 {
 }
--- a/kernel/fork.c~mm-hugetlb-initialize-hugetlb_usage-in-mm_init
+++ a/kernel/fork.c
@@ -1063,6 +1063,7 @@ static struct mm_struct *mm_init(struct
 	mm->pmd_huge_pte = NULL;
 #endif
 	mm_init_uprobes_state(mm);
+	hugetlb_count_init(mm);
 
 	if (current->mm) {
 		mm->flags = current->mm->flags & MMF_INIT_MASK;
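
[Editorial illustration: the double counting is easy to see in a small
userspace model.  This is a toy sketch under editorial assumptions, not
kernel code: struct mm and copy_page_tables() stand in for mm_struct,
dup_mm() and the real copy path.]

	#include <stdio.h>

	struct mm { long hugetlb_usage; };

	static void copy_page_tables(struct mm *child, const struct mm *parent)
	{
		/* models hugetlb_count_add() during the page-table copy */
		child->hugetlb_usage += parent->hugetlb_usage;
	}

	int main(void)
	{
		struct mm parent = { .hugetlb_usage = 5 }; /* five 2MB pages */
		struct mm child = parent;	/* dup_mm() copies the struct */

		/* child.hugetlb_usage = 0;  <-- the fix: hugetlb_count_init() */
		copy_page_tables(&child, &parent);
		printf("child usage: %ld\n", child.hugetlb_usage); /* 10, not 5 */
		return 0;
	}
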
From patchwork Thu Sep 9 01:10:08 2021
X-Patchwork-Submitter: Andrew Morton <akpm@linux-foundation.org>
X-Patchwork-Id: 508753
Date: Wed, 08 Sep 2021 18:10:08 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, chris@chrisdown.name, guro@fb.com,
 hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@suse.com,
 mm-commits@vger.kernel.org, riel@surriel.com, stable@vger.kernel.org,
 torvalds@linux-foundation.org
Subject: [patch 3/8] mm,vmscan: fix divide by zero in get_scan_count
Message-ID: <20210909011008.OFzJhcuDA%akpm@linux-foundation.org>
In-Reply-To: <20210908180859.d523d4bb4ad8eec11c61500d@linux-foundation.org>
X-Mailing-List: stable@vger.kernel.org

From: Rik van Riel <riel@surriel.com>
Subject: mm,vmscan: fix divide by zero in get_scan_count

Changeset f56ce412a59d ("mm: memcontrol: fix occasional OOMs due to
proportional memory.low reclaim") introduced a divide-by-zero corner
case when oomd is used in combination with cgroup memory.low
protection.

When oomd decides to kill a cgroup, it forces the cgroup's memory to be
reclaimed after killing the tasks, by writing to the memory.max file
for that cgroup, forcing the remaining page cache and reclaimable slab
to be reclaimed down to zero.

Previously, on cgroups with some memory.low protection, that would
result in the memory being reclaimed down to the memory.low limit, or
likely not at all, with the page cache being reclaimed asynchronously
later.

With f56ce412a59d the oomd write to memory.max tries to reclaim all the
way down to zero, which may race with another reclaimer, to the point
of ending up with the divide by zero below.

This patch implements the obvious fix.

Link: https://lkml.kernel.org/r/20210826220149.058089c6@imladris.surriel.com
Fixes: f56ce412a59d ("mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim")
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Chris Down <chris@chrisdown.name>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/vmscan.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/vmscan.c~mmvmscan-fix-divide-by-zero-in-get_scan_count
+++ a/mm/vmscan.c
@@ -2715,7 +2715,7 @@ out:
 			cgroup_size = max(cgroup_size, protection);
 
 			scan = lruvec_size - lruvec_size * protection /
-				cgroup_size;
+				(cgroup_size + 1);
 
 			/*
 			 * Minimally target SWAP_CLUSTER_MAX pages to keep
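
[Editorial illustration: the arithmetic corner case in a self-contained
toy.  scan_target() is an editorial stand-in for the proportional-reclaim
expression in get_scan_count(), not kernel code.]

	#include <stdio.h>

	/* When a racing reclaimer drives both the observed cgroup usage
	 * and the effective protection to zero, the old divisor was zero. */
	static unsigned long scan_target(unsigned long lruvec_size,
					 unsigned long protection,
					 unsigned long cgroup_size)
	{
		if (cgroup_size < protection)
			cgroup_size = protection;	/* max() in the kernel */

		/* the fix: "+ 1" keeps the divisor nonzero in the racy case */
		return lruvec_size - lruvec_size * protection / (cgroup_size + 1);
	}

	int main(void)
	{
		/* racy corner case: usage and protection both observed as 0;
		 * without the "+ 1" this line would raise SIGFPE */
		printf("scan = %lu\n", scan_target(32, 0, 0)); /* prints 32 */
		return 0;
	}
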
From patchwork Thu Sep 9 01:10:11 2021
X-Patchwork-Submitter: Andrew Morton <akpm@linux-foundation.org>
X-Patchwork-Id: 509192
Date: Wed, 08 Sep 2021 18:10:11 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, david@redhat.com, linmiaohe@huawei.com,
 linux-mm@kvack.org, mgorman@techsingularity.net,
 mm-commits@vger.kernel.org, stable@vger.kernel.org,
 torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 4/8] mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype
Message-ID: <20210909011011.ijPe0NI2o%akpm@linux-foundation.org>
In-Reply-To: <20210908180859.d523d4bb4ad8eec11c61500d@linux-foundation.org>
X-Mailing-List: stable@vger.kernel.org

From: Miaohe Lin <linmiaohe@huawei.com>
Subject: mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype

If a page is not prepared for freeing, its pcp migratetype is never
set.  We would then read garbage from get_pcppage_migratetype() and
might list_del() &page->lru a second time after it has already been
deleted from the list, leading to complaints about data corruption.

Link: https://lkml.kernel.org/r/20210902115447.57050-1-linmiaohe@huawei.com
Fixes: df1acc856923 ("mm/page_alloc: avoid conflating IRQs disabled with zone->lock")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/page_alloc.c~mm-page_allocc-avoid-accessing-uninitialized-pcp-page-migratetype
+++ a/mm/page_alloc.c
@@ -3428,8 +3428,10 @@ void free_unref_page_list(struct list_he
 	/* Prepare pages for freeing */
 	list_for_each_entry_safe(page, next, list, lru) {
 		pfn = page_to_pfn(page);
-		if (!free_unref_page_prepare(page, pfn, 0))
+		if (!free_unref_page_prepare(page, pfn, 0)) {
 			list_del(&page->lru);
+			continue;
+		}
 
 		/*
 		 * Free isolated pages directly to the allocator, see
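
[Editorial illustration: the shape of the control-flow fix, as a toy.
prepare() is an editorial stand-in for free_unref_page_prepare() and the
printfs stand in for the real freeing paths; not kernel code.]

	#include <stdbool.h>
	#include <stdio.h>

	static bool prepare(int page) { return page % 2 == 0; } /* stand-in */

	int main(void)
	{
		int pages[] = { 1, 2, 3, 4 };

		for (int i = 0; i < 4; i++) {
			if (!prepare(pages[i])) {
				/* list_del(&page->lru) in the kernel */
				printf("page %d: unprepared, dropped\n", pages[i]);
				continue;	/* the fix: skip the rest */
			}
			/* only prepared pages, with a valid migratetype,
			 * reach this point now */
			printf("page %d: freed via pcp\n", pages[i]);
		}
		return 0;
	}
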
From patchwork Thu Sep 9 01:10:20 2021
X-Patchwork-Submitter: Andrew Morton <akpm@linux-foundation.org>
X-Patchwork-Id: 508752
Date: Wed, 08 Sep 2021 18:10:20 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, linux-mm@kvack.org,
 mm-commits@vger.kernel.org, songmuchun@bytedance.com,
 stable@vger.kernel.org, torvalds@linux-foundation.org,
 yanghui.def@bytedance.com
Subject: [patch 7/8] mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task
Message-ID: <20210909011020.5uPjbdM2T%akpm@linux-foundation.org>
In-Reply-To: <20210908180859.d523d4bb4ad8eec11c61500d@linux-foundation.org>
X-Mailing-List: stable@vger.kernel.org

From: yanghui <yanghui.def@bytedance.com>
Subject: mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task

Our servers hit the panic below (kernel version 5.4.56):

BUG: unable to handle page fault for address: 0000000000002c48
RIP: 0010:__next_zones_zonelist+0x1d/0x40
[264003.977696] RAX: 0000000000002c40 RBX: 0000000000100dca RCX: 0000000000000014
[264003.977872] Call Trace:
[264003.977888]  __alloc_pages_nodemask+0x277/0x310
[264003.977908]  alloc_page_interleave+0x13/0x70
[264003.977926]  handle_mm_fault+0xf99/0x1390
[264003.977951]  __do_page_fault+0x288/0x500
[264003.977979]  ? schedule+0x39/0xa0
[264003.977994]  do_page_fault+0x30/0x110
[264003.978010]  page_fault+0x3e/0x50

The panic happens because MAX_NUMNODES is passed as the third parameter
(preferred_nid) to __alloc_pages_nodemask(), so the access to
zonelist->zoneref->zone_idx in __next_zones_zonelist() faults.

In offset_il_node(), first_node() returns a nid from pol->v.nodes;
after that, other threads may change pol->v.nodes before next_node()
runs.  This race can make next_node() return MAX_NUMNODES.  Fix it by
putting pol->nodes in a local variable.

The race is between offset_il_node() and cpuset_change_task_nodemask():

CPU0:                                  CPU1:
alloc_pages_vma()
  interleave_nid(pol,)
    offset_il_node(pol,)
      first_node(pol->v.nodes)
                                       cpuset_change_task_nodemask
                                       //nodes==0xc
                                         mpol_rebind_task
                                           mpol_rebind_policy
                                             mpol_rebind_nodemask(pol,nodes)
                                       //nodes==0x3
      next_node(nid, pol->v.nodes)
      //return MAX_NUMNODES

Link: https://lkml.kernel.org/r/20210906034658.48721-1-yanghui.def@bytedance.com
Signed-off-by: yanghui <yanghui.def@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/mempolicy.c |   17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

--- a/mm/mempolicy.c~mm-mempolicy-fix-a-race-between-offset_il_node-and-mpol_rebind_task
+++ a/mm/mempolicy.c
@@ -1876,17 +1876,26 @@ unsigned int mempolicy_slab_node(void)
  */
 static unsigned offset_il_node(struct mempolicy *pol, unsigned long n)
 {
-	unsigned nnodes = nodes_weight(pol->nodes);
-	unsigned target;
+	nodemask_t nodemask = pol->nodes;
+	unsigned int target, nnodes;
 	int i;
 	int nid;
+	/*
+	 * The barrier will stabilize the nodemask in a register or on
+	 * the stack so that it will stop changing under the code.
+	 *
+	 * Between first_node() and next_node(), pol->nodes could be changed
+	 * by other threads. So we put pol->nodes in a local stack.
+	 */
+	barrier();
 
+	nnodes = nodes_weight(nodemask);
 	if (!nnodes)
 		return numa_node_id();
 	target = (unsigned int)n % nnodes;
-	nid = first_node(pol->nodes);
+	nid = first_node(nodemask);
 	for (i = 0; i < target; i++)
-		nid = next_node(nid, pol->nodes);
+		nid = next_node(nid, nodemask);
 	return nid;
 }
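
[Editorial illustration: the race can be reproduced single-threaded by
mutating the mask between the two walk steps.  This is a toy sketch:
the bitmask-based first_node()/next_node() below are simplified
editorial stand-ins for the kernel's nodemask helpers.]

	#include <stdio.h>

	#define MAX_NUMNODES 16

	static int next_node(int nid, unsigned int mask)
	{
		for (nid++; nid < MAX_NUMNODES; nid++)
			if (mask & (1u << nid))
				return nid;
		return MAX_NUMNODES;	/* sentinel: no further node set */
	}

	static int first_node(unsigned int mask) { return next_node(-1, mask); }

	int main(void)
	{
		unsigned int shared = 0xc;	/* nodes {2,3} */
		int nid = first_node(shared);	/* 2 */

		shared = 0x3;	/* racing rebind changes it to nodes {0,1} */
		/* the walk runs off the end: prints 16 (MAX_NUMNODES) */
		printf("racy walk:  %d\n", next_node(nid, shared));

		unsigned int local = 0x3;	/* the fix: snapshot the mask */
		nid = first_node(local);
		printf("fixed walk: %d\n", next_node(nid, local));	/* 1 */
		return 0;
	}
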