From patchwork Fri Jun 25 01:39:01 2021
Date: Thu, 24 Jun 2021 18:39:01 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 01/24] mm: page_vma_mapped_walk(): use page for pvmw->page
Message-ID: <20210625013901.BV8rXpKTb%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm: page_vma_mapped_walk(): use page for pvmw->page

Patch series "mm: page_vma_mapped_walk() cleanup and THP fixes".

I've marked all of these for stable: many are merely cleanups, but I
think they are much better before the main fix than after.

This patch (of 11):

page_vma_mapped_walk() cleanup: sometimes the local copy of pvmw->page
was used, sometimes pvmw->page itself: use the local copy "page"
throughout.

Link: https://lkml.kernel.org/r/589b358c-febc-c88e-d4c2-7834b37fa7bf@google.com
Link: https://lkml.kernel.org/r/88e67645-f467-c279-bf5e-af4b5c6b13eb@google.com
Signed-off-by: Hugh Dickins
Reviewed-by: Alistair Popple
Acked-by: Kirill A. Shutemov
Reviewed-by: Peter Xu
Cc: Yang Shi
Cc: Wang Yugui
Cc: Matthew Wilcox
Cc: Ralph Campbell
Cc: Zi Yan
Cc: Will Deacon
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/mm/page_vma_mapped.c~mm-page_vma_mapped_walk-use-page-for-pvmw-page
+++ a/mm/page_vma_mapped.c
@@ -156,7 +156,7 @@ bool page_vma_mapped_walk(struct page_vm
 	if (pvmw->pte)
 		goto next_pte;
 
-	if (unlikely(PageHuge(pvmw->page))) {
+	if (unlikely(PageHuge(page))) {
 		/* when pud is not present, pte will be NULL */
 		pvmw->pte = huge_pte_offset(mm, pvmw->address, page_size(page));
 		if (!pvmw->pte)
@@ -217,8 +217,7 @@ restart:
 		 * cannot return prematurely, while zap_huge_pmd() has
 		 * cleared *pmd but not decremented compound_mapcount().
 		 */
-		if ((pvmw->flags & PVMW_SYNC) &&
-		    PageTransCompound(pvmw->page)) {
+		if ((pvmw->flags & PVMW_SYNC) && PageTransCompound(page)) {
 			spinlock_t *ptl = pmd_lock(mm, pvmw->pmd);
 
 			spin_unlock(ptl);
@@ -234,9 +233,9 @@ restart:
 		return true;
 next_pte:
 	/* Seek to next pte only makes sense for THP */
-	if (!PageTransHuge(pvmw->page) || PageHuge(pvmw->page))
+	if (!PageTransHuge(page) || PageHuge(page))
 		return not_found(pvmw);
-	end = vma_address_end(pvmw->page, pvmw->vma);
+	end = vma_address_end(page, pvmw->vma);
 	do {
 		pvmw->address += PAGE_SIZE;
 		if (pvmw->address >= end)
From patchwork Fri Jun 25 01:39:04 2021
Date: Thu, 24 Jun 2021 18:39:04 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 02/24] mm: page_vma_mapped_walk(): settle PageHuge on entry
Message-ID: <20210625013904.dOSX0-9cG%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm: page_vma_mapped_walk(): settle PageHuge on entry

page_vma_mapped_walk() cleanup: get the hugetlbfs PageHuge case out of
the way at the start, so no need to worry about it later.

Link: https://lkml.kernel.org/r/e31a483c-6d73-a6bb-26c5-43c3b880a2@google.com
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Reviewed-by: Peter Xu
Cc: Alistair Popple
Cc: "Kirill A. Shutemov"
Cc: Matthew Wilcox
Cc: Ralph Campbell
Cc: Wang Yugui
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

--- a/mm/page_vma_mapped.c~mm-page_vma_mapped_walk-settle-pagehuge-on-entry
+++ a/mm/page_vma_mapped.c
@@ -153,10 +153,11 @@ bool page_vma_mapped_walk(struct page_vm
 	if (pvmw->pmd && !pvmw->pte)
 		return not_found(pvmw);
 
-	if (pvmw->pte)
-		goto next_pte;
-
 	if (unlikely(PageHuge(page))) {
+		/* The only possible mapping was handled on last iteration */
+		if (pvmw->pte)
+			return not_found(pvmw);
+
 		/* when pud is not present, pte will be NULL */
 		pvmw->pte = huge_pte_offset(mm, pvmw->address, page_size(page));
 		if (!pvmw->pte)
@@ -168,6 +169,9 @@ bool page_vma_mapped_walk(struct page_vm
 			return not_found(pvmw);
 		return true;
 	}
+
+	if (pvmw->pte)
+		goto next_pte;
 restart:
 	pgd = pgd_offset(mm, pvmw->address);
 	if (!pgd_present(*pgd))
@@ -233,7 +237,7 @@ restart:
 		return true;
 next_pte:
 	/* Seek to next pte only makes sense for THP */
-	if (!PageTransHuge(page) || PageHuge(page))
+	if (!PageTransHuge(page))
 		return not_found(pvmw);
 	end = vma_address_end(page, pvmw->vma);
 	do {
From patchwork Fri Jun 25 01:39:07 2021
Date: Thu, 24 Jun 2021 18:39:07 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 03/24] mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd
Message-ID: <20210625013907.PBDZuqY27%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm: page_vma_mapped_walk(): use pmde for *pvmw->pmd

page_vma_mapped_walk() cleanup: re-evaluate pmde after taking lock, then
use it in subsequent tests, instead of repeatedly dereferencing pointer.

Link: https://lkml.kernel.org/r/53fbc9d-891e-46b2-cb4b-468c3b19238e@google.com
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Reviewed-by: Peter Xu
Cc: Alistair Popple
Cc: Matthew Wilcox
Cc: Ralph Campbell
Cc: Wang Yugui
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/mm/page_vma_mapped.c~mm-page_vma_mapped_walk-use-pmde-for-pvmw-pmd
+++ a/mm/page_vma_mapped.c
@@ -191,18 +191,19 @@ restart:
 	pmde = READ_ONCE(*pvmw->pmd);
 	if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
 		pvmw->ptl = pmd_lock(mm, pvmw->pmd);
-		if (likely(pmd_trans_huge(*pvmw->pmd))) {
+		pmde = *pvmw->pmd;
+		if (likely(pmd_trans_huge(pmde))) {
 			if (pvmw->flags & PVMW_MIGRATION)
 				return not_found(pvmw);
-			if (pmd_page(*pvmw->pmd) != page)
+			if (pmd_page(pmde) != page)
 				return not_found(pvmw);
 			return true;
-		} else if (!pmd_present(*pvmw->pmd)) {
+		} else if (!pmd_present(pmde)) {
 			if (thp_migration_supported()) {
 				if (!(pvmw->flags & PVMW_MIGRATION))
 					return not_found(pvmw);
-				if (is_migration_entry(pmd_to_swp_entry(*pvmw->pmd))) {
-					swp_entry_t entry = pmd_to_swp_entry(*pvmw->pmd);
+				if (is_migration_entry(pmd_to_swp_entry(pmde))) {
+					swp_entry_t entry = pmd_to_swp_entry(pmde);
 
 					if (migration_entry_to_page(entry) != page)
 						return not_found(pvmw);
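A note on the idiom behind patch 03: the first READ_ONCE() is only an
unlocked hint about whether locking is worthwhile, so the value must be
re-read once pmd_lock() is held, and every later test should use that
stable local copy. A minimal userspace sketch of the same "peek, lock,
re-read" pattern - the item/state names are hypothetical, not the kernel
API:

	#include <pthread.h>
	#include <stdio.h>

	struct item {
		pthread_mutex_t lock;
		int state;		/* 0 = idle, 1 = ready, 2 = consumed */
	};

	/* Peek without the lock to skip locking in the common case, then
	 * re-read under the lock: the unlocked snapshot may already be
	 * stale by the time the lock is acquired. */
	static int consume_if_ready(struct item *it)
	{
		int state = __atomic_load_n(&it->state, __ATOMIC_RELAXED); /* like READ_ONCE() */

		if (state != 1)
			return 0;
		pthread_mutex_lock(&it->lock);
		state = it->state;	/* re-evaluate: stable while the lock is held */
		if (state == 1)
			it->state = 2;
		pthread_mutex_unlock(&it->lock);
		return state == 1;
	}

	int main(void)
	{
		struct item it = { PTHREAD_MUTEX_INITIALIZER, 1 };

		printf("consumed: %d, state now: %d\n", consume_if_ready(&it), it.state);
		return 0;
	}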
From patchwork Fri Jun 25 01:39:10 2021
Date: Thu, 24 Jun 2021 18:39:10 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 04/24] mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block
Message-ID: <20210625013910.2BnPsIODO%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION block

page_vma_mapped_walk() cleanup: rearrange the !pmd_present() block to
follow the same "return not_found, return not_found, return true" pattern
as the block above it (note: returning not_found there is never
premature, since existence or prior existence of huge pmd guarantees good
alignment).

Link: https://lkml.kernel.org/r/378c8650-1488-2edf-9647-32a53cf2e21@google.com
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Reviewed-by: Peter Xu
Cc: Alistair Popple
Cc: Matthew Wilcox
Cc: Ralph Campbell
Cc: Wang Yugui
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |   30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

--- a/mm/page_vma_mapped.c~mm-page_vma_mapped_walk-prettify-pvmw_migration-block
+++ a/mm/page_vma_mapped.c
@@ -198,24 +198,22 @@ restart:
 			if (pmd_page(pmde) != page)
 				return not_found(pvmw);
 			return true;
-		} else if (!pmd_present(pmde)) {
-			if (thp_migration_supported()) {
-				if (!(pvmw->flags & PVMW_MIGRATION))
-					return not_found(pvmw);
-				if (is_migration_entry(pmd_to_swp_entry(pmde))) {
-					swp_entry_t entry = pmd_to_swp_entry(pmde);
+		}
+		if (!pmd_present(pmde)) {
+			swp_entry_t entry;
 
-					if (migration_entry_to_page(entry) != page)
-						return not_found(pvmw);
-					return true;
-				}
-			}
-			return not_found(pvmw);
-		} else {
-			/* THP pmd was split under us: handle on pte level */
-			spin_unlock(pvmw->ptl);
-			pvmw->ptl = NULL;
+			if (!thp_migration_supported() ||
+			    !(pvmw->flags & PVMW_MIGRATION))
+				return not_found(pvmw);
+			entry = pmd_to_swp_entry(pmde);
+			if (!is_migration_entry(entry) ||
+			    migration_entry_to_page(entry) != page)
+				return not_found(pvmw);
+			return true;
 		}
+		/* THP pmd was split under us: handle on pte level */
+		spin_unlock(pvmw->ptl);
+		pvmw->ptl = NULL;
 	} else if (!pmd_present(pmde)) {
 		/*
 		 * If PVMW_SYNC, take and drop THP pmd lock so that we
From patchwork Fri Jun 25 01:39:14 2021
Date: Thu, 24 Jun 2021 18:39:14 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 05/24] mm: page_vma_mapped_walk(): crossing page table boundary
Message-ID: <20210625013914.f4FSy3CBd%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm: page_vma_mapped_walk(): crossing page table boundary

page_vma_mapped_walk() cleanup: adjust the test for crossing page table
boundary - I believe pvmw->address is always page-aligned, but nothing
else here assumed that; and remember to reset pvmw->pte to NULL after
unmapping the page table, though I never saw any bug from that.

Link: https://lkml.kernel.org/r/799b3f9c-2a9e-dfef-5d89-26e9f76fd97@google.com
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Cc: Alistair Popple
Cc: Matthew Wilcox
Cc: Peter Xu
Cc: Ralph Campbell
Cc: Wang Yugui
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/mm/page_vma_mapped.c~mm-page_vma_mapped_walk-crossing-page-table-boundary
+++ a/mm/page_vma_mapped.c
@@ -244,16 +244,16 @@ next_pte:
 			if (pvmw->address >= end)
 				return not_found(pvmw);
 			/* Did we cross page table boundary? */
-			if (pvmw->address % PMD_SIZE == 0) {
-				pte_unmap(pvmw->pte);
+			if ((pvmw->address & (PMD_SIZE - PAGE_SIZE)) == 0) {
 				if (pvmw->ptl) {
 					spin_unlock(pvmw->ptl);
 					pvmw->ptl = NULL;
 				}
+				pte_unmap(pvmw->pte);
+				pvmw->pte = NULL;
 				goto restart;
-			} else {
-				pvmw->pte++;
 			}
+			pvmw->pte++;
 		} while (pte_none(*pvmw->pte));
 
 		if (!pvmw->ptl) {
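A note on the new boundary test in the patch above: for page-aligned
addresses it is equivalent to the old "% PMD_SIZE" test, but it masks
only the pte-index bits, so it does not silently rely on the low
page-offset bits being zero. A small userspace check (PAGE_SIZE and
PMD_SIZE values are illustrative, as for x86-64 with 4K pages; the
kernel derives them from the page table configuration):

	#include <assert.h>
	#include <stdio.h>

	#define PAGE_SIZE	4096UL
	#define PMD_SIZE	(512 * PAGE_SIZE)	/* 2 MiB */

	int main(void)
	{
		unsigned long addr;

		for (addr = 0; addr < 4 * PMD_SIZE; addr += PAGE_SIZE) {
			/* Old test: also depends on the low page-offset bits. */
			int old_test = (addr % PMD_SIZE == 0);
			/* New test: masks only the pte-index bits. */
			int new_test = ((addr & (PMD_SIZE - PAGE_SIZE)) == 0);

			assert(old_test == new_test);
		}
		/* For an unaligned addr such as PMD_SIZE + 5, the old test says
		 * "not at a boundary" while the new one still looks only at the
		 * pte-index bits - the robustness the patch is after. */
		printf("old and new boundary tests agree on page-aligned addresses\n");
		return 0;
	}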
From patchwork Fri Jun 25 01:39:17 2021
Date: Thu, 24 Jun 2021 18:39:17 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 06/24] mm: page_vma_mapped_walk(): add a level of indentation
Message-ID: <20210625013917.v1e2AXMQO%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm: page_vma_mapped_walk(): add a level of indentation

page_vma_mapped_walk() cleanup: add a level of indentation to much of the
body, making no functional change in this commit, but reducing the later
diff when this is all converted to a loop.

[hughd@google.com: page_vma_mapped_walk(): add a level of indentation fix]
Link: https://lkml.kernel.org/r/7f817555-3ce1-c785-e438-87d8efdcaf26@google.com
Link: https://lkml.kernel.org/r/efde211-f3e2-fe54-977-ef481419e7f3@google.com
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Cc: Alistair Popple
Cc: Matthew Wilcox
Cc: Peter Xu
Cc: Ralph Campbell
Cc: Wang Yugui
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |  105 +++++++++++++++++++++--------------------
 1 file changed, 55 insertions(+), 50 deletions(-)

--- a/mm/page_vma_mapped.c~mm-page_vma_mapped_walk-add-a-level-of-indentation
+++ a/mm/page_vma_mapped.c
@@ -173,62 +173,67 @@ bool page_vma_mapped_walk(struct page_vm
 	if (pvmw->pte)
 		goto next_pte;
 restart:
-	pgd = pgd_offset(mm, pvmw->address);
-	if (!pgd_present(*pgd))
-		return false;
-	p4d = p4d_offset(pgd, pvmw->address);
-	if (!p4d_present(*p4d))
-		return false;
-	pud = pud_offset(p4d, pvmw->address);
-	if (!pud_present(*pud))
-		return false;
-	pvmw->pmd = pmd_offset(pud, pvmw->address);
-	/*
-	 * Make sure the pmd value isn't cached in a register by the
-	 * compiler and used as a stale value after we've observed a
-	 * subsequent update.
-	 */
-	pmde = READ_ONCE(*pvmw->pmd);
-	if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
-		pvmw->ptl = pmd_lock(mm, pvmw->pmd);
-		pmde = *pvmw->pmd;
-		if (likely(pmd_trans_huge(pmde))) {
-			if (pvmw->flags & PVMW_MIGRATION)
-				return not_found(pvmw);
-			if (pmd_page(pmde) != page)
-				return not_found(pvmw);
-			return true;
-		}
-		if (!pmd_present(pmde)) {
-			swp_entry_t entry;
+	{
+		pgd = pgd_offset(mm, pvmw->address);
+		if (!pgd_present(*pgd))
+			return false;
+		p4d = p4d_offset(pgd, pvmw->address);
+		if (!p4d_present(*p4d))
+			return false;
+		pud = pud_offset(p4d, pvmw->address);
+		if (!pud_present(*pud))
+			return false;
 
-			if (!thp_migration_supported() ||
-			    !(pvmw->flags & PVMW_MIGRATION))
-				return not_found(pvmw);
-			entry = pmd_to_swp_entry(pmde);
-			if (!is_migration_entry(entry) ||
-			    migration_entry_to_page(entry) != page)
-				return not_found(pvmw);
-			return true;
-		}
-		/* THP pmd was split under us: handle on pte level */
-		spin_unlock(pvmw->ptl);
-		pvmw->ptl = NULL;
-	} else if (!pmd_present(pmde)) {
+		pvmw->pmd = pmd_offset(pud, pvmw->address);
 		/*
-		 * If PVMW_SYNC, take and drop THP pmd lock so that we
-		 * cannot return prematurely, while zap_huge_pmd() has
-		 * cleared *pmd but not decremented compound_mapcount().
+		 * Make sure the pmd value isn't cached in a register by the
+		 * compiler and used as a stale value after we've observed a
+		 * subsequent update.
 		 */
-		if ((pvmw->flags & PVMW_SYNC) && PageTransCompound(page)) {
-			spinlock_t *ptl = pmd_lock(mm, pvmw->pmd);
+		pmde = READ_ONCE(*pvmw->pmd);
+
+		if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
+			pvmw->ptl = pmd_lock(mm, pvmw->pmd);
+			pmde = *pvmw->pmd;
+			if (likely(pmd_trans_huge(pmde))) {
+				if (pvmw->flags & PVMW_MIGRATION)
+					return not_found(pvmw);
+				if (pmd_page(pmde) != page)
+					return not_found(pvmw);
+				return true;
+			}
+			if (!pmd_present(pmde)) {
+				swp_entry_t entry;
 
-			spin_unlock(ptl);
+				if (!thp_migration_supported() ||
+				    !(pvmw->flags & PVMW_MIGRATION))
+					return not_found(pvmw);
+				entry = pmd_to_swp_entry(pmde);
+				if (!is_migration_entry(entry) ||
+				    migration_entry_to_page(entry) != page)
+					return not_found(pvmw);
+				return true;
+			}
+			/* THP pmd was split under us: handle on pte level */
+			spin_unlock(pvmw->ptl);
+			pvmw->ptl = NULL;
+		} else if (!pmd_present(pmde)) {
+			/*
+			 * If PVMW_SYNC, take and drop THP pmd lock so that we
+			 * cannot return prematurely, while zap_huge_pmd() has
+			 * cleared *pmd but not decremented compound_mapcount().
+			 */
+			if ((pvmw->flags & PVMW_SYNC) &&
+			    PageTransCompound(page)) {
+				spinlock_t *ptl = pmd_lock(mm, pvmw->pmd);
+
+				spin_unlock(ptl);
+			}
+			return false;
 		}
-		return false;
+		if (!map_pte(pvmw))
+			goto next_pte;
 	}
-	if (!map_pte(pvmw))
-		goto next_pte;
 	while (1) {
 		unsigned long end;
From patchwork Fri Jun 25 01:39:20 2021
Date: Thu, 24 Jun 2021 18:39:20 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 07/24] mm: page_vma_mapped_walk(): use goto instead of while (1)
Message-ID: <20210625013920.Hun6MvfTf%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm: page_vma_mapped_walk(): use goto instead of while (1)

page_vma_mapped_walk() cleanup: add a label this_pte, matching next_pte,
and use "goto this_pte", in place of the "while (1)" loop at the end.

Link: https://lkml.kernel.org/r/a52b234a-851-3616-2525-f42736e8934@google.com
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Cc: Alistair Popple
Cc: Matthew Wilcox
Cc: Peter Xu
Cc: Ralph Campbell
Cc: Wang Yugui
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/mm/page_vma_mapped.c~mm-page_vma_mapped_walk-use-goto-instead-of-while-1
+++ a/mm/page_vma_mapped.c
@@ -144,6 +144,7 @@ bool page_vma_mapped_walk(struct page_vm
 {
 	struct mm_struct *mm = pvmw->vma->vm_mm;
 	struct page *page = pvmw->page;
+	unsigned long end;
 	pgd_t *pgd;
 	p4d_t *p4d;
 	pud_t *pud;
@@ -233,10 +234,7 @@ restart:
 		}
 		if (!map_pte(pvmw))
 			goto next_pte;
-	}
-	while (1) {
-		unsigned long end;
-
+this_pte:
 		if (check_pte(pvmw))
 			return true;
 next_pte:
@@ -265,6 +263,7 @@ next_pte:
 			pvmw->ptl = pte_lockptr(mm, pvmw->pmd);
 			spin_lock(pvmw->ptl);
 		}
+		goto this_pte;
 	}
 }
From patchwork Fri Jun 25 01:39:23 2021
Date: Thu, 24 Jun 2021 18:39:23 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 08/24] mm: page_vma_mapped_walk(): get vma_address_end() earlier
Message-ID: <20210625013923.PnNu8VUkZ%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm: page_vma_mapped_walk(): get vma_address_end() earlier

page_vma_mapped_walk() cleanup: get THP's vma_address_end() at the
start, rather than later at next_pte.  It's a little unnecessary overhead
on the first call, but makes for a simpler loop in the following commit.

Link: https://lkml.kernel.org/r/4542b34d-862f-7cb4-bb22-e0df6ce830a2@google.com
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Cc: Alistair Popple
Cc: Matthew Wilcox
Cc: Peter Xu
Cc: Ralph Campbell
Cc: Wang Yugui
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

--- a/mm/page_vma_mapped.c~mm-page_vma_mapped_walk-get-vma_address_end-earlier
+++ a/mm/page_vma_mapped.c
@@ -171,6 +171,15 @@ bool page_vma_mapped_walk(struct page_vm
 		return true;
 	}
 
+	/*
+	 * Seek to next pte only makes sense for THP.
+	 * But more important than that optimization, is to filter out
+	 * any PageKsm page: whose page->index misleads vma_address()
+	 * and vma_address_end() to disaster.
+	 */
+	end = PageTransCompound(page) ?
+		vma_address_end(page, pvmw->vma) :
+		pvmw->address + PAGE_SIZE;
 	if (pvmw->pte)
 		goto next_pte;
 restart:
@@ -238,10 +247,6 @@ this_pte:
 		if (check_pte(pvmw))
 			return true;
 next_pte:
-		/* Seek to next pte only makes sense for THP */
-		if (!PageTransHuge(page))
-			return not_found(pvmw);
-		end = vma_address_end(page, pvmw->vma);
 		do {
 			pvmw->address += PAGE_SIZE;
 			if (pvmw->address >= end)
From patchwork Fri Jun 25 01:39:26 2021
Date: Thu, 24 Jun 2021 18:39:26 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 09/24] mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes
Message-ID: <20210625013926.7ZTZ9B0S5%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm/thp: fix page_vma_mapped_walk() if THP mapped by ptes

Running certain tests with a DEBUG_VM kernel would crash within hours, on
the total_mapcount BUG() in split_huge_page_to_list(), while trying to
free up some memory by punching a hole in a shmem huge page: split's
try_to_unmap() was unable to find all the mappings of the page (which, on
a !DEBUG_VM kernel, would then keep the huge page pinned in memory).

Crash dumps showed two tail pages of a shmem huge page remained mapped by
pte: ptes in a non-huge-aligned vma of a gVisor process, at the end of a
long unmapped range; and no page table had yet been allocated for the
head of the huge page to be mapped into.

Although designed to handle these odd misaligned huge-page-mapped-by-pte
cases, page_vma_mapped_walk() falls short by returning false prematurely
when !pmd_present or !pud_present or !p4d_present or !pgd_present: there
are cases when a huge page may span the boundary, with ptes present in
the next.

Restructure page_vma_mapped_walk() as a loop to continue in these cases,
while keeping its layout much as before.  Add a step_forward() helper to
advance pvmw->address across those boundaries: originally I tried to use
mm's standard p?d_addr_end() macros, but hit the same crash 512 times
less often: because of the way redundant levels are folded together, but
folded differently in different configurations, it was just too difficult
to use them correctly; and step_forward() is simpler anyway.

Link: https://lkml.kernel.org/r/fedb8632-1798-de42-f39e-873551d5bc81@google.com
Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Cc: Alistair Popple
Cc: Matthew Wilcox
Cc: Peter Xu
Cc: Ralph Campbell
Cc: Wang Yugui
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |   34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

--- a/mm/page_vma_mapped.c~mm-thp-fix-page_vma_mapped_walk-if-thp-mapped-by-ptes
+++ a/mm/page_vma_mapped.c
@@ -116,6 +116,13 @@ static bool check_pte(struct page_vma_ma
 	return pfn_is_match(pvmw->page, pfn);
 }
 
+static void step_forward(struct page_vma_mapped_walk *pvmw, unsigned long size)
+{
+	pvmw->address = (pvmw->address + size) & ~(size - 1);
+	if (!pvmw->address)
+		pvmw->address = ULONG_MAX;
+}
+
 /**
  * page_vma_mapped_walk - check if @pvmw->page is mapped in @pvmw->vma at
  * @pvmw->address
@@ -183,16 +190,22 @@ bool page_vma_mapped_walk(struct page_vm
 	if (pvmw->pte)
 		goto next_pte;
 restart:
-	{
+	do {
 		pgd = pgd_offset(mm, pvmw->address);
-		if (!pgd_present(*pgd))
-			return false;
+		if (!pgd_present(*pgd)) {
+			step_forward(pvmw, PGDIR_SIZE);
+			continue;
+		}
 		p4d = p4d_offset(pgd, pvmw->address);
-		if (!p4d_present(*p4d))
-			return false;
+		if (!p4d_present(*p4d)) {
+			step_forward(pvmw, P4D_SIZE);
+			continue;
+		}
 		pud = pud_offset(p4d, pvmw->address);
-		if (!pud_present(*pud))
-			return false;
+		if (!pud_present(*pud)) {
+			step_forward(pvmw, PUD_SIZE);
+			continue;
+		}
 
 		pvmw->pmd = pmd_offset(pud, pvmw->address);
 		/*
@@ -239,7 +252,8 @@ restart:
 
 				spin_unlock(ptl);
 			}
-			return false;
+			step_forward(pvmw, PMD_SIZE);
+			continue;
 		}
 		if (!map_pte(pvmw))
 			goto next_pte;
@@ -269,7 +283,9 @@ next_pte:
 			spin_lock(pvmw->ptl);
 		}
 		goto this_pte;
-	}
+	} while (pvmw->address < end);
+
+	return false;
 }
 
 /**
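The step_forward() arithmetic is terse; here is a standalone userspace
model of it, exercising the wrap-to-ULONG_MAX case that keeps the
enclosing "while (pvmw->address < end)" loop terminating at the top of
the address space (the PMD_SIZE value is illustrative):

	#include <assert.h>
	#include <limits.h>
	#include <stdio.h>

	/* Round the address up to the next boundary of the given
	 * (power-of-two) size, saturating at ULONG_MAX when the addition
	 * wraps past the top of the address space. */
	static unsigned long step_forward(unsigned long address, unsigned long size)
	{
		address = (address + size) & ~(size - 1);
		if (!address)
			address = ULONG_MAX;
		return address;
	}

	int main(void)
	{
		const unsigned long PMD_SIZE = 512 * 4096UL;	/* 2 MiB, illustrative */

		/* A mid-PMD address advances to the start of the next PMD... */
		assert(step_forward(PMD_SIZE + 4096, PMD_SIZE) == 2 * PMD_SIZE);
		/* ...and so does an exact boundary: the current entry was absent. */
		assert(step_forward(PMD_SIZE, PMD_SIZE) == 2 * PMD_SIZE);
		/* Near the top of the address space the sum wraps to 0: saturate. */
		assert(step_forward(ULONG_MAX - 4095, PMD_SIZE) == ULONG_MAX);
		printf("step_forward behaves as expected\n");
		return 0;
	}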
From patchwork Fri Jun 25 01:39:30 2021
Date: Thu, 24 Jun 2021 18:39:30 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, apopple@nvidia.com, hughd@google.com, kirill.shutemov@linux.intel.com, linux-mm@kvack.org, mm-commits@vger.kernel.org, peterx@redhat.com, rcampbell@nvidia.com, shy828301@gmail.com, stable@vger.kernel.org, torvalds@linux-foundation.org, wangyugui@e16-tech.com, will@kernel.org, willy@infradead.org, ziy@nvidia.com
Subject: [patch 10/24] mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()
Message-ID: <20210625013930.b8M_U9WL0%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm/thp: another PVMW_SYNC fix in page_vma_mapped_walk()

Aha!  Shouldn't that quick scan over pte_none()s make sure that it holds
ptlock in the PVMW_SYNC case?  That too might have been responsible for
BUGs or WARNs in split_huge_page_to_list() or its unmap_page(), though
I've never seen any.

Link: https://lkml.kernel.org/r/1bdf384c-8137-a149-2a1e-475a4791c3c@google.com
Link: https://lore.kernel.org/linux-mm/20210412180659.B9E3.409509F4@e16-tech.com/
Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Tested-by: Wang Yugui
Cc: Alistair Popple
Cc: Matthew Wilcox
Cc: Peter Xu
Cc: Ralph Campbell
Cc: Will Deacon
Cc: Yang Shi
Cc: Zi Yan
Cc:
Signed-off-by: Andrew Morton
---

 mm/page_vma_mapped.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/mm/page_vma_mapped.c~mm-thp-another-pvmw_sync-fix-in-page_vma_mapped_walk
+++ a/mm/page_vma_mapped.c
@@ -276,6 +276,10 @@ next_pte:
 				goto restart;
 			}
 			pvmw->pte++;
+			if ((pvmw->flags & PVMW_SYNC) && !pvmw->ptl) {
+				pvmw->ptl = pte_lockptr(mm, pvmw->pmd);
+				spin_lock(pvmw->ptl);
+			}
 		} while (pte_none(*pvmw->pte));
 
 		if (!pvmw->ptl) {
From patchwork Fri Jun 25 01:39:45 2021
Date: Thu, 24 Jun 2021 18:39:45 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, jenhaochen@google.com, linux-mm@kvack.org, liumartin@google.com, minchan@google.com, mm-commits@vger.kernel.org, nathan@kernel.org, ndesaulniers@google.com, oleg@redhat.com, pmladek@suse.com, stable@vger.kernel.org, tj@kernel.org, torvalds@linux-foundation.org
Subject: [patch 15/24] kthread_worker: split code for canceling the delayed work timer
Message-ID: <20210625013945.Wgs7ttT2t%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Petr Mladek
Subject: kthread_worker: split code for canceling the delayed work timer

Patch series "kthread_worker: Fix race between kthread_mod_delayed_work()
and kthread_cancel_delayed_work_sync()".

This patchset fixes the race between kthread_mod_delayed_work() and
kthread_cancel_delayed_work_sync() including proper return value
handling.

This patch (of 2):

Simple code refactoring as a preparation step for fixing a race between
kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync().

It does not modify the existing behavior.

Link: https://lkml.kernel.org/r/20210610133051.15337-2-pmladek@suse.com
Signed-off-by: Petr Mladek
Cc:
Cc: Martin Liu
Cc: Minchan Kim
Cc: Nathan Chancellor
Cc: Nick Desaulniers
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc:
Signed-off-by: Andrew Morton
---

 kernel/kthread.c |   46 ++++++++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 17 deletions(-)

--- a/kernel/kthread.c~kthread_worker-split-code-for-canceling-the-delayed-work-timer
+++ a/kernel/kthread.c
@@ -1093,6 +1093,33 @@ void kthread_flush_work(struct kthread_w
 EXPORT_SYMBOL_GPL(kthread_flush_work);
 
 /*
+ * Make sure that the timer is neither set nor running and could
+ * not manipulate the work list_head any longer.
+ *
+ * The function is called under worker->lock. The lock is temporary
+ * released but the timer can't be set again in the meantime.
+ */
+static void kthread_cancel_delayed_work_timer(struct kthread_work *work,
+					      unsigned long *flags)
+{
+	struct kthread_delayed_work *dwork =
+		container_of(work, struct kthread_delayed_work, work);
+	struct kthread_worker *worker = work->worker;
+
+	/*
+	 * del_timer_sync() must be called to make sure that the timer
+	 * callback is not running. The lock must be temporary released
+	 * to avoid a deadlock with the callback. In the meantime,
+	 * any queuing is blocked by setting the canceling counter.
+	 */
+	work->canceling++;
+	raw_spin_unlock_irqrestore(&worker->lock, *flags);
+	del_timer_sync(&dwork->timer);
+	raw_spin_lock_irqsave(&worker->lock, *flags);
+	work->canceling--;
+}
+
+/*
  * This function removes the work from the worker queue. Also it makes sure
  * that it won't get queued later via the delayed work's timer.
  *
@@ -1106,23 +1133,8 @@ static bool __kthread_cancel_work(struct
 			       unsigned long *flags)
 {
 	/* Try to cancel the timer if exists. */
-	if (is_dwork) {
-		struct kthread_delayed_work *dwork =
-			container_of(work, struct kthread_delayed_work, work);
-		struct kthread_worker *worker = work->worker;
-
-		/*
-		 * del_timer_sync() must be called to make sure that the timer
-		 * callback is not running. The lock must be temporary released
-		 * to avoid a deadlock with the callback. In the meantime,
-		 * any queuing is blocked by setting the canceling counter.
-		 */
-		work->canceling++;
-		raw_spin_unlock_irqrestore(&worker->lock, *flags);
-		del_timer_sync(&dwork->timer);
-		raw_spin_lock_irqsave(&worker->lock, *flags);
-		work->canceling--;
-	}
+	if (is_dwork)
+		kthread_cancel_delayed_work_timer(work, flags);
 
 	/*
 	 * Try to remove the work from a worker list. It might either
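The helper split out above isolates a classic pattern: worker->lock
cannot be held across del_timer_sync(), because the timer callback takes
the same lock, so the lock is dropped while waiting and the canceling
counter blocks re-queuing in the gap. A compilable userspace analogue,
with a thread standing in for the timer (all names hypothetical, not the
kthread API):

	#include <pthread.h>
	#include <stdio.h>
	#include <unistd.h>

	struct dwork {
		pthread_mutex_t lock;
		pthread_t timer;
		int timer_armed;
		int canceling;
	};

	static void *timer_fn(void *arg)
	{
		struct dwork *dw = arg;

		usleep(1000);
		pthread_mutex_lock(&dw->lock);	/* would deadlock if cancel held it */
		dw->timer_armed = 0;		/* "queue the work" in the real code */
		pthread_mutex_unlock(&dw->lock);
		return NULL;
	}

	/* Called with dw->lock held, like the kernel helper. */
	static void cancel_dwork_timer(struct dwork *dw)
	{
		dw->canceling++;		/* block re-arming... */
		pthread_mutex_unlock(&dw->lock);
		pthread_join(dw->timer, NULL);	/* like del_timer_sync(): wait unlocked */
		pthread_mutex_lock(&dw->lock);
		dw->canceling--;
	}

	int main(void)
	{
		struct dwork dw = { .lock = PTHREAD_MUTEX_INITIALIZER,
				    .timer_armed = 1 };

		pthread_create(&dw.timer, NULL, timer_fn, &dw);
		pthread_mutex_lock(&dw.lock);
		cancel_dwork_timer(&dw);
		pthread_mutex_unlock(&dw.lock);
		printf("timer canceled without deadlock, armed=%d\n", dw.timer_armed);
		return 0;
	}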
From patchwork Fri Jun 25 01:39:48 2021
Date: Thu, 24 Jun 2021 18:39:48 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, jenhaochen@google.com, linux-mm@kvack.org, liumartin@google.com, minchan@google.com, mm-commits@vger.kernel.org, nathan@kernel.org, ndesaulniers@google.com, oleg@redhat.com, pmladek@suse.com, stable@vger.kernel.org, tj@kernel.org, torvalds@linux-foundation.org
Subject: [patch 16/24] kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()
Message-ID: <20210625013948.IbKHai8t0%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Petr Mladek
Subject: kthread: prevent deadlock when kthread_mod_delayed_work() races with kthread_cancel_delayed_work_sync()

The system might hang with the following backtrace:

	schedule+0x80/0x100
	schedule_timeout+0x48/0x138
	wait_for_common+0xa4/0x134
	wait_for_completion+0x1c/0x2c
	kthread_flush_work+0x114/0x1cc
	kthread_cancel_work_sync.llvm.16514401384283632983+0xe8/0x144
	kthread_cancel_delayed_work_sync+0x18/0x2c
	xxxx_pm_notify+0xb0/0xd8
	blocking_notifier_call_chain_robust+0x80/0x194
	pm_notifier_call_chain_robust+0x28/0x4c
	suspend_prepare+0x40/0x260
	enter_state+0x80/0x3f4
	pm_suspend+0x60/0xdc
	state_store+0x108/0x144
	kobj_attr_store+0x38/0x88
	sysfs_kf_write+0x64/0xc0
	kernfs_fop_write_iter+0x108/0x1d0
	vfs_write+0x2f4/0x368
	ksys_write+0x7c/0xec

It is caused by the following race between kthread_mod_delayed_work()
and kthread_cancel_delayed_work_sync():

CPU0					CPU1

Context: Thread A			Context: Thread B

kthread_mod_delayed_work()
  spin_lock()
  __kthread_cancel_work()
    spin_unlock()
    del_timer_sync()
					kthread_cancel_delayed_work_sync()
					  spin_lock()
					  __kthread_cancel_work()
					    spin_unlock()
					    del_timer_sync()
					    spin_lock()
					  work->canceling++
					  spin_unlock
    spin_lock()
  queue_delayed_work()
    // dwork is put into the worker->delayed_work_list

  spin_unlock()
					  kthread_flush_work()
					    // flush_work is put at the tail of the dwork

					  wait_for_completion()

Context: IRQ

  kthread_delayed_work_timer_fn()
    spin_lock()
    list_del_init(&work->node);
    spin_unlock()

BANG: flush_work is no longer linked and will never get processed.

The problem is that kthread_mod_delayed_work() checks work->canceling
flag before canceling the timer.

A simple solution is to (re)check work->canceling after
__kthread_cancel_work().  But then it is not clear what should be
returned when __kthread_cancel_work() removed the work from the queue
(list) and it can't queue it again with the new @delay.

The return value might be used for reference counting.  The caller has
to know whether a new work has been queued or an existing one was
replaced.

The proper solution is that kthread_mod_delayed_work() will remove the
work from the queue (list) _only_ when work->canceling is not set.  The
flag must be checked after the timer is stopped and the remaining
operations can be done under worker->lock.

Note that kthread_mod_delayed_work() could remove the timer and then
bail out.  It is fine.  The other canceling caller needs to cancel the
timer as well.  The important thing is that the queue (list)
manipulation is done atomically under worker->lock.

Link: https://lkml.kernel.org/r/20210610133051.15337-3-pmladek@suse.com
Fixes: 9a6b06c8d9a2 ("kthread: allow to modify delayed kthread work")
Signed-off-by: Petr Mladek
Reported-by: Martin Liu
Cc:
Cc: Minchan Kim
Cc: Nathan Chancellor
Cc: Nick Desaulniers
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc:
Signed-off-by: Andrew Morton
---

 kernel/kthread.c |   35 ++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 11 deletions(-)

--- a/kernel/kthread.c~kthread-prevent-deadlock-when-kthread_mod_delayed_work-races-with-kthread_cancel_delayed_work_sync
+++ a/kernel/kthread.c
@@ -1120,8 +1120,11 @@ static void kthread_cancel_delayed_work_
 }
 
 /*
- * This function removes the work from the worker queue. Also it makes sure
- * that it won't get queued later via the delayed work's timer.
+ * This function removes the work from the worker queue.
+ *
+ * It is called under worker->lock. The caller must make sure that
+ * the timer used by delayed work is not running, e.g. by calling
+ * kthread_cancel_delayed_work_timer().
 *
 * The work might still be in use when this function finishes. See the
 * current_work proceed by the worker.
@@ -1129,13 +1132,8 @@ static void kthread_cancel_delayed_work_
 * Return: %true if @work was pending and successfully canceled,
 *	%false if @work was not pending
 */
-static bool __kthread_cancel_work(struct kthread_work *work, bool is_dwork,
-				  unsigned long *flags)
+static bool __kthread_cancel_work(struct kthread_work *work)
 {
-	/* Try to cancel the timer if exists. */
-	if (is_dwork)
-		kthread_cancel_delayed_work_timer(work, flags);
-
 	/*
 	 * Try to remove the work from a worker list. It might either
 	 * be from worker->work_list or from worker->delayed_work_list.
@@ -1188,11 +1186,23 @@ bool kthread_mod_delayed_work(struct kth
 	/* Work must not be used with >1 worker, see kthread_queue_work() */
 	WARN_ON_ONCE(work->worker != worker);
 
-	/* Do not fight with another command that is canceling this work. */
+	/*
+	 * Temporary cancel the work but do not fight with another command
+	 * that is canceling the work as well.
+	 *
+	 * It is a bit tricky because of possible races with another
+	 * mod_delayed_work() and cancel_delayed_work() callers.
+	 *
+	 * The timer must be canceled first because worker->lock is released
+	 * when doing so. But the work can be removed from the queue (list)
+	 * only when it can be queued again so that the return value can
+	 * be used for reference counting.
+	 */
+	kthread_cancel_delayed_work_timer(work, &flags);
 	if (work->canceling)
 		goto out;
+	ret = __kthread_cancel_work(work);
 
-	ret = __kthread_cancel_work(work, true, &flags);
 fast_queue:
 	__kthread_queue_delayed_work(worker, dwork, delay);
 out:
@@ -1214,7 +1224,10 @@ static bool __kthread_cancel_work_sync(s
 	/* Work must not be used with >1 worker, see kthread_queue_work(). */
 	WARN_ON_ONCE(work->worker != worker);
 
-	ret = __kthread_cancel_work(work, is_dwork, &flags);
+	if (is_dwork)
+		kthread_cancel_delayed_work_timer(work, &flags);
+
+	ret = __kthread_cancel_work(work);
 
 	if (worker->current_work != work)
 		goto out_fast;
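The shape of the fix is easier to see stripped of kernel detail: cancel
the timer first (the only step that releases worker->lock), then
re-check the canceling flag before any queue manipulation. A condensed,
compilable model with stand-in types (not the kernel API):

	/* Stand-in types: just enough to model the ordering. */
	struct kwork {
		int canceling;		/* non-zero while a cancel is in flight */
		int queued;		/* 1 if on the worker's list */
	};

	static void cancel_work_timer(struct kwork *work)
	{
		/* In the kernel this drops and retakes worker->lock around
		 * del_timer_sync(); that lock gap is where the race lived. */
		(void)work;
	}

	static int cancel_work(struct kwork *work)
	{
		int was_queued = work->queued;

		work->queued = 0;	/* list manipulation, atomic under the lock */
		return was_queued;
	}

	static int mod_delayed_work(struct kwork *work)
	{
		int ret = 0;

		cancel_work_timer(work);	/* lock released inside here */
		if (work->canceling)		/* re-check AFTER the lock gap */
			return ret;		/* don't touch the queue, don't re-arm */
		ret = cancel_work(work);
		work->queued = 1;		/* queue_delayed_work() equivalent */
		return ret;
	}

	int main(void)
	{
		struct kwork w = { .canceling = 0, .queued = 1 };

		/* Replacing a pending work reports it was queued before. */
		return !(mod_delayed_work(&w) == 1 && w.queued == 1);
	}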
From patchwork Fri Jun 25 01:39:52 2021
Date: Thu, 24 Jun 2021 18:39:52 -0700
From: Andrew Morton
Subject: [patch 17/24] mm, futex: fix shared futex pgoff on shmem huge page
Message-ID: <20210625013952.eaD8IsXDa%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Hugh Dickins
Subject: mm, futex: fix shared futex pgoff on shmem huge page

If more than one futex is placed on a shmem huge page, it can happen
that waking the second wakes the first instead, and leaves the second
waiting: the key's shared.pgoff is wrong.

When 3.11 commit 13d60f4b6ab5 ("futex: Take hugepages into account when
generating futex_key") went in, the only shared huge pages came from
hugetlbfs, and the code added to deal with its exceptional page->index
was put into the hugetlb source.  That was then missed when 4.8 added
shmem huge pages.

page_to_pgoff() is what others use for this nowadays: except that, as
currently written, it gives the right answer on a hugetlbfs head page,
but nonsense on hugetlbfs tail pages.  Fix that by calling the
hugetlbfs-specific hugetlb_basepage_index() on PageHuge tails as well
as on heads.

Yes, it's unconventional to declare hugetlb_basepage_index() there in
pagemap.h rather than in hugetlb.h; but I do not expect anything but
page_to_pgoff() ever to need it.

[akpm@linux-foundation.org: give hugetlb_basepage_index() prototype the correct scope]
Link: https://lkml.kernel.org/r/b17d946b-d09-326e-b42a-52884c36df32@google.com
Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Reported-by: Neel Natu
Signed-off-by: Hugh Dickins
Reviewed-by: Matthew Wilcox (Oracle)
Acked-by: Thomas Gleixner
Cc: "Kirill A. Shutemov"
Cc: Zhang Yi
Cc: Mel Gorman
Cc: Mike Kravetz
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Darren Hart
Cc: Davidlohr Bueso
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton
---

 include/linux/hugetlb.h |   16 ----------------
 include/linux/pagemap.h |   13 +++++++------
 kernel/futex.c          |    3 +--
 mm/hugetlb.c            |    5 +----
 4 files changed, 9 insertions(+), 28 deletions(-)
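To make the failure mode concrete before the diff, here is a hedged
userspace sketch of the scenario the first paragraph describes. It is
illustrative only: the file name and offsets are invented, and whether
tmpfs actually backs the mapping with a huge page depends on the mount
options (huge=always) and kernel configuration.

#include <linux/futex.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

static int fwait(uint32_t *uaddr, uint32_t val)
{
	return syscall(SYS_futex, uaddr, FUTEX_WAIT, val, NULL, NULL, 0);
}

static int fwake(uint32_t *uaddr, int nr)
{
	return syscall(SYS_futex, uaddr, FUTEX_WAKE, nr, NULL, NULL, 0);
}

int main(void)
{
	/* Assumes /dev/shm is tmpfs mounted with huge=always. */
	int fd = open("/dev/shm/futex-demo", O_RDWR | O_CREAT, 0600);
	uint32_t *map;

	ftruncate(fd, 4 << 20);
	map = mmap(NULL, 4 << 20, PROT_READ | PROT_WRITE,
		   MAP_SHARED, fd, 0);

	uint32_t *f1 = map;			/* first 4K subpage  */
	uint32_t *f2 = map + 4096 / 4;		/* second 4K subpage */

	if (fork() == 0) { fwait(f1, 0); _exit(1); }
	if (fork() == 0) { fwait(f2, 0); _exit(2); }
	sleep(1);

	/*
	 * If both futexes sit on one shmem huge page, a pre-fix kernel
	 * computes the same (wrong) shared.pgoff for both keys: this
	 * wake can pick the f1 waiter and leave the f2 waiter stuck.
	 */
	fwake(f2, 1);
	return 0;
}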
--- a/include/linux/hugetlb.h~mm-futex-fix-shared-futex-pgoff-on-shmem-huge-page
+++ a/include/linux/hugetlb.h
@@ -741,17 +741,6 @@ static inline int hstate_index(struct hs
 	return h - hstates;
 }

-pgoff_t __basepage_index(struct page *page);
-
-/* Return page->index in PAGE_SIZE units */
-static inline pgoff_t basepage_index(struct page *page)
-{
-	if (!PageCompound(page))
-		return page->index;
-
-	return __basepage_index(page);
-}
-
 extern int dissolve_free_huge_page(struct page *page);
 extern int dissolve_free_huge_pages(unsigned long start_pfn,
 				    unsigned long end_pfn);
@@ -988,11 +977,6 @@ static inline int hstate_index(struct hs
 	return 0;
 }

-static inline pgoff_t basepage_index(struct page *page)
-{
-	return page->index;
-}
-
 static inline int dissolve_free_huge_page(struct page *page)
 {
 	return 0;
--- a/include/linux/pagemap.h~mm-futex-fix-shared-futex-pgoff-on-shmem-huge-page
+++ a/include/linux/pagemap.h
@@ -516,7 +516,7 @@ static inline struct page *read_mapping_
 }

 /*
- * Get index of the page with in radix-tree
+ * Get index of the page within radix-tree (but not for hugetlb pages).
  * (TODO: remove once hugetlb pages will have ->index in PAGE_SIZE)
  */
 static inline pgoff_t page_to_index(struct page *page)
@@ -535,15 +535,16 @@ static inline pgoff_t page_to_index(stru
 	return pgoff;
 }

+extern pgoff_t hugetlb_basepage_index(struct page *page);
+
 /*
- * Get the offset in PAGE_SIZE.
- * (TODO: hugepage should have ->index in PAGE_SIZE)
+ * Get the offset in PAGE_SIZE (even for hugetlb pages).
+ * (TODO: hugetlb pages should have ->index in PAGE_SIZE)
  */
 static inline pgoff_t page_to_pgoff(struct page *page)
 {
-	if (unlikely(PageHeadHuge(page)))
-		return page->index << compound_order(page);
-
+	if (unlikely(PageHuge(page)))
+		return hugetlb_basepage_index(page);
 	return page_to_index(page);
 }
--- a/kernel/futex.c~mm-futex-fix-shared-futex-pgoff-on-shmem-huge-page
+++ a/kernel/futex.c
@@ -35,7 +35,6 @@
 #include <linux/jhash.h>
 #include <linux/pagemap.h>
 #include <linux/syscalls.h>
-#include <linux/hugetlb.h>
 #include <linux/freezer.h>
 #include <linux/memblock.h>
 #include <linux/fault-inject.h>
@@ -650,7 +649,7 @@ again:

 	key->both.offset |= FUT_OFF_INODE; /* inode-based key */
 	key->shared.i_seq = get_inode_sequence_number(inode);
-	key->shared.pgoff = basepage_index(tail);
+	key->shared.pgoff = page_to_pgoff(tail);
 	rcu_read_unlock();
 }
--- a/mm/hugetlb.c~mm-futex-fix-shared-futex-pgoff-on-shmem-huge-page
+++ a/mm/hugetlb.c
@@ -1588,15 +1588,12 @@ struct address_space *hugetlb_page_mappi
 	return NULL;
 }

-pgoff_t __basepage_index(struct page *page)
+pgoff_t hugetlb_basepage_index(struct page *page)
 {
 	struct page *page_head = compound_head(page);
 	pgoff_t index = page_index(page_head);
 	unsigned long compound_idx;

-	if (!PageHuge(page_head))
-		return page_index(page);
-
 	if (compound_order(page_head) >= MAX_ORDER)
 		compound_idx = page_to_pfn(page) - page_to_pfn(page_head);
 	else
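For a hugetlbfs page, the index arithmetic works out as in the sketch
below. The numbers are assumed for illustration (2MB huge pages with 4K
base pages, so compound_order() == 9), and it presumes the function
ends, as in mainline, by returning
(index << compound_order(page_head)) + compound_idx.

/*
 * Worked example, not kernel source: a 2MB hugetlbfs page mapped at
 * file offset 4MB.  hugetlbfs inodes index in huge-page units, so
 * page_index(page_head) == 2.
 */
pgoff_t index = 2;		/* head's index, in 2MB units  */
unsigned long compound_idx = 3;	/* futex on the 4th 4K subpage */

/* Offset in PAGE_SIZE units, as the futex key now expects: */
pgoff_t pgoff = (index << 9) + compound_idx;	/* 1024 + 3 == 1027 */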
From patchwork Fri Jun 25 01:39:55 2021
Date: Thu, 24 Jun 2021 18:39:55 -0700
From: Andrew Morton
Subject: [patch 18/24] mm/memory-failure: use a mutex to avoid memory_failure() races
Message-ID: <20210625013955.Ag67YQBoO%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Tony Luck
Subject: mm/memory-failure: use a mutex to avoid memory_failure() races

Patch series "mm,hwpoison: fix sending SIGBUS for Action Required MCE", v5.

I wrote this patchset to materialize what I think is the currently
allowable solution mentioned in the previous discussion [1].  I simply
borrowed Tony's mutex patch and Aili's return-code patch, then queued
another one to find the error virtual address in a best-effort manner.
I know this is not a perfect solution, but it should work for typical
cases.

[1]: https://lore.kernel.org/linux-mm/20210331192540.2141052f@alex-virtual-machine/

This patch (of 2):

There can be races when multiple CPUs consume poison from the same
page.  The first into memory_failure() atomically sets the HWPoison
page flag and begins hunting for tasks that map this page.  Eventually
it invalidates those mappings and may send a SIGBUS to the affected
tasks.  But while all that work is going on, other CPUs see a "success"
return code from memory_failure() and so believe the error has been
handled, and continue executing.

Fix by wrapping most of the internal parts of memory_failure() in a
mutex.

[akpm@linux-foundation.org: make mf_mutex local to memory_failure()]
Link: https://lkml.kernel.org/r/20210521030156.2612074-1-nao.horiguchi@gmail.com
Link: https://lkml.kernel.org/r/20210521030156.2612074-2-nao.horiguchi@gmail.com
Signed-off-by: Tony Luck
Signed-off-by: Naoya Horiguchi
Reviewed-by: Borislav Petkov
Reviewed-by: Oscar Salvador
Cc: Aili Yao
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: David Hildenbrand
Cc: Jue Wang
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton
---

 mm/memory-failure.c |   36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)
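The pattern the patch introduces is an ordinary function-local static
mutex. A minimal sketch of the same idiom, with do_handle() as an
invented stand-in for the real body:

#include <linux/mutex.h>

int do_handle(unsigned long pfn);	/* hypothetical helper */

int handle_poison(unsigned long pfn)
{
	static DEFINE_MUTEX(handler_mutex);	/* one lock for all callers */
	int res;

	mutex_lock(&handler_mutex);
	/*
	 * A second CPU that consumed the same poisoned page blocks here
	 * until the first caller has finished hunting tasks and queueing
	 * SIGBUS, instead of seeing an early "success" return and
	 * resuming execution.
	 */
	res = do_handle(pfn);
	mutex_unlock(&handler_mutex);
	return res;
}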
--- a/mm/memory-failure.c~mm-memory-failure-use-a-mutex-to-avoid-memory_failure-races
+++ a/mm/memory-failure.c
@@ -1429,9 +1429,10 @@ int memory_failure(unsigned long pfn, in
 	struct page *hpage;
 	struct page *orig_head;
 	struct dev_pagemap *pgmap;
-	int res;
+	int res = 0;
 	unsigned long page_flags;
 	bool retry = true;
+	static DEFINE_MUTEX(mf_mutex);

 	if (!sysctl_memory_failure_recovery)
 		panic("Memory failure on page %lx", pfn);
@@ -1449,13 +1450,18 @@ int memory_failure(unsigned long pfn, in
 		return -ENXIO;
 	}

+	mutex_lock(&mf_mutex);
+
 try_again:
-	if (PageHuge(p))
-		return memory_failure_hugetlb(pfn, flags);
+	if (PageHuge(p)) {
+		res = memory_failure_hugetlb(pfn, flags);
+		goto unlock_mutex;
+	}
+
 	if (TestSetPageHWPoison(p)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 			pfn);
-		return 0;
+		goto unlock_mutex;
 	}

 	orig_head = hpage = compound_head(p);
@@ -1488,17 +1494,19 @@ try_again:
 				res = MF_FAILED;
 			}
 			action_result(pfn, MF_MSG_BUDDY, res);
-			return res == MF_RECOVERED ? 0 : -EBUSY;
+			res = res == MF_RECOVERED ? 0 : -EBUSY;
 		} else {
 			action_result(pfn, MF_MSG_KERNEL_HIGH_ORDER, MF_IGNORED);
-			return -EBUSY;
+			res = -EBUSY;
 		}
+		goto unlock_mutex;
 	}

 	if (PageTransHuge(hpage)) {
 		if (try_to_split_thp_page(p, "Memory Failure") < 0) {
 			action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED);
-			return -EBUSY;
+			res = -EBUSY;
+			goto unlock_mutex;
 		}
 		VM_BUG_ON_PAGE(!page_count(p), p);
 	}
@@ -1522,7 +1530,7 @@ try_again:
 	if (PageCompound(p) && compound_head(p) != orig_head) {
 		action_result(pfn, MF_MSG_DIFFERENT_COMPOUND, MF_IGNORED);
 		res = -EBUSY;
-		goto out;
+		goto unlock_page;
 	}

 	/*
@@ -1542,14 +1550,14 @@ try_again:
 		num_poisoned_pages_dec();
 		unlock_page(p);
 		put_page(p);
-		return 0;
+		goto unlock_mutex;
 	}
 	if (hwpoison_filter(p)) {
 		if (TestClearPageHWPoison(p))
 			num_poisoned_pages_dec();
 		unlock_page(p);
 		put_page(p);
-		return 0;
+		goto unlock_mutex;
 	}

 	/*
@@ -1573,7 +1581,7 @@ try_again:
 	if (!hwpoison_user_mappings(p, pfn, flags, &p)) {
 		action_result(pfn, MF_MSG_UNMAP_FAILED, MF_IGNORED);
 		res = -EBUSY;
-		goto out;
+		goto unlock_page;
 	}

 	/*
@@ -1582,13 +1590,15 @@ try_again:
 	if (PageLRU(p) && !PageSwapCache(p) && p->mapping == NULL) {
 		action_result(pfn, MF_MSG_TRUNCATED_LRU, MF_IGNORED);
 		res = -EBUSY;
-		goto out;
+		goto unlock_page;
 	}

 identify_page_state:
 	res = identify_page_state(pfn, p, page_flags);
-out:
+unlock_page:
 	unlock_page(p);
+unlock_mutex:
+	mutex_unlock(&mf_mutex);
 	return res;
 }
 EXPORT_SYMBOL_GPL(memory_failure);
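Structurally, the conversion above replaces every early return between
mutex_lock() and mutex_unlock() with a jump to one of two exit labels,
so lock and unlock stay textually paired. Reduced to its control flow,
with invented stand-in predicates rather than the real checks:

int mf_skeleton(struct page *p)
{
	static DEFINE_MUTEX(mf_mutex);
	int res = 0;

	mutex_lock(&mf_mutex);
	if (already_poisoned(p))	/* hypothetical; was: return 0; */
		goto unlock_mutex;
	lock_page(p);
	if (unmap_failed(p)) {		/* hypothetical; was: goto out; */
		res = -EBUSY;
		goto unlock_page;
	}
	res = identify_and_act(p);	/* hypothetical */
unlock_page:
	unlock_page(p);
unlock_mutex:
	mutex_unlock(&mf_mutex);
	return res;
}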
From patchwork Fri Jun 25 01:39:58 2021
Date: Thu, 24 Jun 2021 18:39:58 -0700
From: Andrew Morton
Subject: [patch 19/24] mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned
Message-ID: <20210625013958.xJoXf1EqN%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Aili Yao
Subject: mm,hwpoison: return -EHWPOISON to denote that the page has already been poisoned

When memory_failure() is called with MF_ACTION_REQUIRED on a page that
has already been hwpoisoned, memory_failure() can fail to send SIGBUS
to the affected process, which results in an infinite loop of MCEs.

Currently memory_failure() returns 0 if it is called for an
already-hwpoisoned page, so the caller, kill_me_maybe(), can return
without sending SIGBUS to the current process.  An Action Required MCE
is raised when the current process accesses the broken memory, so
without the SIGBUS the process keeps running, accesses the error page
again soon, and loops on MCEs.

This issue can arise, for example, in the following scenarios:

- Two or more threads access the poisoned page concurrently.  If local
  MCE is enabled, the MCE handler handles the MCE events independently,
  so there is a race among the events, and the second and later threads
  fall into the situation in question.

- If a preceding memory error event's memory_failure() failed to unmap
  the error page for some reason, a subsequent memory access to the
  error page triggers the same loop.

To fix the issue, make memory_failure() return an error code when the
error page has already been hwpoisoned.  This allows the memory error
handler to control how it sends signals to userspace.  And make sure
that any process touching a hwpoisoned page gets a SIGBUS even in the
"already hwpoisoned" path of memory_failure(), as is done in the page
fault path.

Link: https://lkml.kernel.org/r/20210521030156.2612074-3-nao.horiguchi@gmail.com
Signed-off-by: Aili Yao
Signed-off-by: Naoya Horiguchi
Reviewed-by: Oscar Salvador
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: David Hildenbrand
Cc: Jue Wang
Cc: Tony Luck
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton
---

 mm/memory-failure.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
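The point of the new return value is visible in the caller. A hedged
sketch, modeled on x86's kill_me_maybe() but simplified and not the
verbatim kernel function, of how an MCE handler turns -EHWPOISON into
the missing SIGBUS:

static void kill_me_maybe(struct callback_head *cb)
{
	struct task_struct *p = container_of(cb, struct task_struct,
					     mce_kill_me);
	int ret;

	ret = memory_failure(p->mce_addr >> PAGE_SHIFT,
			     MF_ACTION_REQUIRED);
	if (!ret)
		return;	/* handled: mappings zapped, SIGBUS queued */

	/*
	 * -EHWPOISON says the page was poisoned by an earlier event;
	 * without a SIGBUS here the task would retry the access and
	 * machine-check forever.
	 */
	if (ret == -EHWPOISON)
		force_sig_mceerr(BUS_MCEERR_AR,
				 (void __user *)p->mce_addr, PAGE_SHIFT);
}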
--- a/mm/memory-failure.c~mmhwpoison-return-ehwpoison-to-denote-that-the-page-has-already-been-poisoned
+++ a/mm/memory-failure.c
@@ -1253,7 +1253,7 @@ static int memory_failure_hugetlb(unsign
 	if (TestSetPageHWPoison(head)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 		       pfn);
-		return 0;
+		return -EHWPOISON;
 	}

 	num_poisoned_pages_inc();
@@ -1461,6 +1461,7 @@ try_again:
 	if (TestSetPageHWPoison(p)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 			pfn);
+		res = -EHWPOISON;
 		goto unlock_mutex;
 	}

From patchwork Fri Jun 25 01:40:01 2021
Date: Thu, 24 Jun 2021 18:40:01 -0700
From: Andrew Morton
Subject: [patch 20/24] mm/hwpoison: do not lock page again when me_huge_page() successfully recovers
Message-ID: <20210625014001.5BhWymxNg%akpm@linux-foundation.org>
In-Reply-To: <20210624183838.ac3161ca4a43989665ac8b2f@linux-foundation.org>

From: Naoya Horiguchi
Subject: mm/hwpoison: do not lock page again when me_huge_page() successfully recovers

Currently me_huge_page() temporarily unlocks the page to perform some
actions, then locks it again later.
My test case (which calls hard-offline on a tail page of a hugetlb
page, then accesses an address in that hugetlb range) showed that the
page allocation code detects this page lock on a buddy page and prints
a "BUG: Bad page state" message.  check_new_page_bad() does not
consider a page with __PG_HWPOISON to be a bad page, so this flag works
as a kind of filter, but the filtering doesn't help here because the
"bad page" is not the actual hwpoisoned page.

So stop locking the page again.  The actions to be taken depend on the
error's page type, so the unlocking should be done in the ->action()
callbacks.  Make that the rule and change all existing callbacks
accordingly.

Link: https://lkml.kernel.org/r/20210609072029.74645-1-nao.horiguchi@gmail.com
Fixes: 78bb920344b8 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error")
Signed-off-by: Naoya Horiguchi
Cc: Oscar Salvador
Cc: Michal Hocko
Cc: Tony Luck
Cc: "Aneesh Kumar K.V"
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton
---

 mm/memory-failure.c |   44 ++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)
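The new rule is a contract between page_action() and the ->action()
callbacks: the callback, not the caller, drops the page lock, exactly
once, on every exit path. A minimal sketch of the shape each callback
now follows; the recovery helper is invented for illustration, this is
not kernel source:

static int me_example(struct page *p, unsigned long pfn)
{
	int ret = MF_RECOVERED;

	if (!try_to_recover(p))		/* hypothetical helper */
		ret = MF_FAILED;

	/*
	 * Every exit path unlocks exactly once, so the caller never sees
	 * a still-locked page, and never re-locks one that a recovery
	 * path (like me_huge_page()) has handed back to the buddy
	 * allocator.
	 */
	unlock_page(p);
	return ret;
}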
--- a/mm/memory-failure.c~mm-hwpoison-do-not-lock-page-again-when-me_huge_page-successfully-recovers
+++ a/mm/memory-failure.c
@@ -658,6 +658,7 @@ static int truncate_error_page(struct pa
  */
 static int me_kernel(struct page *p, unsigned long pfn)
 {
+	unlock_page(p);
 	return MF_IGNORED;
 }

@@ -667,6 +668,7 @@ static int me_kernel(struct page *p, uns
 static int me_unknown(struct page *p, unsigned long pfn)
 {
 	pr_err("Memory failure: %#lx: Unknown page state\n", pfn);
+	unlock_page(p);
 	return MF_FAILED;
 }

@@ -675,6 +677,7 @@ static int me_unknown(struct page *p, un
  */
 static int me_pagecache_clean(struct page *p, unsigned long pfn)
 {
+	int ret;
 	struct address_space *mapping;

 	delete_from_lru_cache(p);
@@ -683,8 +686,10 @@ static int me_pagecache_clean(struct pag
 	 * For anonymous pages we're done the only reference left
 	 * should be the one m_f() holds.
 	 */
-	if (PageAnon(p))
-		return MF_RECOVERED;
+	if (PageAnon(p)) {
+		ret = MF_RECOVERED;
+		goto out;
+	}

 	/*
 	 * Now truncate the page in the page cache. This is really
@@ -698,7 +703,8 @@ static int me_pagecache_clean(struct pag
 		/*
 		 * Page has been teared down in the meanwhile
 		 */
-		return MF_FAILED;
+		ret = MF_FAILED;
+		goto out;
 	}

 	/*
@@ -706,7 +712,10 @@ static int me_pagecache_clean(struct pag
 	 *
 	 * Open: to take i_mutex or not for this? Right now we don't.
 	 */
-	return truncate_error_page(p, pfn, mapping);
+	ret = truncate_error_page(p, pfn, mapping);
+out:
+	unlock_page(p);
+	return ret;
 }

 /*
@@ -782,24 +791,26 @@ static int me_pagecache_dirty(struct pag
  */
 static int me_swapcache_dirty(struct page *p, unsigned long pfn)
 {
+	int ret;
+
 	ClearPageDirty(p);
 	/* Trigger EIO in shmem: */
 	ClearPageUptodate(p);

-	if (!delete_from_lru_cache(p))
-		return MF_DELAYED;
-	else
-		return MF_FAILED;
+	ret = delete_from_lru_cache(p) ? MF_FAILED : MF_DELAYED;
+	unlock_page(p);
+	return ret;
 }

 static int me_swapcache_clean(struct page *p, unsigned long pfn)
 {
+	int ret;
+
 	delete_from_swap_cache(p);

-	if (!delete_from_lru_cache(p))
-		return MF_RECOVERED;
-	else
-		return MF_FAILED;
+	ret = delete_from_lru_cache(p) ? MF_FAILED : MF_RECOVERED;
+	unlock_page(p);
+	return ret;
 }

 /*
@@ -820,6 +831,7 @@ static int me_huge_page(struct page *p,
 	mapping = page_mapping(hpage);
 	if (mapping) {
 		res = truncate_error_page(hpage, pfn, mapping);
+		unlock_page(hpage);
 	} else {
 		res = MF_FAILED;
 		unlock_page(hpage);
@@ -834,7 +846,6 @@ static int me_huge_page(struct page *p,
 			page_ref_inc(p);
 			res = MF_RECOVERED;
 		}
-		lock_page(hpage);
 	}

 	return res;
@@ -866,6 +877,8 @@ static struct page_state {
 	unsigned long mask;
 	unsigned long res;
 	enum mf_action_page_type type;
+
+	/* Callback ->action() has to unlock the relevant page inside it. */
 	int (*action)(struct page *p, unsigned long pfn);
 } error_states[] = {
 	{ reserved,	reserved,	MF_MSG_KERNEL,	me_kernel },
@@ -929,6 +942,7 @@ static int page_action(struct page_state
 	int result;
 	int count;

+	/* page p should be unlocked after returning from ps->action(). */
 	result = ps->action(p, pfn);

 	count = page_count(p) - 1;
@@ -1313,7 +1327,7 @@ static int memory_failure_hugetlb(unsign
 		goto out;
 	}

-	res = identify_page_state(pfn, p, page_flags);
+	return identify_page_state(pfn, p, page_flags);
 out:
 	unlock_page(head);
 	return res;
@@ -1596,6 +1610,8 @@ try_again:

 identify_page_state:
 	res = identify_page_state(pfn, p, page_flags);
+	mutex_unlock(&mf_mutex);
+	return res;
 unlock_page:
 	unlock_page(p);
 unlock_mutex: