diff mbox series

[4.9] mm, thp: make do_huge_pmd_wp_page() lock page for testing mapcount

Message ID 20210226162200.20548-1-vbabka@suse.cz
State New
Headers show
Series [4.9] mm, thp: make do_huge_pmd_wp_page() lock page for testing mapcount | expand

Commit Message

Vlastimil Babka Feb. 26, 2021, 4:22 p.m. UTC
Jann reported [1] a race between __split_huge_pmd_locked() and
page_trans_huge_map_swapcount() which can result in a page to be reused
instead of COWed. This was later assigned CVE-2020-29368.

This was fixed by commit c444eb564fb1 ("mm: thp: make the THP mapcount atomic
against __split_huge_pmd_locked()") by doing the split under the page lock,
while all users of page_trans_huge_map_swapcount() were already also under page
lock. The fix was backported also to 4.9 stable series.

When testing the backport on a 4.12 based kernel, Nicolai noticed the POC from
[1] still reproduces after backporting c444eb564fb1 and identified a missing
page lock in do_huge_pmd_wp_page() around the call to
page_trans_huge_mapcount(). The page lock was only added in ba3c4ce6def4 ("mm,
THP, swap: make reuse_swap_page() works for THP swapped out") in 4.14. The
commit also wrapped page_trans_huge_mapcount() into
page_trans_huge_map_swapcount() for the purposes of COW decisions.

I have verified that 4.9.y indeed also reproduces with the POC. Backporting
ba3c4ce6def4 alone however is not possible as it's part of a larger effort of
optimizing THP swapping, which would be risky to backport fully.

Therefore this 4.9-stable-only patch just wraps page_trans_huge_mapcount()
in page_trans_huge_mapcount() under page lock the same way as ba3c4ce6def4
does, but without the page_trans_huge_map_swapcount() part. Other callers
of page_trans_huge_mapcount() are all under page lock already. I have verified
the POC no longer reproduces afterwards.

[1] https://bugs.chromium.org/p/project-zero/issues/detail?id=2045

Reported-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/huge_memory.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Sasha Levin Feb. 26, 2021, 7:09 p.m. UTC | #1
On Fri, Feb 26, 2021 at 05:22:00PM +0100, Vlastimil Babka wrote:
>Jann reported [1] a race between __split_huge_pmd_locked() and
>page_trans_huge_map_swapcount() which can result in a page to be reused
>instead of COWed. This was later assigned CVE-2020-29368.
>
>This was fixed by commit c444eb564fb1 ("mm: thp: make the THP mapcount atomic
>against __split_huge_pmd_locked()") by doing the split under the page lock,
>while all users of page_trans_huge_map_swapcount() were already also under page
>lock. The fix was backported also to 4.9 stable series.
>
>When testing the backport on a 4.12 based kernel, Nicolai noticed the POC from
>[1] still reproduces after backporting c444eb564fb1 and identified a missing
>page lock in do_huge_pmd_wp_page() around the call to
>page_trans_huge_mapcount(). The page lock was only added in ba3c4ce6def4 ("mm,
>THP, swap: make reuse_swap_page() works for THP swapped out") in 4.14. The
>commit also wrapped page_trans_huge_mapcount() into
>page_trans_huge_map_swapcount() for the purposes of COW decisions.
>
>I have verified that 4.9.y indeed also reproduces with the POC. Backporting
>ba3c4ce6def4 alone however is not possible as it's part of a larger effort of
>optimizing THP swapping, which would be risky to backport fully.
>
>Therefore this 4.9-stable-only patch just wraps page_trans_huge_mapcount()
>in page_trans_huge_mapcount() under page lock the same way as ba3c4ce6def4
>does, but without the page_trans_huge_map_swapcount() part. Other callers
>of page_trans_huge_mapcount() are all under page lock already. I have verified
>the POC no longer reproduces afterwards.
>
>[1] https://bugs.chromium.org/p/project-zero/issues/detail?id=2045
>
>Reported-by: Nicolai Stange <nstange@suse.de>
>Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Queued up, thanks!
diff mbox series

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 05ca01ef97f7..14cd0ef33b62 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1022,6 +1022,19 @@  int do_huge_pmd_wp_page(struct fault_env *fe, pmd_t orig_pmd)
 	 * We can only reuse the page if nobody else maps the huge page or it's
 	 * part.
 	 */
+	if (!trylock_page(page)) {
+		get_page(page);
+		spin_unlock(fe->ptl);
+		lock_page(page);
+		spin_lock(fe->ptl);
+		if (unlikely(!pmd_same(*fe->pmd, orig_pmd))) {
+			unlock_page(page);
+			put_page(page);
+			goto out_unlock;
+		}
+		put_page(page);
+	}
+
 	if (page_trans_huge_mapcount(page, NULL) == 1) {
 		pmd_t entry;
 		entry = pmd_mkyoung(orig_pmd);
@@ -1029,8 +1042,10 @@  int do_huge_pmd_wp_page(struct fault_env *fe, pmd_t orig_pmd)
 		if (pmdp_set_access_flags(vma, haddr, fe->pmd, entry,  1))
 			update_mmu_cache_pmd(vma, fe->address, fe->pmd);
 		ret |= VM_FAULT_WRITE;
+		unlock_page(page);
 		goto out_unlock;
 	}
+	unlock_page(page);
 	get_page(page);
 	spin_unlock(fe->ptl);
 alloc: