From patchwork Tue Jun 6 17:58:34 2017
From: Will Deacon <will.deacon@arm.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, akpm@linux-foundation.org,
    kirill.shutemov@linux.intel.com, Punit.Agrawal@arm.com,
    mgorman@suse.de, steve.capper@arm.com
Subject: [PATCH 1/3] mm: numa: avoid waiting on freed migrated pages
Date: Tue, 6 Jun 2017 18:58:34 +0100
Message-Id: <1496771916-28203-2-git-send-email-will.deacon@arm.com>
In-Reply-To: <1496771916-28203-1-git-send-email-will.deacon@arm.com>
References: <1496771916-28203-1-git-send-email-will.deacon@arm.com>

From: Mark Rutland <mark.rutland@arm.com>

In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():

  // do_huge_pmd_numa_page              // migrate_misplaced_transhuge_page()
  // Holds 0 refs on page               // Holds 2 refs on page

  vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
  /* ... */
  if (pmd_trans_migrating(*vmf->pmd)) {
          page = pmd_page(*vmf->pmd);
          spin_unlock(vmf->ptl);
                                        ptl = pmd_lock(mm, pmd);
                                        if (page_count(page) != 2) {
                                                /* roll back */
                                        }
                                        /* ... */
                                        mlock_migrate_page(new_page, page);
                                        /* ... */
                                        spin_unlock(ptl);
                                        put_page(page);
                                        put_page(page); // page freed here
          wait_on_page_locked(page);
          goto out;
  }

This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.

We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait(). When we hit the race,
migrate_misplaced_transhuge_page() will see the reference and abort the
migration, as it may do today in other cases.

Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 mm/huge_memory.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Acked-by: Vlastimil Babka
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a84909cf20d3..88c6167f194d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1426,8 +1426,11 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 	 */
 	if (unlikely(pmd_trans_migrating(*vmf->pmd))) {
 		page = pmd_page(*vmf->pmd);
+		if (!get_page_unless_zero(page))
+			goto out_unlock;
 		spin_unlock(vmf->ptl);
 		wait_on_page_locked(page);
+		put_page(page);
 		goto out;
 	}
 
@@ -1459,9 +1462,12 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 
 	/* Migration could have started since the pmd_trans_migrating check */
 	if (!page_locked) {
+		page_nid = -1;
+		if (!get_page_unless_zero(page))
+			goto out_unlock;
 		spin_unlock(vmf->ptl);
 		wait_on_page_locked(page);
-		page_nid = -1;
+		put_page(page);
 		goto out;
 	}
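For readers unfamiliar with the pin-before-wait pattern, below is a
minimal userspace C11 model of it. This is an illustrative sketch, not
the kernel implementation: struct page is reduced to its refcount, and
the real get_page_unless_zero() is built (roughly) on atomic_add_unless
rather than an open-coded compare-exchange loop.

#include <stdatomic.h>
#include <stdbool.h>

struct page {
	atomic_int _refcount;
};

/* Model of get_page_unless_zero(): take a reference only if the
 * count is currently non-zero, i.e. the page is not already on its
 * way back to the allocator. */
static bool get_page_unless_zero(struct page *page)
{
	int count = atomic_load(&page->_refcount);

	while (count != 0) {
		/* On failure, 'count' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&page->_refcount,
						 &count, count + 1))
			return true;	/* page pinned */
	}
	return false;			/* count hit zero under us */
}

int main(void)
{
	struct page page = { ._refcount = 2 };	/* migration's two refs */

	if (get_page_unless_zero(&page)) {
		/* Safe to wait: our reference keeps the page (and its
		 * waiters flag) alive until we drop it. */
		atomic_fetch_sub(&page._refcount, 1);	/* put_page() */
	}
	return 0;
}

When migrate_misplaced_transhuge_page() has already dropped both of its
references, the count is zero, the pin fails, and the fault handler backs
out via out_unlock instead of sleeping on freed memory.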
From patchwork Tue Jun 6 17:58:35 2017
From: Will Deacon <will.deacon@arm.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, akpm@linux-foundation.org,
    kirill.shutemov@linux.intel.com, Punit.Agrawal@arm.com,
    mgorman@suse.de, steve.capper@arm.com
Subject: [PATCH 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses
Date: Tue, 6 Jun 2017 18:58:35 +0100
Message-Id: <1496771916-28203-3-git-send-email-will.deacon@arm.com>
In-Reply-To: <1496771916-28203-1-git-send-email-will.deacon@arm.com>
References: <1496771916-28203-1-git-send-email-will.deacon@arm.com>

page_ref_freeze and page_ref_unfreeze are designed to be used as a pair,
wrapping a critical section where struct pages can be modified without
having to worry about consistency for a concurrent fast-GUP.

Whilst page_ref_freeze has full barrier semantics due to its use of
atomic_cmpxchg, page_ref_unfreeze is implemented using atomic_set, which
doesn't provide any barrier semantics and allows the operation to be
reordered with respect to page modifications in the critical section.

This patch ensures that page_ref_unfreeze is ordered after any critical
section updates, by invoking smp_mb__before_atomic() prior to the
atomic_set.
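To make the hazard concrete, consider a hypothetical C11 userspace model
(an illustration, not the kernel code). With a relaxed store, a
critical-section write can become visible after the refcount is
restored, so a concurrent fast-GUP that takes a reference can still
observe stale page state. Release ordering on the store models the
minimum ordering required; the actual fix uses smp_mb__before_atomic(),
a full barrier, which is stronger.

#include <stdatomic.h>

struct page {
	atomic_int _refcount;
	unsigned long flags;	/* stand-in for critical-section data */
};

/* Broken: a relaxed store lets the flags update drift past the
 * unfreeze, so a reader that sees the restored refcount may still
 * see the old flags on a weakly ordered CPU. */
static void page_ref_unfreeze_broken(struct page *page, int count)
{
	page->flags |= 1;	/* critical-section update */
	atomic_store_explicit(&page->_refcount, count,
			      memory_order_relaxed);
}

/* Fixed: release ordering guarantees every write above is visible
 * before the refcount becomes non-zero, the property the kernel
 * obtains with smp_mb__before_atomic() + atomic_set(). */
static void page_ref_unfreeze_fixed(struct page *page, int count)
{
	page->flags |= 1;
	atomic_store_explicit(&page->_refcount, count,
			      memory_order_release);
}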
Shutemov" Acked-by: Steve Capper Signed-off-by: Will Deacon --- include/linux/page_ref.h | 1 + 1 file changed, 1 insertion(+) -- 2.1.4 diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h index 610e13271918..74d32d7905cb 100644 --- a/include/linux/page_ref.h +++ b/include/linux/page_ref.h @@ -174,6 +174,7 @@ static inline void page_ref_unfreeze(struct page *page, int count) VM_BUG_ON_PAGE(page_count(page) != 0, page); VM_BUG_ON(count == 0); + smp_mb__before_atomic(); atomic_set(&page->_refcount, count); if (page_ref_tracepoint_active(__tracepoint_page_ref_unfreeze)) __page_ref_unfreeze(page, count); From patchwork Tue Jun 6 17:58:36 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 103186 Delivered-To: patch@linaro.org Received: by 10.182.29.35 with SMTP id g3csp1387605obh; Tue, 6 Jun 2017 10:59:13 -0700 (PDT) X-Received: by 10.99.44.201 with SMTP id s192mr5063251pgs.84.1496771953436; Tue, 06 Jun 2017 10:59:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1496771953; cv=none; d=google.com; s=arc-20160816; b=UFoIW0EVV871p0GacCZOR4dT931OjWvYtfuZYEO3DYIKhFqnODxNNGsZS74eAngetJ 2FymR25K4pSjWxubNn9eHFILV7iXjyKVvHz2WTBcD00Q5btoNUXc13ID313h8o1HzQa6 3oMxXXdlkcYgIIMn+O298y58VT2kb9b8TLGJZP46mKSuNHkVFmgLZd0r+iVBZuQulqiI 3ou8p+Kemf/NSgxPPwLjSJksL53EpET3vdw6EOKAOzBR6Rgkx4tR2Vkr+ITytPlT8oaf eddMORSbRNLelRa1nt/YqaFRGQDdgWIzJFu7Cf+hO6BRcECWTjB+m+JnlEimJYnUzKy4 FH/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:arc-authentication-results; bh=x9qS0Db0ISOQoubf+CL+5jyqTkq9UQKjtf6VEzGEORU=; b=YDUkjZeGQ8KF2hcaQjk9dJfayB/m+08qChJTky8Q4rw8DgwPid9/Kf+M101zpmavl0 5775d1mwxHGWeXDhYlKx2HuvVdQlTSG9GdbmShQZ9rgkPlM8NtGJd7Jnnolw95bnjHlD lc1mcYbU/nEyX3XUnlPU1nYvyA3Or4GqTOGuC4Neo0fyzXlYqN6se7bWMM2Y1+Vdxsio UvHmHF1E3BBs9n9VEy2pnNRKZFAnp0RG+WOQgEICdUjAWHPSUz/bG2Gn4VhqR/LF1r4U 9OWn+KATZL1/Du8Ox52oe2gsmLrKCx7LDnG27X1xIhz3amo3cjtQcARiIjdyAJnOjTxH l7ig== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. 
From patchwork Tue Jun 6 17:58:36 2017
From: Will Deacon <will.deacon@arm.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, akpm@linux-foundation.org,
    kirill.shutemov@linux.intel.com, Punit.Agrawal@arm.com,
    mgorman@suse.de, steve.capper@arm.com
Subject: [PATCH 3/3] mm: migrate: Stabilise page count when migrating transparent hugepages
Date: Tue, 6 Jun 2017 18:58:36 +0100
Message-Id: <1496771916-28203-4-git-send-email-will.deacon@arm.com>
In-Reply-To: <1496771916-28203-1-git-send-email-will.deacon@arm.com>
References: <1496771916-28203-1-git-send-email-will.deacon@arm.com>

When migrating a transparent hugepage, migrate_misplaced_transhuge_page
guards itself against a concurrent fast-GUP of the page by checking that
the page count is equal to 2 before and after installing the new pmd.

If the page count changes, the pmd is reverted to the original entry;
however, there is a small window during which the new (possibly
writable) pmd is installed and the underlying page could be written by
userspace. Restoring the old pmd could therefore result in loss of data.

This patch fixes the problem by freezing the page count whilst updating
the page tables, which protects against a concurrent fast-GUP without
the need to restore the old pmd in the failure case (since the page
count can no longer change under our feet).

Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 mm/migrate.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

diff --git a/mm/migrate.c b/mm/migrate.c
index 89a0a1707f4c..8b21f1b1ec6e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1913,7 +1913,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	int page_lru = page_is_file_cache(page);
 	unsigned long mmun_start = address & HPAGE_PMD_MASK;
 	unsigned long mmun_end = mmun_start + HPAGE_PMD_SIZE;
-	pmd_t orig_entry;
 
 	/*
 	 * Rate-limit the amount of data that is being migrated to a node.
@@ -1956,8 +1955,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	/* Recheck the target PMD */
 	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
 	ptl = pmd_lock(mm, pmd);
-	if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
-fail_putback:
+	if (unlikely(!pmd_same(*pmd, entry) || !page_ref_freeze(page, 2))) {
 		spin_unlock(ptl);
 		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
@@ -1979,7 +1977,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 		goto out_unlock;
 	}
 
-	orig_entry = *pmd;
 	entry = mk_huge_pmd(new_page, vma->vm_page_prot);
 	entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 
@@ -1996,15 +1993,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	set_pmd_at(mm, mmun_start, pmd, entry);
 	update_mmu_cache_pmd(vma, address, &entry);
 
-	if (page_count(page) != 2) {
-		set_pmd_at(mm, mmun_start, pmd, orig_entry);
-		flush_pmd_tlb_range(vma, mmun_start, mmun_end);
-		mmu_notifier_invalidate_range(mm, mmun_start, mmun_end);
-		update_mmu_cache_pmd(vma, address, &entry);
-		page_remove_rmap(new_page, true);
-		goto fail_putback;
-	}
-
+	page_ref_unfreeze(page, 2);
 	mlock_migrate_page(new_page, page);
 	page_remove_rmap(page, true);
 	set_page_owner_migrate_reason(new_page, MR_NUMA_MISPLACED);
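Reduced to its essentials, the fixed path has the following shape. This
is a sketch building on the freeze/unfreeze model shown after patch 2:
migrate_thp_sketch is a hypothetical name, and the real function's
locking, TLB flushing and rmap transfer are elided.

#include <stdbool.h>

struct page;					/* as modelled above */
bool page_ref_freeze(struct page *page, int count);
void page_ref_unfreeze(struct page *page, int count);

static bool migrate_thp_sketch(struct page *page)
{
	/*
	 * Freeze before the new pmd ever becomes visible. If a
	 * fast-GUP reference sneaked in, the count is not exactly 2,
	 * freezing fails and we abort -- so userspace can never write
	 * through a pmd that might later be rolled back.
	 */
	if (!page_ref_freeze(page, 2))
		return false;		/* abort migration; caller retries */

	/*
	 * Count is pinned at zero: install the new pmd and migrate
	 * state across. The old fail_putback path (restore the pmd
	 * and potentially lose writes made through the new mapping)
	 * is no longer needed.
	 */
	/* set_pmd_at(...); update_mmu_cache_pmd(...); */

	page_ref_unfreeze(page, 2);
	return true;
}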