From patchwork Mon Dec 6 14:56:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 522213 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7BB6C433F5 for ; Mon, 6 Dec 2021 15:08:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347495AbhLFPMM (ORCPT ); Mon, 6 Dec 2021 10:12:12 -0500 Received: from dfw.source.kernel.org ([139.178.84.217]:58226 "EHLO dfw.source.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349943AbhLFPKh (ORCPT ); Mon, 6 Dec 2021 10:10:37 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7C0B36131B; Mon, 6 Dec 2021 15:07:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 616DEC341C2; Mon, 6 Dec 2021 15:07:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1638803227; bh=TJZKzLWWzQzY4J9lP7NWOnUWXJWBljjiurwrpLMTq2s=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TPCz6eui9MLTqCPhar0IUoFqq0lj52pgxpKJKrrHmz1ncrqJWgNOLhZlZIgOu2c0u Nvrv49CO/sTfDK0m0LlstR5G+EbR+NnQl5JfvftgeiCPahJwppmyTiVXa+Wm4F8ATj a1WvrARMvP5+vdRJzWFgJTO2zQxEHnhEdlu8cx1I= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Mike Kravetz , "Kirill A. Shutemov" , Naoya Horiguchi , Vlastimil Babka , Davidlohr Bueso , Michal Hocko , Jerome Glisse , Andrew Morton Subject: [PATCH 4.14 072/106] hugetlb: take PMD sharing into account when flushing tlb/caches Date: Mon, 6 Dec 2021 15:56:20 +0100 Message-Id: <20211206145557.963824995@linuxfoundation.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20211206145555.386095297@linuxfoundation.org> References: <20211206145555.386095297@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Mike Kravetz commit dff11abe280b47c21b804a8ace318e0638bb9a49 upstream. When fixing an issue with PMD sharing and migration, it was discovered via code inspection that other callers of huge_pmd_unshare potentially have an issue with cache and tlb flushing. Use the routine adjust_range_if_pmd_sharing_possible() to calculate worst case ranges for mmu notifiers. Ensure that this range is flushed if huge_pmd_unshare succeeds and unmaps a PUD_SUZE area. Link: http://lkml.kernel.org/r/20180823205917.16297-3-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz Acked-by: Kirill A. Shutemov Reviewed-by: Naoya Horiguchi Cc: Vlastimil Babka Cc: Davidlohr Bueso Cc: Michal Hocko Cc: Jerome Glisse Cc: Mike Kravetz Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman --- mm/hugetlb.c | 53 +++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 43 insertions(+), 10 deletions(-) --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3384,8 +3384,8 @@ void __unmap_hugepage_range(struct mmu_g struct page *page; struct hstate *h = hstate_vma(vma); unsigned long sz = huge_page_size(h); - const unsigned long mmun_start = start; /* For mmu_notifiers */ - const unsigned long mmun_end = end; /* For mmu_notifiers */ + unsigned long mmun_start = start; /* For mmu_notifiers */ + unsigned long mmun_end = end; /* For mmu_notifiers */ bool force_flush = false; WARN_ON(!is_vm_hugetlb_page(vma)); @@ -3398,6 +3398,11 @@ void __unmap_hugepage_range(struct mmu_g */ tlb_remove_check_page_size_change(tlb, sz); tlb_start_vma(tlb, vma); + + /* + * If sharing possible, alert mmu notifiers of worst case. + */ + adjust_range_if_pmd_sharing_possible(vma, &mmun_start, &mmun_end); mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end); address = start; for (; address < end; address += sz) { @@ -3508,12 +3513,23 @@ void unmap_hugepage_range(struct vm_area { struct mm_struct *mm; struct mmu_gather tlb; + unsigned long tlb_start = start; + unsigned long tlb_end = end; + + /* + * If shared PMDs were possibly used within this vma range, adjust + * start/end for worst case tlb flushing. + * Note that we can not be sure if PMDs are shared until we try to + * unmap pages. However, we want to make sure TLB flushing covers + * the largest possible range. + */ + adjust_range_if_pmd_sharing_possible(vma, &tlb_start, &tlb_end); mm = vma->vm_mm; - tlb_gather_mmu(&tlb, mm, start, end); + tlb_gather_mmu(&tlb, mm, tlb_start, tlb_end); __unmap_hugepage_range(&tlb, vma, start, end, ref_page); - tlb_finish_mmu(&tlb, start, end); + tlb_finish_mmu(&tlb, tlb_start, tlb_end); } /* @@ -4408,11 +4424,21 @@ unsigned long hugetlb_change_protection( pte_t pte; struct hstate *h = hstate_vma(vma); unsigned long pages = 0; + unsigned long f_start = start; + unsigned long f_end = end; + bool shared_pmd = false; + + /* + * In the case of shared PMDs, the area to flush could be beyond + * start/end. Set f_start/f_end to cover the maximum possible + * range if PMD sharing is possible. + */ + adjust_range_if_pmd_sharing_possible(vma, &f_start, &f_end); BUG_ON(address >= end); - flush_cache_range(vma, address, end); + flush_cache_range(vma, f_start, f_end); - mmu_notifier_invalidate_range_start(mm, start, end); + mmu_notifier_invalidate_range_start(mm, f_start, f_end); i_mmap_lock_write(vma->vm_file->f_mapping); for (; address < end; address += huge_page_size(h)) { spinlock_t *ptl; @@ -4423,6 +4449,7 @@ unsigned long hugetlb_change_protection( if (huge_pmd_unshare(mm, &address, ptep)) { pages++; spin_unlock(ptl); + shared_pmd = true; continue; } pte = huge_ptep_get(ptep); @@ -4458,12 +4485,18 @@ unsigned long hugetlb_change_protection( * Must flush TLB before releasing i_mmap_rwsem: x86's huge_pmd_unshare * may have cleared our pud entry and done put_page on the page table: * once we release i_mmap_rwsem, another task can do the final put_page - * and that page table be reused and filled with junk. + * and that page table be reused and filled with junk. If we actually + * did unshare a page of pmds, flush the range corresponding to the pud. */ - flush_hugetlb_tlb_range(vma, start, end); - mmu_notifier_invalidate_range(mm, start, end); + if (shared_pmd) { + flush_hugetlb_tlb_range(vma, f_start, f_end); + mmu_notifier_invalidate_range(mm, f_start, f_end); + } else { + flush_hugetlb_tlb_range(vma, start, end); + mmu_notifier_invalidate_range(mm, start, end); + } i_mmap_unlock_write(vma->vm_file->f_mapping); - mmu_notifier_invalidate_range_end(mm, start, end); + mmu_notifier_invalidate_range_end(mm, f_start, f_end); return pages << h->order; }