From patchwork Tue Feb 25 20:16:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ajay Kaher X-Patchwork-Id: 230562 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=DATE_IN_FUTURE_06_12, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id ED5B6C35DE1 for ; Tue, 25 Feb 2020 12:16:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CB65D2176D for ; Tue, 25 Feb 2020 12:16:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726587AbgBYMQB (ORCPT ); Tue, 25 Feb 2020 07:16:01 -0500 Received: from ex13-edg-ou-002.vmware.com ([208.91.0.190]:41349 "EHLO EX13-EDG-OU-002.vmware.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725788AbgBYMQB (ORCPT ); Tue, 25 Feb 2020 07:16:01 -0500 Received: from sc9-mailhost3.vmware.com (10.113.161.73) by EX13-EDG-OU-002.vmware.com (10.113.208.156) with Microsoft SMTP Server id 15.0.1156.6; Tue, 25 Feb 2020 04:15:58 -0800 Received: from akaher-lnx-dev.eng.vmware.com (unknown [10.110.19.203]) by sc9-mailhost3.vmware.com (Postfix) with ESMTP id D4F98405C3; Tue, 25 Feb 2020 04:15:53 -0800 (PST) From: Ajay Kaher To: CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v4 v4.4.y 1/7] mm: make page ref count overflow check tighter and more explicit Date: Wed, 26 Feb 2020 01:46:08 +0530 Message-ID: <1582661774-30925-2-git-send-email-akaher@vmware.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1582661774-30925-1-git-send-email-akaher@vmware.com> References: <1582661774-30925-1-git-send-email-akaher@vmware.com> MIME-Version: 1.0 Received-SPF: None (EX13-EDG-OU-002.vmware.com: akaher@vmware.com does not designate permitted sender hosts) Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Linus Torvalds commit f958d7b528b1b40c44cfda5eabe2d82760d868c3 upsteam. We have a VM_BUG_ON() to check that the page reference count doesn't underflow (or get close to overflow) by checking the sign of the count. That's all fine, but we actually want to allow people to use a "get page ref unless it's already very high" helper function, and we want that one to use the sign of the page ref (without triggering this VM_BUG_ON). Change the VM_BUG_ON to only check for small underflows (or _very_ close to overflowing), and ignore overflows which have strayed into negative territory. Acked-by: Matthew Wilcox Cc: Jann Horn Cc: stable@kernel.org Signed-off-by: Linus Torvalds [ 4.4.y backport notes: Ajay: Open-coded atomic refcount access due to missing page_ref_count() helper in 4.4.y Srivatsa: Added overflow check to get_page_foll() and related code. ] Signed-off-by: Srivatsa S. Bhat (VMware) Signed-off-by: Ajay Kaher --- include/linux/mm.h | 6 +++++- mm/internal.h | 5 +++-- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index ed653ba..701088e 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -488,6 +488,10 @@ static inline void get_huge_page_tail(struct page *page) extern bool __get_page_tail(struct page *page); +/* 127: arbitrary random number, small enough to assemble well */ +#define page_ref_zero_or_close_to_overflow(page) \ + ((unsigned int) atomic_read(&page->_count) + 127u <= 127u) + static inline void get_page(struct page *page) { if (unlikely(PageTail(page))) @@ -497,7 +501,7 @@ static inline void get_page(struct page *page) * Getting a normal page or the head of a compound page * requires to already have an elevated page->_count. */ - VM_BUG_ON_PAGE(atomic_read(&page->_count) <= 0, page); + VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page); atomic_inc(&page->_count); } diff --git a/mm/internal.h b/mm/internal.h index f63f439..67015e5 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -81,7 +81,8 @@ static inline void __get_page_tail_foll(struct page *page, * speculative page access (like in * page_cache_get_speculative()) on tail pages. */ - VM_BUG_ON_PAGE(atomic_read(&compound_head(page)->_count) <= 0, page); + VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(compound_head(page)), + page); if (get_page_head) atomic_inc(&compound_head(page)->_count); get_huge_page_tail(page); @@ -106,7 +107,7 @@ static inline void get_page_foll(struct page *page) * Getting a normal page or the head of a compound page * requires to already have an elevated page->_count. */ - VM_BUG_ON_PAGE(atomic_read(&page->_count) <= 0, page); + VM_BUG_ON_PAGE(page_ref_zero_or_close_to_overflow(page), page); atomic_inc(&page->_count); } } From patchwork Tue Feb 25 20:16:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ajay Kaher X-Patchwork-Id: 230561 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=DATE_IN_FUTURE_06_12, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41A2EC35DE1 for ; Tue, 25 Feb 2020 12:17:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 22D742176D for ; Tue, 25 Feb 2020 12:17:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727068AbgBYMR2 (ORCPT ); Tue, 25 Feb 2020 07:17:28 -0500 Received: from ex13-edg-ou-002.vmware.com ([208.91.0.190]:41375 "EHLO EX13-EDG-OU-002.vmware.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726019AbgBYMR2 (ORCPT ); Tue, 25 Feb 2020 07:17:28 -0500 Received: from sc9-mailhost3.vmware.com (10.113.161.73) by EX13-EDG-OU-002.vmware.com (10.113.208.156) with Microsoft SMTP Server id 15.0.1156.6; Tue, 25 Feb 2020 04:17:25 -0800 Received: from akaher-lnx-dev.eng.vmware.com (unknown [10.110.19.203]) by sc9-mailhost3.vmware.com (Postfix) with ESMTP id 6D82C405C3; Tue, 25 Feb 2020 04:17:20 -0800 (PST) From: Ajay Kaher To: CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , Hillf Danton Subject: [PATCH v4 v4.4.y 3/7] mm, gup: remove broken VM_BUG_ON_PAGE compound check for hugepages Date: Wed, 26 Feb 2020 01:46:10 +0530 Message-ID: <1582661774-30925-4-git-send-email-akaher@vmware.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1582661774-30925-1-git-send-email-akaher@vmware.com> References: <1582661774-30925-1-git-send-email-akaher@vmware.com> MIME-Version: 1.0 Received-SPF: None (EX13-EDG-OU-002.vmware.com: akaher@vmware.com does not designate permitted sender hosts) Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Will Deacon commit a3e328556d41bb61c55f9dfcc62d6a826ea97b85 upstream. When operating on hugepages with DEBUG_VM enabled, the GUP code checks the compound head for each tail page prior to calling page_cache_add_speculative. This is broken, because on the fast-GUP path (where we don't hold any page table locks) we can be racing with a concurrent invocation of split_huge_page_to_list. split_huge_page_to_list deals with this race by using page_ref_freeze to freeze the page and force concurrent GUPs to fail whilst the component pages are modified. This modification includes clearing the compound_head field for the tail pages, so checking this prior to a successful call to page_cache_add_speculative can lead to false positives: In fact, page_cache_add_speculative *already* has this check once the page refcount has been successfully updated, so we can simply remove the broken calls to VM_BUG_ON_PAGE. Link: http://lkml.kernel.org/r/20170522133604.11392-2-punit.agrawal@arm.com Signed-off-by: Will Deacon Signed-off-by: Punit Agrawal Acked-by: Steve Capper Acked-by: Kirill A. Shutemov Cc: Aneesh Kumar K.V Cc: Catalin Marinas Cc: Naoya Horiguchi Cc: Mark Rutland Cc: Hillf Danton Cc: Michal Hocko Cc: Mike Kravetz Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Srivatsa S. Bhat (VMware) Signed-off-by: Ajay Kaher Signed-off-by: Vlastimil Babka --- mm/gup.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/mm/gup.c b/mm/gup.c index 45c544b..6e7cfaa 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1136,7 +1136,6 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT); tail = page; do { - VM_BUG_ON_PAGE(compound_head(page) != head, page); pages[*nr] = page; (*nr)++; page++; @@ -1183,7 +1182,6 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr, page = head + ((addr & ~PUD_MASK) >> PAGE_SHIFT); tail = page; do { - VM_BUG_ON_PAGE(compound_head(page) != head, page); pages[*nr] = page; (*nr)++; page++; @@ -1226,7 +1224,6 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr, page = head + ((addr & ~PGDIR_MASK) >> PAGE_SHIFT); tail = page; do { - VM_BUG_ON_PAGE(compound_head(page) != head, page); pages[*nr] = page; (*nr)++; page++; From patchwork Tue Feb 25 20:16:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ajay Kaher X-Patchwork-Id: 230560 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=DATE_IN_FUTURE_06_12, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6481C35DE1 for ; Tue, 25 Feb 2020 12:17:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B96F32176D for ; Tue, 25 Feb 2020 12:17:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728983AbgBYMRv (ORCPT ); Tue, 25 Feb 2020 07:17:51 -0500 Received: from ex13-edg-ou-002.vmware.com ([208.91.0.190]:20188 "EHLO EX13-EDG-OU-002.vmware.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726019AbgBYMRv (ORCPT ); Tue, 25 Feb 2020 07:17:51 -0500 Received: from sc9-mailhost3.vmware.com (10.113.161.73) by EX13-EDG-OU-002.vmware.com (10.113.208.156) with Microsoft SMTP Server id 15.0.1156.6; Tue, 25 Feb 2020 04:17:46 -0800 Received: from akaher-lnx-dev.eng.vmware.com (unknown [10.110.19.203]) by sc9-mailhost3.vmware.com (Postfix) with ESMTP id B1F5C405C3; Tue, 25 Feb 2020 04:17:41 -0800 (PST) From: Ajay Kaher To: CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v4 v4.4.y 5/7] mm: prevent get_user_pages() from overflowing page refcount Date: Wed, 26 Feb 2020 01:46:12 +0530 Message-ID: <1582661774-30925-6-git-send-email-akaher@vmware.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1582661774-30925-1-git-send-email-akaher@vmware.com> References: <1582661774-30925-1-git-send-email-akaher@vmware.com> MIME-Version: 1.0 Received-SPF: None (EX13-EDG-OU-002.vmware.com: akaher@vmware.com does not designate permitted sender hosts) Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Linus Torvalds commit 8fde12ca79aff9b5ba951fce1a2641901b8d8e64 upstream. If the page refcount wraps around past zero, it will be freed while there are still four billion references to it. One of the possible avenues for an attacker to try to make this happen is by doing direct IO on a page multiple times. This patch makes get_user_pages() refuse to take a new page reference if there are already more than two billion references to the page. Reported-by: Jann Horn Acked-by: Matthew Wilcox Cc: stable@kernel.org Signed-off-by: Linus Torvalds [ 4.4.y backport notes: Ajay: - Added local variable 'err' with-in follow_hugetlb_page() from 2be7cfed995e, to resolve compilation error - Added page_ref_count() - Added missing refcount overflow checks on x86 and s390 (Vlastimil, thanks for this change) Srivatsa: - Replaced call to get_page_foll() with try_get_page_foll() ] Signed-off-by: Srivatsa S. Bhat (VMware) Signed-off-by: Ajay Kaher Signed-off-by: Vlastimil Babka --- arch/s390/mm/gup.c | 6 ++++-- arch/x86/mm/gup.c | 9 ++++++++- include/linux/mm.h | 5 +++++ mm/gup.c | 42 +++++++++++++++++++++++++++++++++--------- mm/hugetlb.c | 16 +++++++++++++++- 5 files changed, 65 insertions(+), 13 deletions(-) diff --git a/arch/s390/mm/gup.c b/arch/s390/mm/gup.c index 7ad41be..4f7dad3 100644 --- a/arch/s390/mm/gup.c +++ b/arch/s390/mm/gup.c @@ -37,7 +37,8 @@ static inline int gup_pte_range(pmd_t *pmdp, pmd_t pmd, unsigned long addr, return 0; VM_BUG_ON(!pfn_valid(pte_pfn(pte))); page = pte_page(pte); - if (!page_cache_get_speculative(page)) + if (WARN_ON_ONCE(page_ref_count(page) < 0) + || !page_cache_get_speculative(page)) return 0; if (unlikely(pte_val(pte) != pte_val(*ptep))) { put_page(page); @@ -76,7 +77,8 @@ static inline int gup_huge_pmd(pmd_t *pmdp, pmd_t pmd, unsigned long addr, refs++; } while (addr += PAGE_SIZE, addr != end); - if (!page_cache_add_speculative(head, refs)) { + if (WARN_ON_ONCE(page_ref_count(head) < 0) + || !page_cache_add_speculative(head, refs)) { *nr -= refs; return 0; } diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c index 7d2542a..6612d53 100644 --- a/arch/x86/mm/gup.c +++ b/arch/x86/mm/gup.c @@ -95,7 +95,10 @@ static noinline int gup_pte_range(pmd_t pmd, unsigned long addr, } VM_BUG_ON(!pfn_valid(pte_pfn(pte))); page = pte_page(pte); - get_page(page); + if (unlikely(!try_get_page(page))) { + pte_unmap(ptep); + return 0; + } SetPageReferenced(page); pages[*nr] = page; (*nr)++; @@ -132,6 +135,8 @@ static noinline int gup_huge_pmd(pmd_t pmd, unsigned long addr, refs = 0; head = pmd_page(pmd); + if (WARN_ON_ONCE(page_ref_count(head) <= 0)) + return 0; page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT); do { VM_BUG_ON_PAGE(compound_head(page) != head, page); @@ -208,6 +213,8 @@ static noinline int gup_huge_pud(pud_t pud, unsigned long addr, refs = 0; head = pud_page(pud); + if (WARN_ON_ONCE(page_ref_count(head) <= 0)) + return 0; page = head + ((addr & ~PUD_MASK) >> PAGE_SHIFT); do { VM_BUG_ON_PAGE(compound_head(page) != head, page); diff --git a/include/linux/mm.h b/include/linux/mm.h index 52edaf1..31a4500 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -488,6 +488,11 @@ static inline void get_huge_page_tail(struct page *page) extern bool __get_page_tail(struct page *page); +static inline int page_ref_count(struct page *page) +{ + return atomic_read(&page->_count); +} + /* 127: arbitrary random number, small enough to assemble well */ #define page_ref_zero_or_close_to_overflow(page) \ ((unsigned int) atomic_read(&page->_count) + 127u <= 127u) diff --git a/mm/gup.c b/mm/gup.c index 71e9d00..4c58578 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -126,8 +126,12 @@ retry: } } - if (flags & FOLL_GET) - get_page_foll(page); + if (flags & FOLL_GET) { + if (unlikely(!try_get_page_foll(page))) { + page = ERR_PTR(-ENOMEM); + goto out; + } + } if (flags & FOLL_TOUCH) { if ((flags & FOLL_WRITE) && !pte_dirty(pte) && !PageDirty(page)) @@ -289,7 +293,10 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address, goto unmap; *page = pte_page(*pte); } - get_page(*page); + if (unlikely(!try_get_page(*page))) { + ret = -ENOMEM; + goto unmap; + } out: ret = 0; unmap: @@ -1053,6 +1060,20 @@ struct page *get_dump_page(unsigned long addr) */ #ifdef CONFIG_HAVE_GENERIC_RCU_GUP +/* + * Return the compund head page with ref appropriately incremented, + * or NULL if that failed. + */ +static inline struct page *try_get_compound_head(struct page *page, int refs) +{ + struct page *head = compound_head(page); + if (WARN_ON_ONCE(atomic_read(&head->_count) < 0)) + return NULL; + if (unlikely(!page_cache_add_speculative(head, refs))) + return NULL; + return head; +} + #ifdef __HAVE_ARCH_PTE_SPECIAL static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, int write, struct page **pages, int *nr) @@ -1083,6 +1104,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, VM_BUG_ON(!pfn_valid(pte_pfn(pte))); page = pte_page(pte); + if (WARN_ON_ONCE(page_ref_count(page) < 0)) + goto pte_unmap; + if (!page_cache_get_speculative(page)) goto pte_unmap; @@ -1139,8 +1163,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr, refs++; } while (addr += PAGE_SIZE, addr != end); - head = compound_head(pmd_page(orig)); - if (!page_cache_add_speculative(head, refs)) { + head = try_get_compound_head(pmd_page(orig), refs); + if (!head) { *nr -= refs; return 0; } @@ -1185,8 +1209,8 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr, refs++; } while (addr += PAGE_SIZE, addr != end); - head = compound_head(pud_page(orig)); - if (!page_cache_add_speculative(head, refs)) { + head = try_get_compound_head(pud_page(orig), refs); + if (!head) { *nr -= refs; return 0; } @@ -1227,8 +1251,8 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr, refs++; } while (addr += PAGE_SIZE, addr != end); - head = compound_head(pgd_page(orig)); - if (!page_cache_add_speculative(head, refs)) { + head = try_get_compound_head(pgd_page(orig), refs); + if (!head) { *nr -= refs; return 0; } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index fd932e7..3a1501e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3886,6 +3886,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long vaddr = *position; unsigned long remainder = *nr_pages; struct hstate *h = hstate_vma(vma); + int err = -EFAULT; while (vaddr < vma->vm_end && remainder) { pte_t *pte; @@ -3957,6 +3958,19 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, pfn_offset = (vaddr & ~huge_page_mask(h)) >> PAGE_SHIFT; page = pte_page(huge_ptep_get(pte)); + + /* + * Instead of doing 'try_get_page_foll()' below in the same_page + * loop, just check the count once here. + */ + if (unlikely(page_count(page) <= 0)) { + if (pages) { + spin_unlock(ptl); + remainder = 0; + err = -ENOMEM; + break; + } + } same_page: if (pages) { pages[i] = mem_map_offset(page, pfn_offset); @@ -3983,7 +3997,7 @@ same_page: *nr_pages = remainder; *position = vaddr; - return i ? i : -EFAULT; + return i ? i : err; } unsigned long hugetlb_change_protection(struct vm_area_struct *vma, From patchwork Tue Feb 25 20:16:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ajay Kaher X-Patchwork-Id: 230559 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=DATE_IN_FUTURE_06_12, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1989C35DF5 for ; Tue, 25 Feb 2020 12:18:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9EA382176D for ; Tue, 25 Feb 2020 12:18:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729034AbgBYMSL (ORCPT ); Tue, 25 Feb 2020 07:18:11 -0500 Received: from ex13-edg-ou-001.vmware.com ([208.91.0.189]:59982 "EHLO EX13-EDG-OU-001.vmware.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726019AbgBYMSL (ORCPT ); Tue, 25 Feb 2020 07:18:11 -0500 Received: from sc9-mailhost3.vmware.com (10.113.161.73) by EX13-EDG-OU-001.vmware.com (10.113.208.155) with Microsoft SMTP Server id 15.0.1156.6; Tue, 25 Feb 2020 04:18:08 -0800 Received: from akaher-lnx-dev.eng.vmware.com (unknown [10.110.19.203]) by sc9-mailhost3.vmware.com (Postfix) with ESMTP id 0DD63408C1; Tue, 25 Feb 2020 04:18:02 -0800 (PST) From: Ajay Kaher To: CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v4 v4.4.y 7/7] fs: prevent page refcount overflow in pipe_buf_get Date: Wed, 26 Feb 2020 01:46:14 +0530 Message-ID: <1582661774-30925-8-git-send-email-akaher@vmware.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1582661774-30925-1-git-send-email-akaher@vmware.com> References: <1582661774-30925-1-git-send-email-akaher@vmware.com> MIME-Version: 1.0 Received-SPF: None (EX13-EDG-OU-001.vmware.com: akaher@vmware.com does not designate permitted sender hosts) Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Matthew Wilcox commit 15fab63e1e57be9fdb5eec1bbc5916e9825e9acb upstream. Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure. Reported-by: Jann Horn Signed-off-by: Matthew Wilcox Cc: stable@kernel.org Signed-off-by: Linus Torvalds [ 4.4.y backport notes: Regarding the change in generic_pipe_buf_get(), note that page_cache_get() is the same as get_page(). See mainline commit 09cbfeaf1a5a6 "mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros" for context. ] Signed-off-by: Ajay Kaher Signed-off-by: Vlastimil Babka --- fs/fuse/dev.c | 12 ++++++------ fs/pipe.c | 4 ++-- fs/splice.c | 12 ++++++++++-- include/linux/pipe_fs_i.h | 10 ++++++---- kernel/trace/trace.c | 6 +++++- 5 files changed, 29 insertions(+), 15 deletions(-) diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index 36a5df9..16891f5 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -2031,10 +2031,8 @@ static ssize_t fuse_dev_splice_write(struct pipe_inode_info *pipe, rem += pipe->bufs[(pipe->curbuf + idx) & (pipe->buffers - 1)].len; ret = -EINVAL; - if (rem < len) { - pipe_unlock(pipe); - goto out; - } + if (rem < len) + goto out_free; rem = len; while (rem) { @@ -2052,7 +2050,9 @@ static ssize_t fuse_dev_splice_write(struct pipe_inode_info *pipe, pipe->curbuf = (pipe->curbuf + 1) & (pipe->buffers - 1); pipe->nrbufs--; } else { - pipe_buf_get(pipe, ibuf); + if (!pipe_buf_get(pipe, ibuf)) + goto out_free; + *obuf = *ibuf; obuf->flags &= ~PIPE_BUF_FLAG_GIFT; obuf->len = rem; @@ -2075,13 +2075,13 @@ static ssize_t fuse_dev_splice_write(struct pipe_inode_info *pipe, ret = fuse_dev_do_write(fud, &cs, len); pipe_lock(pipe); +out_free: for (idx = 0; idx < nbuf; idx++) { struct pipe_buffer *buf = &bufs[idx]; buf->ops->release(pipe, buf); } pipe_unlock(pipe); -out: kfree(bufs); return ret; } diff --git a/fs/pipe.c b/fs/pipe.c index 1e7263b..6534470 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -178,9 +178,9 @@ EXPORT_SYMBOL(generic_pipe_buf_steal); * in the tee() system call, when we duplicate the buffers in one * pipe into another. */ -void generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) +bool generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { - page_cache_get(buf->page); + return try_get_page(buf->page); } EXPORT_SYMBOL(generic_pipe_buf_get); diff --git a/fs/splice.c b/fs/splice.c index fde1263..57ccc58 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -1876,7 +1876,11 @@ retry: * Get a reference to this pipe buffer, * so we can copy the contents over. */ - pipe_buf_get(ipipe, ibuf); + if (!pipe_buf_get(ipipe, ibuf)) { + if (ret == 0) + ret = -EFAULT; + break; + } *obuf = *ibuf; /* @@ -1948,7 +1952,11 @@ static int link_pipe(struct pipe_inode_info *ipipe, * Get a reference to this pipe buffer, * so we can copy the contents over. */ - pipe_buf_get(ipipe, ibuf); + if (!pipe_buf_get(ipipe, ibuf)) { + if (ret == 0) + ret = -EFAULT; + break; + } obuf = opipe->bufs + nbuf; *obuf = *ibuf; diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h index 10876f3..0b28b65 100644 --- a/include/linux/pipe_fs_i.h +++ b/include/linux/pipe_fs_i.h @@ -112,18 +112,20 @@ struct pipe_buf_operations { /* * Get a reference to the pipe buffer. */ - void (*get)(struct pipe_inode_info *, struct pipe_buffer *); + bool (*get)(struct pipe_inode_info *, struct pipe_buffer *); }; /** * pipe_buf_get - get a reference to a pipe_buffer * @pipe: the pipe that the buffer belongs to * @buf: the buffer to get a reference to + * + * Return: %true if the reference was successfully obtained. */ -static inline void pipe_buf_get(struct pipe_inode_info *pipe, +static inline __must_check bool pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { - buf->ops->get(pipe, buf); + return buf->ops->get(pipe, buf); } /* Differs from PIPE_BUF in that PIPE_SIZE is the length of the actual @@ -148,7 +150,7 @@ struct pipe_inode_info *alloc_pipe_info(void); void free_pipe_info(struct pipe_inode_info *); /* Generic pipe buffer ops functions */ -void generic_pipe_buf_get(struct pipe_inode_info *, struct pipe_buffer *); +bool generic_pipe_buf_get(struct pipe_inode_info *, struct pipe_buffer *); int generic_pipe_buf_confirm(struct pipe_inode_info *, struct pipe_buffer *); int generic_pipe_buf_steal(struct pipe_inode_info *, struct pipe_buffer *); void generic_pipe_buf_release(struct pipe_inode_info *, struct pipe_buffer *); diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index ae00e68..7fe8d04 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5731,12 +5731,16 @@ static void buffer_pipe_buf_release(struct pipe_inode_info *pipe, buf->private = 0; } -static void buffer_pipe_buf_get(struct pipe_inode_info *pipe, +static bool buffer_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf) { struct buffer_ref *ref = (struct buffer_ref *)buf->private; + if (ref->ref > INT_MAX/2) + return false; + ref->ref++; + return true; } /* Pipe buffer operations for a buffer. */