From patchwork Thu Feb  6 16:18:49 2014
X-Patchwork-Submitter: Steve Capper <steve.capper@linaro.org>
X-Patchwork-Id: 24263
From: Steve Capper <steve.capper@linaro.org>
To: linux-arm-kernel@lists.infradead.org
Cc: will.deacon@arm.com, catalin.marinas@arm.com, linux@arm.linux.org.uk,
	chanho61.park@samsung.com, zishen.lim@linaro.org, patches@linaro.org,
	gary.robertson@linaro.org, michael.hudson@linaro.org,
	christoffer.dall@linaro.org, Steve Capper <steve.capper@linaro.org>
Subject: [RFC PATCH V2 2/4] arm: mm: implement get_user_pages_fast
Date: Thu,  6 Feb 2014 16:18:49 +0000
Message-Id: <1391703531-12845-3-git-send-email-steve.capper@linaro.org>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1391703531-12845-1-git-send-email-steve.capper@linaro.org>
References: <1391703531-12845-1-git-send-email-steve.capper@linaro.org>

An implementation of get_user_pages_fast for ARM. It is based loosely
on the PowerPC implementation.

We disable interrupts in the walker to prevent the call_rcu_sched
pagetable freeing code from running under us.

We also explicitly fire an IPI in the Transparent HugePage splitting
case to prevent splits from interfering with the fast_gup walker. As
THP splits are relatively rare, this should not have a noticeable
overhead.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
---
 arch/arm/include/asm/pgtable-3level.h |   6 +
 arch/arm/mm/Makefile                  |   2 +-
 arch/arm/mm/gup.c                     | 251 ++++++++++++++++++++++++++++++++++
 3 files changed, 258 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/mm/gup.c

diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 4f95039..4392c40 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -214,6 +214,12 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define pmd_trans_huge(pmd)	(pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT))
 #define pmd_trans_splitting(pmd) (pmd_val(pmd) & PMD_SECT_SPLITTING)
+
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+#define __HAVE_ARCH_PMDP_SPLITTING_FLUSH
+void pmdp_splitting_flush(struct vm_area_struct *vma, unsigned long address,
+			  pmd_t *pmdp);
+#endif
 #endif
 
 #define PMD_BIT_FUNC(fn,op) \
diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index ecfe6e5..45cc6d8 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -6,7 +6,7 @@ obj-y				:= dma-mapping.o extable.o fault.o init.o \
 				   iomap.o
 
 obj-$(CONFIG_MMU)		+= fault-armv.o flush.o idmap.o ioremap.o \
-				   mmap.o pgd.o mmu.o
+				   mmap.o pgd.o mmu.o gup.o
 
 ifneq ($(CONFIG_MMU),y)
 obj-y				+= nommu.o
diff --git a/arch/arm/mm/gup.c b/arch/arm/mm/gup.c
new file mode 100644
index 0000000..2dcacad
--- /dev/null
+++ b/arch/arm/mm/gup.c
@@ -0,0 +1,251 @@
+/*
+ * arch/arm/mm/gup.c
+ *
+ * Copyright (C) 2014 Linaro Ltd.
+ *
+ * Based on arch/powerpc/mm/gup.c which is:
+ * Copyright (C) 2008 Nick Piggin
+ * Copyright (C) 2008 Novell Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+#include <linux/rwsem.h>
+#include <linux/hugetlb.h>
+#include <asm/pgtable.h>
+
+static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
+			 int write, struct page **pages, int *nr)
+{
+	pte_t *ptep, *ptem;
+	int ret = 0;
+
+	ptem = ptep = pte_offset_map(&pmd, addr);
+	do {
+		pte_t pte = ACCESS_ONCE(*ptep);
+		struct page *page;
+
+		if (!pte_present_user(pte) || (write && !pte_write(pte)))
+			goto pte_unmap;
+
+		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
+		page = pte_page(pte);
+
+		if (!page_cache_get_speculative(page))
+			goto pte_unmap;
+
+		if (unlikely(pte_val(pte) != pte_val(*ptep))) {
+			put_page(page);
+			goto pte_unmap;
+		}
+
+		pages[*nr] = page;
+		(*nr)++;
+
+	} while (ptep++, addr += PAGE_SIZE, addr != end);
+
+	ret = 1;
+
+pte_unmap:
+	pte_unmap(ptem);
+	return ret;
+}
+
+static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
+		unsigned long end, int write, struct page **pages, int *nr)
+{
+	struct page *head, *page, *tail;
+	int refs;
+
+	if (!pmd_present(orig) || (write && !pmd_write(orig)))
+		return 0;
+
+	refs = 0;
+	head = pmd_page(orig);
+	page = head + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
+	tail = page;
+	do {
+		VM_BUG_ON(compound_head(page) != head);
+		pages[*nr] = page;
+		(*nr)++;
+		page++;
+		refs++;
+	} while (addr += PAGE_SIZE, addr != end);
+
+	if (!page_cache_add_speculative(head, refs)) {
+		*nr -= refs;
+		return 0;
+	}
+
+	if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
+		*nr -= refs;
+		while (refs--)
+			put_page(head);
+		return 0;
+	}
+
+	/*
+	 * Any tail pages need their mapcount reference taken before we
+	 * return. (This allows the THP code to bump their ref count when
+	 * they are split into base pages).
+	 */
+	while (refs--) {
+		if (PageTail(tail))
+			get_huge_page_tail(tail);
+		tail++;
+	}
+
+	return 1;
+}
+
+static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pmd_t *pmdp;
+
+	pmdp = pmd_offset(&pud, addr);
+	do {
+		pmd_t pmd = ACCESS_ONCE(*pmdp);
+		next = pmd_addr_end(addr, end);
+		if (pmd_none(pmd) || pmd_trans_splitting(pmd))
+			return 0;
+
+		if (unlikely(pmd_thp_or_huge(pmd))) {
+			if (!gup_huge_pmd(pmd, pmdp, addr, next, write,
+					pages, nr))
+				return 0;
+		} else {
+			if (!gup_pte_range(pmd, addr, next, write, pages, nr))
+				return 0;
+		}
+	} while (pmdp++, addr = next, addr != end);
+
+	return 1;
+}
+
+static int gup_pud_range(pgd_t *pgdp, unsigned long addr, unsigned long end,
+		int write, struct page **pages, int *nr)
+{
+	unsigned long next;
+	pud_t *pudp;
+
+	pudp = pud_offset(pgdp, addr);
+	do {
+		pud_t pud = ACCESS_ONCE(*pudp);
+		next = pud_addr_end(addr, end);
+		if (pud_none(pud))
+			return 0;
+		else if (!gup_pmd_range(pud, addr, next, write, pages, nr))
+			return 0;
+	} while (pudp++, addr = next, addr != end);
+
+	return 1;
+}
+
+/*
+ * Like get_user_pages_fast() except it's IRQ-safe in that it won't fall
+ * back to the regular GUP.
+ */
+int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			  struct page **pages)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned long addr, len, end;
+	unsigned long next, flags;
+	pgd_t *pgdp;
+	int nr = 0;
+
+	start &= PAGE_MASK;
+	addr = start;
+	len = (unsigned long) nr_pages << PAGE_SHIFT;
+	end = start + len;
+
+	if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ,
+					start, len)))
+		return 0;
+
+	/*
+	 * Disable interrupts, we use the nested form as we can already
+	 * have interrupts disabled by get_futex_key.
+	 *
+	 * With interrupts disabled, we block page table pages from being
+	 * freed from under us. See mmu_gather_tlb in asm-generic/tlb.h
+	 * for more details.
+	 */
+
+	local_irq_save(flags);
+	pgdp = pgd_offset(mm, addr);
+	do {
+		next = pgd_addr_end(addr, end);
+		if (pgd_none(*pgdp))
+			break;
+		else if (!gup_pud_range(pgdp, addr, next, write, pages, &nr))
+			break;
+	} while (pgdp++, addr = next, addr != end);
+	local_irq_restore(flags);
+
+	return nr;
+}
+
+int get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			struct page **pages)
+{
+	struct mm_struct *mm = current->mm;
+	int nr, ret;
+
+	start &= PAGE_MASK;
+	nr = __get_user_pages_fast(start, nr_pages, write, pages);
+	ret = nr;
+
+	if (nr < nr_pages) {
+		/* Try to get the remaining pages with get_user_pages */
+		start += nr << PAGE_SHIFT;
+		pages += nr;
+
+		down_read(&mm->mmap_sem);
+		ret = get_user_pages(current, mm, start,
+				     nr_pages - nr, write, 0, pages, NULL);
+		up_read(&mm->mmap_sem);
+
+		/* Have to be a bit careful with return values */
+		if (nr > 0) {
+			if (ret < 0)
+				ret = nr;
+			else
+				ret += nr;
+		}
+	}
+
+	return ret;
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#ifdef CONFIG_HAVE_RCU_TABLE_FREE
+static void thp_splitting_flush_sync(void *arg)
+{
+}
+
+void pmdp_splitting_flush(struct vm_area_struct *vma, unsigned long address,
+			  pmd_t *pmdp)
+{
+	pmd_t pmd = pmd_mksplitting(*pmdp);
+	VM_BUG_ON(address & ~PMD_MASK);
+	set_pmd_at(vma->vm_mm, address, pmdp, pmd);
+
+	/* dummy IPI to serialise against fast_gup */
+	smp_call_function(thp_splitting_flush_sync, NULL, 1);
+}
+#endif /* CONFIG_HAVE_RCU_TABLE_FREE */
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
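
For context on the calling convention implemented above: get_user_pages_fast()
may pin fewer pages than requested (or return a negative errno if nothing could
be pinned), and callers are expected to handle partial success. Below is a
minimal, hypothetical caller sketch, not part of this patch; the helper name
pin_user_buffer and its error policy are illustrative only.

	#include <linux/mm.h>

	/*
	 * Illustrative only: pin a user buffer for later kernel access,
	 * treating a partial pin as a failure and releasing any pages
	 * that were pinned.
	 */
	static int pin_user_buffer(unsigned long uaddr, int nr_pages,
				   struct page **pages)
	{
		int i, pinned;

		/* May pin fewer than nr_pages; -ve errno if none pinned. */
		pinned = get_user_pages_fast(uaddr, nr_pages,
					     1 /* write */, pages);
		if (pinned < 0)
			return pinned;

		if (pinned < nr_pages) {
			/* Partial pin: drop what we got, report failure. */
			for (i = 0; i < pinned; i++)
				put_page(pages[i]);
			return -EFAULT;
		}

		return 0;
	}

A caller like this benefits from the fast path whenever the whole range is
already faulted in, and only pays the mmap_sem cost when the internal fallback
to get_user_pages() is taken.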