From patchwork Sun Aug 21 12:18:34 2016
X-Patchwork-Id: 74379
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com,
	will.deacon@arm.com
Cc: mark.rutland@arm.com, steve.capper@linaro.org, leif.lindholm@linaro.org,
	agraf@suse.de, jeremy.linton@arm.com, cov@codeaurora.org
Subject: [RFC/RFT PATCH] arm64: mm: allow userland to run with one fewer
 translation level
Date: Sun, 21 Aug 2016 14:18:34 +0200
Message-Id: <1471781914-16681-1-git-send-email-ard.biesheuvel@linaro.org>

The choice of VA size is usually decided by the requirements on the
kernel side, particularly the size of the linear region, which must be
large
enough to cover all of physical memory, including the holes in between,
which may be very large (~512 GB on some systems).

Since running with more translation levels could potentially result in
a performance penalty due to additional TLB pressure, this patch allows
the kernel to be configured so that it runs with one fewer translation
level on the userland side. Rather than modifying all the compile-time
logic to deal with folded PUDs or PMDs, we simply allocate the root
table and the next table adjacently, so that we can point TTBR0_EL1 to
the next table (and update TCR_EL1.T0SZ accordingly).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
This is just a proof of concept. *If* there is a performance penalty
associated with using 4 translation levels instead of 3, I would expect
this patch to compensate for that, given that the additional TLB
pressure should be primarily on the userland side. Benchmark results
are highly appreciated.

As a bonus, this would fix the horrible yet real JIT issues we have
been seeing with 48-bit VA configurations. In other words, I expect
this to be an easier sell than simply limiting TASK_SIZE to 47 bits
(assuming anyone can show a benchmark where this patch has a positive
impact on the performance of a 48-bit/4-level kernel), and distros can
ship kernels that work on all hardware (including Freescale and X-Gene
systems with >= 64 GB) but don't break their JITs.

This patch is most likely broken for 16k/47-bit configs, but I didn't
bother to fix that before having the discussion.
 arch/arm64/Kconfig                   | 38 +++++++++++++++++++-
 arch/arm64/include/asm/memory.h      |  3 +-
 arch/arm64/include/asm/mmu_context.h |  5 ++-
 arch/arm64/include/asm/proc-fns.h    | 10 +++---
 arch/arm64/mm/pgd.c                  | 15 +++++---
 5 files changed, 58 insertions(+), 13 deletions(-)

-- 
2.7.4

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index bc3f00f586f1..6b68371af550 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -509,7 +509,7 @@ config ARM64_64K_PAGES
 endchoice
 
 choice
-	prompt "Virtual address space size"
+	prompt "Kernel virtual address space size"
 	default ARM64_VA_BITS_39 if ARM64_4K_PAGES
 	default ARM64_VA_BITS_47 if ARM64_16K_PAGES
 	default ARM64_VA_BITS_42 if ARM64_64K_PAGES
@@ -539,6 +539,34 @@ config ARM64_VA_BITS_48
 
 endchoice
 
+choice
+	prompt "Userland virtual address space size"
+	default ARM64_USER_VA_BITS_39 if ARM64_4K_PAGES
+	default ARM64_USER_VA_BITS_47 if ARM64_16K_PAGES
+	default ARM64_USER_VA_BITS_42 if ARM64_64K_PAGES
+
+config ARM64_USER_VA_BITS_36
+	bool "36-bit"
+	depends on ARM64_VA_BITS_36 || ARM64_VA_BITS_47
+
+config ARM64_USER_VA_BITS_39
+	bool "39-bit"
+	depends on ARM64_4K_PAGES
+
+config ARM64_USER_VA_BITS_42
+	bool "42-bit"
+	depends on ARM64_64K_PAGES
+
+config ARM64_USER_VA_BITS_47
+	bool "47-bit"
+	depends on ARM64_16K_PAGES && !ARM64_VA_BITS_36
+
+config ARM64_USER_VA_BITS_48
+	bool "48-bit"
+	depends on ARM64_VA_BITS_48
+
+endchoice
+
 config ARM64_VA_BITS
 	int
 	default 36 if ARM64_VA_BITS_36
@@ -547,6 +575,14 @@ config ARM64_VA_BITS
 	default 47 if ARM64_VA_BITS_47
 	default 48 if ARM64_VA_BITS_48
 
+config ARM64_USER_VA_BITS
+	int
+	default 36 if ARM64_USER_VA_BITS_36
+	default 39 if ARM64_USER_VA_BITS_39
+	default 42 if ARM64_USER_VA_BITS_42
+	default 47 if ARM64_USER_VA_BITS_47
+	default 48 if ARM64_USER_VA_BITS_48
+
 config CPU_BIG_ENDIAN
 	bool "Build big-endian kernel"
 	help
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 31b73227b41f..605ace198c99 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -64,6 +64,7 @@
  * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area.
  */
 #define VA_BITS			(CONFIG_ARM64_VA_BITS)
+#define USER_VA_BITS		(CONFIG_ARM64_USER_VA_BITS)
 #define VA_START		(UL(0xffffffffffffffff) << VA_BITS)
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) << (VA_BITS - 1))
 #define KIMAGE_VADDR		(MODULES_END)
@@ -74,7 +75,7 @@
 #define PCI_IO_END		(VMEMMAP_START - SZ_2M)
 #define PCI_IO_START		(PCI_IO_END - PCI_IO_SIZE)
 #define FIXADDR_TOP		(PCI_IO_START - SZ_2M)
-#define TASK_SIZE_64		(UL(1) << VA_BITS)
+#define TASK_SIZE_64		(UL(1) << USER_VA_BITS)
 
 #ifdef CONFIG_COMPAT
 #define TASK_SIZE_32		UL(0x100000000)
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index b1892a0dbcb0..a605d671b79a 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -67,8 +67,7 @@ extern u64 idmap_t0sz;
 
 static inline bool __cpu_uses_extended_idmap(void)
 {
-	return (!IS_ENABLED(CONFIG_ARM64_VA_BITS_48) &&
-		unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS)));
+	return idmap_t0sz != TCR_T0SZ(USER_VA_BITS);
 }
 
 /*
@@ -90,7 +89,7 @@ static inline void __cpu_set_tcr_t0sz(unsigned long t0sz)
 	: "r"(t0sz), "I"(TCR_T0SZ_OFFSET), "I"(TCR_TxSZ_WIDTH));
 }
 
-#define cpu_set_default_tcr_t0sz()	__cpu_set_tcr_t0sz(TCR_T0SZ(VA_BITS))
+#define cpu_set_default_tcr_t0sz()	__cpu_set_tcr_t0sz(TCR_T0SZ(USER_VA_BITS))
 #define cpu_set_idmap_tcr_t0sz()	__cpu_set_tcr_t0sz(idmap_t0sz)
 
 /*
diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h
index 14ad6e4e87d1..3d61f942adec 100644
--- a/arch/arm64/include/asm/proc-fns.h
+++ b/arch/arm64/include/asm/proc-fns.h
@@ -35,10 +35,12 @@ extern u64 cpu_do_resume(phys_addr_t ptr, u64 idmap_ttbr);
 
 #include <asm/memory.h>
 
-#define cpu_switch_mm(pgd,mm)				\
-do {							\
-	BUG_ON(pgd == swapper_pg_dir);			\
-	cpu_do_switch_mm(virt_to_phys(pgd),mm);		\
+#define cpu_switch_mm(pgd,mm)					\
+do {								\
+	pgd_t *__pgd = (VA_BITS == USER_VA_BITS || pgd == idmap_pg_dir)	\
+		       ? pgd : (void *)pgd + PAGE_SIZE;		\
+	BUG_ON(pgd == swapper_pg_dir);				\
+	cpu_do_switch_mm(virt_to_phys(__pgd),mm);		\
 } while (0)
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index ae11d4e03d0e..1912c7b10ebb 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -32,15 +32,22 @@ static struct kmem_cache *pgd_cache;
 
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	if (PGD_SIZE == PAGE_SIZE)
+	if (USER_VA_BITS < VA_BITS) {
+		pgd_t *pgd = (pgd_t *)__get_free_pages(PGALLOC_GFP, 1);
+
+		set_pgd(pgd,
+			__pgd(__pa((void *)pgd + PAGE_SIZE) | PUD_TYPE_TABLE));
+		return pgd;
+	} else if (PGD_SIZE == PAGE_SIZE) {
 		return (pgd_t *)__get_free_page(PGALLOC_GFP);
-	else
+	} else {
 		return kmem_cache_alloc(pgd_cache, PGALLOC_GFP);
+	}
 }
 
 void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
-	if (PGD_SIZE == PAGE_SIZE)
+	if (USER_VA_BITS < VA_BITS || PGD_SIZE == PAGE_SIZE)
 		free_page((unsigned long)pgd);
 	else
 		kmem_cache_free(pgd_cache, pgd);
@@ -48,7 +55,7 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 
 void __init pgd_cache_init(void)
 {
-	if (PGD_SIZE == PAGE_SIZE)
+	if (USER_VA_BITS < VA_BITS || PGD_SIZE == PAGE_SIZE)
 		return;
 
 	/*