From patchwork Tue Feb 16 12:52:42 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 62021 Delivered-To: patch@linaro.org Received: by 10.112.43.199 with SMTP id y7csp1647625lbl; Tue, 16 Feb 2016 04:59:58 -0800 (PST) X-Received: by 10.67.1.209 with SMTP id bi17mr30908416pad.150.1455627598689; Tue, 16 Feb 2016 04:59:58 -0800 (PST) Return-Path: Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id 82si51146849pfq.218.2016.02.16.04.59.58 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 16 Feb 2016 04:59:58 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org designates 2001:1868:205::9 as permitted sender) client-ip=2001:1868:205::9; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org designates 2001:1868:205::9 as permitted sender) smtp.mailfrom=linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org; dkim=neutral (body hash did not verify) header.i=@linaro.org Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1aVfDI-0004jx-EE; Tue, 16 Feb 2016 12:58:44 +0000 Received: from mail-wm0-x236.google.com ([2a00:1450:400c:c09::236]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1aVf8V-00086z-CN for linux-arm-kernel@lists.infradead.org; Tue, 16 Feb 2016 12:53:54 +0000 Received: by mail-wm0-x236.google.com with SMTP id b205so108577836wmb.1 for ; Tue, 16 Feb 2016 04:53:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=jBuw0qT/3sA6jkrdxaivuikV/Y6+ZuJm2u8hWePBx3U=; b=GFdW3YGXKeWFhOPigbXbRDchOE52GKKgstAqdm4V1NlWnWu+HxJSm+FOELEb989nrW 73X6ykA3Q6/atWb+1rh7rjRGn8D3bUmi3Kqxx2rpd3YxsKlJirkFlTxTA78JCRMK+Wg4 eqJCHX9AJzytjw12sC/JCNEUCtIBBeP4wzKzg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jBuw0qT/3sA6jkrdxaivuikV/Y6+ZuJm2u8hWePBx3U=; b=NxpnyqdrmIf6jzW3qUAbz1vUDFTZZEkcN0/UYtFnXVXtKrj/fnu1m5PXYpJbrnA15V F4TGFjIdq/I88rix75k2yLuBvRwdBT5yaJQ3MixaSRE1Wdtio6nWL+6S0yT34qzKt+ot mlQ6989rP1OIbTyHwn8qCOK7bWlS02V9vBTEKmLr4WCoh9jxgDRO+I5ANA/GQndP0uFg FUexiUUC79KoYozr9zzu82TP2GRKBglG9ABRQ2R6RlUCsWLla/ahHmp46Q53nYdqEC1t NLKMyZxTPhubTglsuWZQzsMEkElQaSNeaCZa6HSoAHq5TFVG23+Wiarht18Q6szORPn2 ZvGw== X-Gm-Message-State: AG10YOShBfegiFvonCYqa3p9Szwr5S8HXmcHMmDaeYJ1lllOk9+JaUq5rRgc5zq8+DQI34ah X-Received: by 10.194.91.205 with SMTP id cg13mr21677131wjb.166.1455627205852; Tue, 16 Feb 2016 04:53:25 -0800 (PST) Received: from localhost.localdomain ([195.55.142.58]) by smtp.gmail.com with ESMTPSA id x66sm20454816wmb.20.2016.02.16.04.53.23 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 16 Feb 2016 04:53:24 -0800 (PST) From: Ard Biesheuvel To: linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com, mark.rutland@arm.com, keescook@chromium.org Subject: [PATCH v6sub1 11/11] arm64: allow kernel Image to be loaded anywhere in physical memory Date: Tue, 16 Feb 2016 13:52:42 +0100 Message-Id: <1455627162-31600-12-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.5.0 In-Reply-To: <1455627162-31600-1-git-send-email-ard.biesheuvel@linaro.org> References: <1455627162-31600-1-git-send-email-ard.biesheuvel@linaro.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20160216_045347_916310_E52F1465 X-CRM114-Status: GOOD ( 31.39 ) X-Spam-Score: -2.7 (--) X-Spam-Report: SpamAssassin version 3.4.0 on bombadil.infradead.org summary: Content analysis details: (-2.7 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [2a00:1450:400c:c09:0:0:0:236 listed in] [list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: marc.zyngier@arm.com, Ard Biesheuvel , james.morse@arm.com, laurentiu.tudor@nxp.com MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org This relaxes the kernel Image placement requirements, so that it may be placed at any 2 MB aligned offset in physical memory. This is accomplished by ignoring PHYS_OFFSET when installing memblocks, and accounting for the apparent virtual offset of the kernel Image. As a result, virtual address references below PAGE_OFFSET are correctly mapped onto physical references into the kernel Image regardless of where it sits in memory. Special care needs to be taken for dealing with memory limits passed via mem=, since the generic implementation clips memory top down, which may clip the kernel image itself if it is loaded high up in memory. To deal with this case, we simply add back the memory covering the kernel image, which may result in more memory to be retained than was passed as a mem= parameter. Since mem= should not be considered a production feature, a panic notifier handler is installed that dumps the memory limit at panic time if one was set. Signed-off-by: Ard Biesheuvel --- Documentation/arm64/booting.txt | 20 ++++--- arch/arm64/include/asm/boot.h | 6 ++ arch/arm64/include/asm/kernel-pgtable.h | 12 ++++ arch/arm64/include/asm/kvm_asm.h | 17 +----- arch/arm64/include/asm/memory.h | 18 +++--- arch/arm64/kernel/head.S | 6 +- arch/arm64/kernel/image.h | 13 ++-- arch/arm64/mm/init.c | 63 +++++++++++++++++++- arch/arm64/mm/mmu.c | 3 + 9 files changed, 119 insertions(+), 39 deletions(-) -- 2.5.0 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt index 701d39d3171a..56d6d8b796db 100644 --- a/Documentation/arm64/booting.txt +++ b/Documentation/arm64/booting.txt @@ -109,7 +109,13 @@ Header notes: 1 - 4K 2 - 16K 3 - 64K - Bits 3-63: Reserved. + Bit 3: Kernel physical placement + 0 - 2MB aligned base should be as close as possible + to the base of DRAM, since memory below it is not + accessible via the linear mapping + 1 - 2MB aligned base may be anywhere in physical + memory + Bits 4-63: Reserved. - When image_size is zero, a bootloader should attempt to keep as much memory as possible free for use by the kernel immediately after the @@ -117,14 +123,14 @@ Header notes: depending on selected features, and is effectively unbound. The Image must be placed text_offset bytes from a 2MB aligned base -address near the start of usable system RAM and called there. Memory -below that base address is currently unusable by Linux, and therefore it -is strongly recommended that this location is the start of system RAM. -The region between the 2 MB aligned base address and the start of the -image has no special significance to the kernel, and may be used for -other purposes. +address anywhere in usable system RAM and called there. The region +between the 2 MB aligned base address and the start of the image has no +special significance to the kernel, and may be used for other purposes. At least image_size bytes from the start of the image must be free for use by the kernel. +NOTE: versions prior to v4.6 cannot make use of memory below the +physical offset of the Image so it is recommended that the Image be +placed as close as possible to the start of system RAM. Any memory described to the kernel (even that below the start of the image) which is not marked as reserved from the kernel (e.g., with a diff --git a/arch/arm64/include/asm/boot.h b/arch/arm64/include/asm/boot.h index 81151b67b26b..ebf2481889c3 100644 --- a/arch/arm64/include/asm/boot.h +++ b/arch/arm64/include/asm/boot.h @@ -11,4 +11,10 @@ #define MIN_FDT_ALIGN 8 #define MAX_FDT_SIZE SZ_2M +/* + * arm64 requires the kernel image to placed + * TEXT_OFFSET bytes beyond a 2 MB aligned base + */ +#define MIN_KIMG_ALIGN SZ_2M + #endif diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h index a459714ee29e..5c6375d8528b 100644 --- a/arch/arm64/include/asm/kernel-pgtable.h +++ b/arch/arm64/include/asm/kernel-pgtable.h @@ -79,5 +79,17 @@ #define SWAPPER_MM_MMUFLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS) #endif +/* + * To make optimal use of block mappings when laying out the linear + * mapping, round down the base of physical memory to a size that can + * be mapped efficiently, i.e., either PUD_SIZE (4k granule) or PMD_SIZE + * (64k granule), or a multiple that can be mapped using contiguous bits + * in the page tables: 32 * PMD_SIZE (16k granule) + */ +#ifdef CONFIG_ARM64_64K_PAGES +#define ARM64_MEMSTART_ALIGN SZ_512M +#else +#define ARM64_MEMSTART_ALIGN SZ_1G +#endif #endif /* __ASM_KERNEL_PGTABLE_H */ diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 31b56008f412..054ac25e7c2e 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -26,24 +26,9 @@ #define KVM_ARM64_DEBUG_DIRTY_SHIFT 0 #define KVM_ARM64_DEBUG_DIRTY (1 << KVM_ARM64_DEBUG_DIRTY_SHIFT) -#define kvm_ksym_ref(sym) ((void *)&sym + kvm_ksym_shift) +#define kvm_ksym_ref(sym) phys_to_virt((u64)&sym - kimage_voffset) #ifndef __ASSEMBLY__ -#if __GNUC__ > 4 -#define kvm_ksym_shift (PAGE_OFFSET - KIMAGE_VADDR) -#else -/* - * GCC versions 4.9 and older will fold the constant below into the addend of - * the reference to 'sym' above if kvm_ksym_shift is declared static or if the - * constant is used directly. However, since we use the small code model for - * the core kernel, the reference to 'sym' will be emitted as a adrp/add pair, - * with a +/- 4 GB range, resulting in linker relocation errors if the shift - * is sufficiently large. So prevent the compiler from folding the shift into - * the addend, by making the shift a variable with external linkage. - */ -__weak u64 kvm_ksym_shift = PAGE_OFFSET - KIMAGE_VADDR; -#endif - struct kvm; struct kvm_vcpu; diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index 18b7e77c7495..3239e4d78e0d 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -24,6 +24,7 @@ #include #include #include +#include #include /* @@ -88,10 +89,10 @@ #define __virt_to_phys(x) ({ \ phys_addr_t __x = (phys_addr_t)(x); \ __x >= PAGE_OFFSET ? (__x - PAGE_OFFSET + PHYS_OFFSET) : \ - (__x - KIMAGE_VADDR + PHYS_OFFSET); }) + (__x - kimage_voffset); }) #define __phys_to_virt(x) ((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET)) -#define __phys_to_kimg(x) ((unsigned long)((x) - PHYS_OFFSET + KIMAGE_VADDR)) +#define __phys_to_kimg(x) ((unsigned long)((x) + kimage_voffset)) /* * Convert a page to/from a physical address @@ -133,15 +134,16 @@ extern phys_addr_t memstart_addr; /* PHYS_OFFSET - the physical address of the start of memory. */ -#define PHYS_OFFSET ({ memstart_addr; }) +#define PHYS_OFFSET ({ BUG_ON(memstart_addr & 1); memstart_addr; }) + +/* the offset between the kernel virtual and physical mappings */ +extern u64 kimage_voffset; /* - * The maximum physical address that the linear direct mapping - * of system RAM can cover. (PAGE_OFFSET can be interpreted as - * a 2's complement signed quantity and negated to derive the - * maximum size of the linear mapping.) + * Allow all memory at the discovery stage. We will clip it later. */ -#define MAX_MEMBLOCK_ADDR ({ memstart_addr - PAGE_OFFSET - 1; }) +#define MIN_MEMBLOCK_ADDR 0 +#define MAX_MEMBLOCK_ADDR U64_MAX /* * PFNs are used to describe any physical page; this means diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S index 04d38a058b19..05b98289093e 100644 --- a/arch/arm64/kernel/head.S +++ b/arch/arm64/kernel/head.S @@ -428,7 +428,11 @@ __mmap_switched: and x4, x4, #~(THREAD_SIZE - 1) msr sp_el0, x4 // Save thread_info str_l x21, __fdt_pointer, x5 // Save FDT pointer - str_l x24, memstart_addr, x6 // Save PHYS_OFFSET + + ldr x4, =KIMAGE_VADDR // Save the offset between + sub x4, x4, x24 // the kernel virtual and + str_l x4, kimage_voffset, x5 // physical mappings + mov x29, #0 #ifdef CONFIG_KASAN bl kasan_early_init diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h index 999633bd7294..c9c62cab25a4 100644 --- a/arch/arm64/kernel/image.h +++ b/arch/arm64/kernel/image.h @@ -42,15 +42,18 @@ #endif #ifdef CONFIG_CPU_BIG_ENDIAN -#define __HEAD_FLAG_BE 1 +#define __HEAD_FLAG_BE 1 #else -#define __HEAD_FLAG_BE 0 +#define __HEAD_FLAG_BE 0 #endif -#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2) +#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2) -#define __HEAD_FLAGS ((__HEAD_FLAG_BE << 0) | \ - (__HEAD_FLAG_PAGE_SIZE << 1)) +#define __HEAD_FLAG_PHYS_BASE 1 + +#define __HEAD_FLAGS ((__HEAD_FLAG_BE << 0) | \ + (__HEAD_FLAG_PAGE_SIZE << 1) | \ + (__HEAD_FLAG_PHYS_BASE << 3)) /* * These will output as part of the Image header, which should be little-endian diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 52d1fc465885..c0ea54bd9995 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -35,8 +35,10 @@ #include #include +#include #include #include +#include #include #include #include @@ -46,7 +48,13 @@ #include "mm.h" -phys_addr_t memstart_addr __read_mostly = 0; +/* + * We need to be able to catch inadvertent references to memstart_addr + * that occur (potentially in generic code) before arm64_memblock_init() + * executes, which assigns it its actual value. So use a default value + * that cannot be mistaken for a real physical address. + */ +phys_addr_t memstart_addr __read_mostly = ~0ULL; phys_addr_t arm64_dma_phys_limit __read_mostly; #ifdef CONFIG_BLK_DEV_INITRD @@ -160,7 +168,33 @@ early_param("mem", early_mem); void __init arm64_memblock_init(void) { - memblock_enforce_memory_limit(memory_limit); + const s64 linear_region_size = -(s64)PAGE_OFFSET; + + /* + * Select a suitable value for the base of physical memory. + */ + memstart_addr = round_down(memblock_start_of_DRAM(), + ARM64_MEMSTART_ALIGN); + + /* + * Remove the memory that we will not be able to cover with the + * linear mapping. Take care not to clip the kernel which may be + * high in memory. + */ + memblock_remove(max(memstart_addr + linear_region_size, __pa(_end)), + ULLONG_MAX); + if (memblock_end_of_DRAM() > linear_region_size) + memblock_remove(0, memblock_end_of_DRAM() - linear_region_size); + + /* + * Apply the memory limit if it was set. Since the kernel may be loaded + * high up in memory, add back the kernel region that must be accessible + * via the linear mapping. + */ + if (memory_limit != (phys_addr_t)ULLONG_MAX) { + memblock_enforce_memory_limit(memory_limit); + memblock_add(__pa(_text), (u64)(_end - _text)); + } /* * Register the kernel text, kernel data, initrd, and initial @@ -386,3 +420,28 @@ static int __init keepinitrd_setup(char *__unused) __setup("keepinitrd", keepinitrd_setup); #endif + +/* + * Dump out memory limit information on panic. + */ +static int dump_mem_limit(struct notifier_block *self, unsigned long v, void *p) +{ + if (memory_limit != (phys_addr_t)ULLONG_MAX) { + pr_emerg("Memory Limit: %llu MB\n", memory_limit >> 20); + } else { + pr_emerg("Memory Limit: none\n"); + } + return 0; +} + +static struct notifier_block mem_limit_notifier = { + .notifier_call = dump_mem_limit, +}; + +static int __init register_mem_limit_dumper(void) +{ + atomic_notifier_chain_register(&panic_notifier_list, + &mem_limit_notifier); + return 0; +} +__initcall(register_mem_limit_dumper); diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 6eb8e49889d0..b83c6bf7f90d 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -46,6 +46,9 @@ u64 idmap_t0sz = TCR_T0SZ(VA_BITS); +u64 kimage_voffset __read_mostly; +EXPORT_SYMBOL(kimage_voffset); + /* * Empty_zero_page is a special page that is used for zero-initialized data * and COW.