From patchwork Fri Jan 18 17:43:24 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 156031
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, marc.zyngier@arm.com,
 mark.rutland@arm.com, will.deacon@arm.com, catalin.marinas@arm.com,
 james.morse@arm.com, Ard Biesheuvel
Subject: [RFC PATCH] arm64: efi: fix chicken-and-egg problem in memreserve code
Date: Fri, 18 Jan 2019 18:43:24 +0100
Message-Id: <20190118174324.24715-1-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.20.1
X-Mailing-List: linux-efi@vger.kernel.org

Unfortunately, it appears that the recently introduced and repaired
EFI memreserve code is still broken.

Originally, we applied all memory reservations passed via the EFI table
before doing any memblock allocations. However, this turned out to be
problematic, given that the number of reservations is unbounded, and
a GICv3 system will reserve a block of memory for each CPU, resulting
in hundreds of reservations.

We 'fixed' this by deferring the reservations in the memblock table
until after we enabled memblock resizing. However, to reach this point,
we must have mapped DRAM and the kernel, which itself relies on some
memblock allocations for page tables. Also, memblock resizing itself
relies on the ability to invoke memblock_alloc() to reallocate those
tables themselves.

So this is a nice chicken-and-egg problem which is rather difficult to
fix cleanly. So instead of a clean solution, I came up with the patch
below.

The idea is to set a memblock allocation limit below the lowest
reservation entry that occurs in the memreserve table. This way, we
can map DRAM and the kernel and enable memblock resizing without running
the risk of clobbering those reserved regions. After applying all the
reservations, the memblock limit restriction is lifted again, allowing
the boot to proceed normally.

Signed-off-by: Ard Biesheuvel
---
The problem with this approach is that it is not guaranteed that the
temporary limit will leave enough memory to allocate the page tables
and resize the memblock reserved array. Since this is only 10s of KBs,
it is unlikely to break in practice, but some pathological behavior may
still occur, which is rather nasty :-(

 arch/arm64/include/asm/memblock.h |  1 +
 arch/arm64/kernel/setup.c         |  2 +-
 arch/arm64/mm/init.c              | 19 ++++++++++
 drivers/firmware/efi/efi.c        | 39 +++++++++++++++++++-
 include/linux/efi.h               |  7 ++++
 5 files changed, 66 insertions(+), 2 deletions(-)

-- 
2.20.1

diff --git a/arch/arm64/include/asm/memblock.h b/arch/arm64/include/asm/memblock.h
index 6afeed2467f1..461d093e67cf 100644
--- a/arch/arm64/include/asm/memblock.h
+++ b/arch/arm64/include/asm/memblock.h
@@ -17,5 +17,6 @@
 #define __ASM_MEMBLOCK_H
 
 extern void arm64_memblock_init(void);
+extern void arm64_memblock_post_paging_init(void);
 
 #endif
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 4b0e1231625c..a76b165e3f16 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -313,7 +313,7 @@ void __init setup_arch(char **cmdline_p)
 	arm64_memblock_init();
 
 	paging_init();
-	efi_apply_persistent_mem_reservations();
+	arm64_memblock_post_paging_init();
 
 	acpi_table_upgrade();
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 7205a9085b4d..6e95b52b5d07 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -355,6 +355,7 @@ static void __init fdt_enforce_memory_region(void)
 void __init arm64_memblock_init(void)
 {
 	const s64 linear_region_size = -(s64)PAGE_OFFSET;
+	u64 memblock_limit;
 
 	/* Handle linux,usable-memory-range property */
 	fdt_enforce_memory_region();
@@ -399,6 +400,18 @@ void __init arm64_memblock_init(void)
 		memblock_add(__pa_symbol(_text), (u64)(_end - _text));
 	}
 
+	/*
+	 * Set a temporary memblock allocation limit so that we don't clobber
+	 * regions that we will want to reserve later. However, since the
+	 * number of reserved regions that can be described this way is
+	 * basically unbounded, we have to defer applying the actual
+	 * reservations until after we have mapped enough memory to allow
+	 * the memblock resize routines to run.
+	 */
+	efi_prepare_persistent_mem_reservations(&memblock_limit);
+	if (memblock_limit < memory_limit)
+		memblock_set_current_limit(memblock_limit);
+
 	if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) {
 		/*
 		 * Add back the memory we just removed if it results in the
@@ -666,3 +679,9 @@ static int __init register_mem_limit_dumper(void)
 	return 0;
 }
 __initcall(register_mem_limit_dumper);
+
+void __init arm64_memblock_post_paging_init(void)
+{
+	memblock_set_current_limit(memory_limit);
+	efi_apply_persistent_mem_reservations();
+}
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 4c46ff6f2242..643e38f5e200 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -595,11 +595,13 @@ int __init efi_config_parse_tables(void *config_tables, int count, int sz,
 	return 0;
 }
 
-int __init efi_apply_persistent_mem_reservations(void)
+int __init efi_prepare_persistent_mem_reservations(u64 *lowest)
 {
 	if (efi.mem_reserve != EFI_INVALID_TABLE_ADDR) {
 		unsigned long prsv = efi.mem_reserve;
 
+		*lowest = U64_MAX;
+
 		while (prsv) {
 			struct linux_efi_memreserve *rsv;
 			u8 *p;
@@ -622,6 +624,41 @@ int __init efi_apply_persistent_mem_reservations(void)
 			/* reserve the entry itself */
 			memblock_reserve(prsv, EFI_MEMRESERVE_SIZE(rsv->size));
 
+			for (i = 0; i < atomic_read(&rsv->count); i++)
+				*lowest = min(*lowest, rsv->entry[i].base);
+
+			prsv = rsv->next;
+			early_memunmap(p, PAGE_SIZE);
+		}
+	}
+
+	return 0;
+}
+
+int __init efi_apply_persistent_mem_reservations(void)
+{
+	if (efi.mem_reserve != EFI_INVALID_TABLE_ADDR) {
+		unsigned long prsv = efi.mem_reserve;
+
+		while (prsv) {
+			struct linux_efi_memreserve *rsv;
+			u8 *p;
+			int i;
+
+			/*
+			 * Just map a full page: that is what we will get
+			 * anyway, and it permits us to map the entire entry
+			 * before knowing its size.
+			 */
+			p = early_memremap(ALIGN_DOWN(prsv, PAGE_SIZE),
+					   PAGE_SIZE);
+			if (p == NULL) {
+				pr_err("Could not map UEFI memreserve entry!\n");
+				return -ENOMEM;
+			}
+
+			rsv = (void *)(p + prsv % PAGE_SIZE);
+
 			for (i = 0; i < atomic_read(&rsv->count); i++) {
 				memblock_reserve(rsv->entry[i].base,
 						 rsv->entry[i].size);
diff --git a/include/linux/efi.h b/include/linux/efi.h
index be08518c2553..2ec2153fc12e 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -1212,6 +1212,7 @@ extern void efi_reboot(enum reboot_mode reboot_mode, const char *__unused);
 
 extern bool efi_is_table_address(unsigned long phys_addr);
 
+extern int efi_prepare_persistent_mem_reservations(u64 *lowest);
 extern int efi_apply_persistent_mem_reservations(void);
 #else
 static inline bool efi_enabled(int feature)
@@ -1232,6 +1233,12 @@ static inline bool efi_is_table_address(unsigned long phys_addr)
 	return false;
 }
 
+static inline int efi_prepare_persistent_mem_reservations(u64 *lowest)
+{
+	*lowest = U64_MAX;
+	return 0;
+}
+
 static inline int efi_apply_persistent_mem_reservations(void)
 {
 	return 0;
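
For readers who want to poke at the idea outside the kernel, the two-pass
scheme described in the commit message (first find the lowest reserved base
address, then use it as a temporary allocation ceiling) can be modelled in
plain userspace C. This is only a rough sketch under stated assumptions:
struct rsv_table, struct rsv_entry, MAX_ENTRIES, lowest_reserved_base() and
temporary_limit() are hypothetical stand-ins invented for illustration, the
list is linked by ordinary pointers rather than physical addresses, and no
early_memremap() or memblock calls are involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 4	/* hypothetical fixed capacity, for illustration only */

/* Hypothetical model of one reserved region. */
struct rsv_entry {
	uint64_t base;
	uint64_t size;
};

/*
 * Hypothetical model of one table in the memreserve list. The kernel
 * links these by physical address; a plain pointer is used here.
 */
struct rsv_table {
	const struct rsv_table *next;
	int count;
	struct rsv_entry entry[MAX_ENTRIES];
};

/*
 * First pass: walk every table and return the lowest base address seen,
 * or UINT64_MAX when the list is empty or holds no entries.
 */
static uint64_t lowest_reserved_base(const struct rsv_table *t)
{
	uint64_t lowest = UINT64_MAX;

	for (; t != NULL; t = t->next) {
		assert(t->count >= 0 && t->count <= MAX_ENTRIES);
		for (int i = 0; i < t->count; i++) {
			if (t->entry[i].base < lowest)
				lowest = t->entry[i].base;
		}
	}
	return lowest;
}

/*
 * Pick the temporary allocation ceiling: lower it only when some
 * reservation starts below the limit that is already in force.
 */
static uint64_t temporary_limit(uint64_t memory_limit,
				const struct rsv_table *t)
{
	uint64_t lowest = lowest_reserved_base(t);

	return lowest < memory_limit ? lowest : memory_limit;
}
```

In this model, early allocations would be kept below temporary_limit()
until the second pass has recorded every reservation, after which the
original limit is restored. As the note under the '---' says, nothing here
guarantees that enough memory remains below that ceiling.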