From patchwork Fri Jun 19 12:53:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marek Bykowski X-Patchwork-Id: 242682 List-Id: U-Boot discussion From: marek.bykowski at gmail.com (marek.bykowski at gmail.com) Date: Fri, 19 Jun 2020 14:53:32 +0200 Subject: [PATCHv2] armv8: MMU: Mark code memory Executable, any other Non-Executable In-Reply-To: <20200619112409.27523-1-marek.bykowski@gmail.com> References: <20200619112409.27523-1-marek.bykowski@gmail.com> Message-ID: <20200619125332.27726-1-marek.bykowski@gmail.com> From: Marek Bykowski If the location the ARM CPU is accessing is executable (translation table descriptor Execute-Never attribute bit cleared) then the ARM CPU fetches a number of instructions from that location all at the same time. For example, Cortex-A57 can source up to 128 bits per fetch depending on alignment. If the CPU mispredicts to the Execute-Never region, it creates the memory fault but it actually never uses the instructions mispredicted. The CPU branches away elsewhere. So, as long as we program the MMU correctly these mispredictions will only affect the performance. However if we fail programming so and the instruction fetch logic goes mispredict to non-instruction memory it may eventually perturb it, eg. corrupt the FIFO, or the control registers, load the unified cache the data side memory system hits into subsequently. U-Boot adheres into attributing the device regions to Execute-Never but it actually fails doing so for data regions. Data as well as Device Regions should be Execute-Never. This patch enables attributing data memory regions to Non-Executable, and code region to Executable, additionally to Read-Only. Read-Only ensures the code region is only Readable resulting in Instruction Abort, Permission Fault exception on a write. To use the updated attributes the DDR memory of interest should be Non-Executable, it is by default Read-Write Access Permission as well with an example as follows: static struct mm_region axxia_mem_map[] = { { .virt = 0x0UL, /* DDR */ .phys = 0x0UL, .size = 0x400000000ULL, .attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) | PTE_BLOCK_INNER_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN }, { .virt = AXXIA_CCN_512_BASE, .phys = AXXIA_CCN_512_BASE, .size = AXXIA_CCN_512_SIZE, .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN }, { .virt = AXI_MMAP_BASE, .phys = AXI_MMAP_BASE, .size = AXI_MMAP_SIZE, .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN }, { .virt = AXI_PERIPH_BASE, .phys = AXI_PERIPH_BASE, .size = AXI_PERIPH_SIZE, .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) | PTE_BLOCK_NON_SHARE | PTE_BLOCK_PXN | PTE_BLOCK_UXN }, { /* List terminator */ 0, } }; add_text_map() routine in cache_v8.c locates then the code region in it and modifies its attributes to Executable, Read-Only. The HW Debugger views the Memory Map set then as: EL2N:0x00000000-0x3FD35FFF NP:0x00000000-0x3FD35FFF Normal RW C S XN EL2N:0x3FD36000-0x3FD86FFF NP:0x3FD36000-0x3FD86FFF Normal RO C S EL2N:0x3FD87000-0x3FFFFFFFF NP:0x3FD87000-0x3FFFFFFFF Normal RW C S XN EL2N:0x400000000-0x3FFFFFFFFF EL2N:0x4000000000-0x403FFFFFFF NP:0x4000000000-0x403FFFFFFF Device-nGnRnE RW NC XN EL2N:0x4040000000-0x7FFFFFFFFF EL2N:0x8000000000-0x803FFFFFFF NP:0x8000000000-0x803FFFFFFF Device-nGnRnE RW NC XN EL2N:0x8040000000-0x807FFFFFFF EL2N:0x8080000000-0x80BFFFFFFF NP:0x8080000000-0x80BFFFFFFF Device-nGnRnE RW NC XN EL2N:0x80C0000000-0xFFFFFFFFFF where: C, NC is Cacheable, Non-Cacheable respectively, S, NS - Shareable and Non-Shareable, XN - Execute-Never, if XN isn't present it is Executable. Shareability of the Device is always treated as Outer Shareable, regardless of the attributes, therefore the Shareability for the Device Regions is not mentioned here. Signed-off-by: Marek Bykowski --- Patch v1 -> v2: I have only changed the description of the commit as incorrectly address the Shareability domain there. arch/arm/cpu/armv8/cache_v8.c | 21 +++++++++++++++++++++ arch/arm/cpu/armv8/u-boot.lds | 6 ++++-- arch/arm/include/asm/armv8/mmu.h | 6 ++++++ arch/arm/lib/sections.c | 1 + include/asm-generic/sections.h | 1 + 5 files changed, 33 insertions(+), 2 deletions(-) diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c index c1a08fb4ac..cb25fc0c8a 100644 --- a/arch/arm/cpu/armv8/cache_v8.c +++ b/arch/arm/cpu/armv8/cache_v8.c @@ -11,6 +11,7 @@ #include #include #include +#include DECLARE_GLOBAL_DATA_PTR; @@ -361,6 +362,24 @@ __weak u64 get_page_table_size(void) return size; } +__weak void add_text_map(void) +{ + if (!(gd->flags & GD_FLG_RELOC)) + return; + + /* Text is always XN=0, read-only region. */ + struct mm_region text = { + .virt = (unsigned long)__image_copy_start, + .phys = (unsigned long)__image_copy_start, + .size = (unsigned long)__text_end - (unsigned long)__image_copy_start, + .attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) | + PTE_BLOCK_INNER_SHARE | + PTE_BLOCK_AP_RO + }; + + add_map(&text); +} + void setup_pgtables(void) { int i; @@ -378,6 +397,8 @@ void setup_pgtables(void) /* Now add all MMU table entries one after another to the table */ for (i = 0; mem_map[i].size || mem_map[i].attrs; i++) add_map(&mem_map[i]); + + add_text_map(); } static void setup_all_pgtables(void) diff --git a/arch/arm/cpu/armv8/u-boot.lds b/arch/arm/cpu/armv8/u-boot.lds index 2554980595..20f98ae74d 100644 --- a/arch/arm/cpu/armv8/u-boot.lds +++ b/arch/arm/cpu/armv8/u-boot.lds @@ -20,8 +20,8 @@ SECTIONS #endif . = 0x00000000; - . = ALIGN(8); - .text : + /* Align .text to the page size */ + .text ALIGN(0x1000): { *(.__image_copy_start) CPUDIR/start.o (.text*) @@ -39,6 +39,8 @@ SECTIONS .text_rest : { *(.text*) + . = NEXT(0x1000); + KEEP(*(.__text_end)) } #ifdef CONFIG_ARMV8_PSCI diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h index 4a573208df..4bd15c7e08 100644 --- a/arch/arm/include/asm/armv8/mmu.h +++ b/arch/arm/include/asm/armv8/mmu.h @@ -58,6 +58,12 @@ */ #define PTE_BLOCK_MEMTYPE(x) ((x) << 2) #define PTE_BLOCK_NS (1 << 5) +/* + * AP[1] bit is ignored by hardware and is + * treated as if it was One in EL2/EL3 + */ +#define PTE_BLOCK_AP_RO (1 << 7) +#define PTE_BLOCK_AP_RW (0 << 7) #define PTE_BLOCK_NON_SHARE (0 << 8) #define PTE_BLOCK_OUTER_SHARE (2 << 8) #define PTE_BLOCK_INNER_SHARE (3 << 8) diff --git a/arch/arm/lib/sections.c b/arch/arm/lib/sections.c index 3bb2d8468c..a76f2d3049 100644 --- a/arch/arm/lib/sections.c +++ b/arch/arm/lib/sections.c @@ -18,6 +18,7 @@ * aliasing warnings. */ +char __text_end[0] __attribute__((section(".__text_end"))); char __bss_start[0] __attribute__((section(".__bss_start"))); char __bss_end[0] __attribute__((section(".__bss_end"))); char __image_copy_start[0] __attribute__((section(".__image_copy_start"))); diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h index 17a31ec788..1487e16432 100644 --- a/include/asm-generic/sections.h +++ b/include/asm-generic/sections.h @@ -70,6 +70,7 @@ extern void _start(void); */ #ifdef CONFIG_ARM +extern char __text_end[]; extern char __bss_start[]; extern char __bss_end[]; extern char __image_copy_start[];