Message ID | 1455289046-21321-5-git-send-email-ard.biesheuvel@linaro.org |
---|---|
State | New |
Headers | show |
On Fri, Feb 12, 2016 at 03:57:26PM +0100, Ard Biesheuvel wrote: > Since memstart_addr is assigned relatively late in the boot code, > after generic code like DT parsing and memblock manipulation has > already occurred, we need to ensure that no __va() translation occur > until memstart_addr has been set to a meaningful value. > > So initialize memstart_addr to a value that cannot represent a valid > physical address, and BUG() if memstart_addr is referenced while it > still holds this value. Note that the > comparison against LLONG_MAX > (not ULLONG_MAX) resolves to a single tbnz instruction that performs > a conditional jump to a brk instruction that is emitted out of line. Even so, I'd imagine that having a measurable impact on system performance. Did you have a go at benchmarking this? Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On 22 February 2016 at 17:52, Will Deacon <will.deacon@arm.com> wrote: > On Fri, Feb 12, 2016 at 03:57:26PM +0100, Ard Biesheuvel wrote: >> Since memstart_addr is assigned relatively late in the boot code, >> after generic code like DT parsing and memblock manipulation has >> already occurred, we need to ensure that no __va() translation occur >> until memstart_addr has been set to a meaningful value. >> >> So initialize memstart_addr to a value that cannot represent a valid >> physical address, and BUG() if memstart_addr is referenced while it >> still holds this value. Note that the > comparison against LLONG_MAX >> (not ULLONG_MAX) resolves to a single tbnz instruction that performs >> a conditional jump to a brk instruction that is emitted out of line. > > Even so, I'd imagine that having a measurable impact on system > performance. Did you have a go at benchmarking this? > So in what kind of workload would the __pa() translation be on a hot path? If you're dealing with DMA or other things that involve physical addresses, surely, the single predicted non-taken branch instruction shouldn't hurt? _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Mon, Feb 22, 2016 at 04:52:09PM +0000, Will Deacon wrote: > On Fri, Feb 12, 2016 at 03:57:26PM +0100, Ard Biesheuvel wrote: > > Since memstart_addr is assigned relatively late in the boot code, > > after generic code like DT parsing and memblock manipulation has > > already occurred, we need to ensure that no __va() translation occur > > until memstart_addr has been set to a meaningful value. > > > > So initialize memstart_addr to a value that cannot represent a valid > > physical address, and BUG() if memstart_addr is referenced while it > > still holds this value. Note that the > comparison against LLONG_MAX > > (not ULLONG_MAX) resolves to a single tbnz instruction that performs > > a conditional jump to a brk instruction that is emitted out of line. > > Even so, I'd imagine that having a measurable impact on system > performance. Did you have a go at benchmarking this? Good point, I forgot about this discussion. We should revert this part indeed or at least make it dependent on some config option like DEBUG_VM. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On 22 February 2016 at 18:26, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Mon, Feb 22, 2016 at 04:52:09PM +0000, Will Deacon wrote: >> On Fri, Feb 12, 2016 at 03:57:26PM +0100, Ard Biesheuvel wrote: >> > Since memstart_addr is assigned relatively late in the boot code, >> > after generic code like DT parsing and memblock manipulation has >> > already occurred, we need to ensure that no __va() translation occur >> > until memstart_addr has been set to a meaningful value. >> > >> > So initialize memstart_addr to a value that cannot represent a valid >> > physical address, and BUG() if memstart_addr is referenced while it >> > still holds this value. Note that the > comparison against LLONG_MAX >> > (not ULLONG_MAX) resolves to a single tbnz instruction that performs >> > a conditional jump to a brk instruction that is emitted out of line. >> >> Even so, I'd imagine that having a measurable impact on system >> performance. Did you have a go at benchmarking this? > > Good point, I forgot about this discussion. We should revert this part > indeed or at least make it dependent on some config option like > DEBUG_VM. > OK, let me code that up _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Mon, Feb 22, 2016 at 06:17:40PM +0100, Ard Biesheuvel wrote: > On 22 February 2016 at 17:52, Will Deacon <will.deacon@arm.com> wrote: > > On Fri, Feb 12, 2016 at 03:57:26PM +0100, Ard Biesheuvel wrote: > >> Since memstart_addr is assigned relatively late in the boot code, > >> after generic code like DT parsing and memblock manipulation has > >> already occurred, we need to ensure that no __va() translation occur > >> until memstart_addr has been set to a meaningful value. > >> > >> So initialize memstart_addr to a value that cannot represent a valid > >> physical address, and BUG() if memstart_addr is referenced while it > >> still holds this value. Note that the > comparison against LLONG_MAX > >> (not ULLONG_MAX) resolves to a single tbnz instruction that performs > >> a conditional jump to a brk instruction that is emitted out of line. > > > > Even so, I'd imagine that having a measurable impact on system > > performance. Did you have a go at benchmarking this? > > So in what kind of workload would the __pa() translation be on a hot > path? If you're dealing with DMA or other things that involve physical > addresses, surely, the single predicted non-taken branch instruction > shouldn't hurt? I recall we looked at this in the early arm64 days and found a lot of memory accesses to memstart_addr but we decided to keep it as the alternatives would have been: (a) no more single Image or (b) always 4-levels page tables. You could try perf to get some statistics but, for example, most of the code that works on pages (e.g. block I/O) and needs to access a page ends up doing a kmap(page) which in turns does a __va(). You also have lots of virt_to_page() calls in sl*b, so we need to see what impact this change has. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On 22 February 2016 at 18:41, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Mon, Feb 22, 2016 at 06:17:40PM +0100, Ard Biesheuvel wrote: >> On 22 February 2016 at 17:52, Will Deacon <will.deacon@arm.com> wrote: >> > On Fri, Feb 12, 2016 at 03:57:26PM +0100, Ard Biesheuvel wrote: >> >> Since memstart_addr is assigned relatively late in the boot code, >> >> after generic code like DT parsing and memblock manipulation has >> >> already occurred, we need to ensure that no __va() translation occur >> >> until memstart_addr has been set to a meaningful value. >> >> >> >> So initialize memstart_addr to a value that cannot represent a valid >> >> physical address, and BUG() if memstart_addr is referenced while it >> >> still holds this value. Note that the > comparison against LLONG_MAX >> >> (not ULLONG_MAX) resolves to a single tbnz instruction that performs >> >> a conditional jump to a brk instruction that is emitted out of line. >> > >> > Even so, I'd imagine that having a measurable impact on system >> > performance. Did you have a go at benchmarking this? >> >> So in what kind of workload would the __pa() translation be on a hot >> path? If you're dealing with DMA or other things that involve physical >> addresses, surely, the single predicted non-taken branch instruction >> shouldn't hurt? > > I recall we looked at this in the early arm64 days and found a lot of > memory accesses to memstart_addr but we decided to keep it as the > alternatives would have been: (a) no more single Image or (b) always > 4-levels page tables. > > You could try perf to get some statistics but, for example, most of the > code that works on pages (e.g. block I/O) and needs to access a page > ends up doing a kmap(page) which in turns does a __va(). You also have > lots of virt_to_page() calls in sl*b, so we need to see what impact this > change has. > I guess virt_to_page() is only valid on linear addresses, so we could reimplement it not to use the comparison with PAGE_OFFSET that I added to __pa() for the kernmap stuff _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index c900883a3119..ae398919fb5f 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -24,6 +24,7 @@ #include <linux/compiler.h> #include <linux/const.h> #include <linux/types.h> +#include <asm/bug.h> #include <asm/sizes.h> /* @@ -133,7 +134,8 @@ extern phys_addr_t memstart_addr; /* PHYS_OFFSET - the physical address of the start of memory. */ -#define PHYS_OFFSET ({ memstart_addr; }) +#define PHYS_OFFSET \ + ({ BUG_ON(memstart_addr > LLONG_MAX); memstart_addr; }) /* the virtual base of the kernel image (minus TEXT_OFFSET) */ extern u64 kimage_vaddr; diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index ed85778b32e5..023c41f22b5b 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -48,7 +48,13 @@ #include "mm.h" -phys_addr_t memstart_addr __read_mostly = 0; +/* + * We need to be able to catch inadvertent references to memstart_addr + * that occur (potentially in generic code) before arm64_memblock_init() + * executes, which assigns it its actual value. So use a default value + * that cannot be mistaken for a real physical address. + */ +phys_addr_t memstart_addr __read_mostly = ULLONG_MAX; phys_addr_t arm64_dma_phys_limit __read_mostly; #ifdef CONFIG_BLK_DEV_INITRD
Since memstart_addr is assigned relatively late in the boot code, after generic code like DT parsing and memblock manipulation has already occurred, we need to ensure that no __va() translation occur until memstart_addr has been set to a meaningful value. So initialize memstart_addr to a value that cannot represent a valid physical address, and BUG() if memstart_addr is referenced while it still holds this value. Note that the > comparison against LLONG_MAX (not ULLONG_MAX) resolves to a single tbnz instruction that performs a conditional jump to a brk instruction that is emitted out of line. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- arch/arm64/include/asm/memory.h | 4 +++- arch/arm64/mm/init.c | 8 +++++++- 2 files changed, 10 insertions(+), 2 deletions(-) -- 2.5.0 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel