Change of TEXT_OFFSET for multi_v7_defconfig

Message ID alpine.LFD.2.11.1404161428280.980@knanqh.ubzr
State New

Commit Message

Nicolas Pitre April 16, 2014, 7:14 p.m. UTC
On Wed, 16 Apr 2014, Christopher Covington wrote:

> On 04/15/2014 06:44 AM, Daniel Thompson wrote:
> > Hi Folks
> > 
> > I've just been rebasing some of my development branches against v3.15rc1
> > and observed some boot regressions due to TEXT_OFFSET changing from
> > 0x8000 to 0x208000.
> > 
> > Now the boot regression turned out to be a fault in the JTAG boot tools I
> > was using (they had internally hardcoded TEXT_OFFSET to 0x8000 when
> > calculating what physical load address to use). I've fixed the JTAG
> > loader and my own boards now boot fine.
> 
> Your tools are not alone in being affected by this change. QEMU is considering
> changing their hard-coded value to 0x8000 [1], which I was eager to see until
> being reminded of this (that patch would still be an improvement, but not
> enough for users of new multiplatform kernels).
> 
> The boot-wrapper [2] (the default bootloader for ARM's proprietary models
> which could potentially be used on other systems) is also affected.

Why would QEMU and the ARM boot-wrapper care about the kernel 
TEXT_OFFSET value?

I may understand the desire to boot a plain uncompressed Image over JTAG, 
and in this case you are amongst a very small group of people doing so 
and therefore should know what you're doing.  In other words this 
is a minor inconvenience to a few people.

But both QEMU and the boot-wrapper should deal with zImage. That's the 
only image format whose documented load offset is guaranteed not to 
change, i.e. it can be loaded at just about any offset, as zImage knows 
how to relocate itself as needed.  Nowhere is there a guarantee that 
TEXT_OFFSET can't change.

And if you think booting zImage on ARM models is too slow, then simply 
try out CONFIG_KERNEL_LZO.

> My current thinking is that even if we temporarily removed variance (the
> jumping about) by maybe building every image with the maximum offset that any
> image could have, there would still be variance between images built before
> and after that change, and maybe also when some new platform gets added with
> an even higher offset. So if there's going to be variance, could we maybe make
> it no longer a problem?

There is already no problem with zImage.

> It seems to me that if external/uncompressed image loaders could query the
> text offset in a straightforward manner, variance between images could be
> easily dealt with. Would reading it out of ELF metadata be a reasonable
> mechanism? Could the ELF format vmlinux be a suitable general replacement for
> the raw Image?

The ELF image only has virtual addresses in it.  And the virtual address 
of the kernel may be changed independently of TEXT_OFFSET with 
CONFIG_VMSPLIT_*.

> Now at least with my current .config, the vmlinux only has virtual addresses.
> Documentation/arm/Booting says the MMU has to be off at boot time so this
> still might not be the ideal input for image loaders. Tools could hard-code
> the physical-to-virtual offset instead of the TEXT_OFFSET. Is that less likely
> to vary?

Physical offset does vary from one platform to another, so this 
particular physical-to-virtual offset is actually determined at run time 
and the kernel runtime patched during early boot -- see __fixup_pv_table 
in arch/arm/kernel/head.S.

> Or could we patch up the linker script to set zero-based ELF load
> memory addresses (LMAs) [4] so that the physical addresses are almost right,
> you just might have to add a system-specific RAM offset, perhaps pulled out of
> the device tree? If that won't work, we could generate some kind of
> vmlinux-phys with physical addresses. The latter two options might also
> simplify external debugging before __turn_mmu_on(). I like the sound of the
> LMA approach best, assuming it doesn't break existing stuff (I notice a few AT
> directives in vmlinux.lds.S). Some of this might transfer to arm64 as well.
> What do you all think?

If you really really want to get at the TEXT_OFFSET value in the 
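uncompressed image, the simplest way would be:

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index f8c08839ed..de84d0635a 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -78,6 +78,11 @@
 
 	__HEAD
 ENTRY(stext)
+
+	b	1f
+	.word	TEXT_OFFSET		@ located at a 4-byte offset in Image
+1:
+
  ARM_BE8(setend	be )			@ ensure we are in BE8 mode
 
  THUMB(	adr	r9, BSYM(1f)	)	@ Kernel is always entered in ARM.
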
This way the first word of Image would always be 0xea000000 and the 
second one would be TEXT_OFFSET.  No other kernel Image binary ever 
had 0xea000000 as its first word, so that also lets you validate whether 
or not the TEXT_OFFSET value is there.
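
A loader-side check for the two-word scheme described above might look like the following minimal Python sketch. It assumes a little-endian Image built with the proposed head.S change. The names `read_text_offset` and `BRANCH_OVER_WORD` are hypothetical and not taken from any existing tool.

```python
import struct

# ARM encoding of an unconditional branch that skips exactly one word
# (the "b 1f" placed at offset 0 by the proposed change).
BRANCH_OVER_WORD = 0xEA000000


def read_text_offset(image_bytes):
    """Return TEXT_OFFSET from a raw Image, or None if the marker is absent.

    Assumes little-endian words; a BE8 kernel would need different handling.
    """
    if len(image_bytes) < 8:
        return None
    first_word, second_word = struct.unpack_from("<II", image_bytes, 0)
    if first_word != BRANCH_OVER_WORD:
        # Older Image without the embedded value: caller must fall back.
        return None
    return second_word
```

A caller would pass the first bytes of the Image file and fall back to a configured value when the function returns None.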


Nicolas

Comments

Christopher Covington April 16, 2014, 9:08 p.m. UTC | #1
Hi Nicolas,

Thanks for your response.

On 04/16/2014 03:14 PM, Nicolas Pitre wrote:
> On Wed, 16 Apr 2014, Christopher Covington wrote:
> 
>> On 04/15/2014 06:44 AM, Daniel Thompson wrote:
>>> Hi Folks
>>>
>>> I've just been rebasing some of my development branches against v3.15rc1
>>> and observed some boot regressions due to TEXT_OFFSET changing from
>>> 0x8000 to 0x208000.
>>>
>>> Now the boot regression turned out to be fault in the JTAG boot tools I
>>> was using (it had internally hardcoded to TEXT_OFFSET to 0x8000 when
>>> calculating what physical load address to use). I've fixed the JTAG
>>> loader and my own boards now boots fine.
>>
>> Your tools are not alone in being affected by this change. QEMU is considering
>> changing their hard-coded value to 0x8000 [1], which I was eager to see until
>> being reminded of this (that patch would still be an improvement, but not
>> enough for users of new multiplatform kernels).
>>
>> The boot-wrapper [2] (the default bootloader for ARM's proprietary models
>> which could potentially be used on other systems) is also affected.
> 
> Why would QEMU and the ARM boot-wrapper care about the kernel 
> TEXT_OFFSET value?

The simulators I'm familiar with all have the equivalent of a built-in JTAG
debugger, capable of peeking at and poking memory, servicing Angel semihosting
calls, and so on. Knowledge of the TEXT_OFFSET is required for loading
non-self-uncompressing images for the same reasons as when using JTAG on real
hardware, as I understand it.

> I may understand the desire to boot a plain uncompressed Image over JTAG 
> and in this case you are amongst a very small group of people doing so 
> and therefore should be knowing what you're doing.  In other words this 
> is a minor inconvenient to a few people.

I didn't mean to imply that there is a large user base for this style of
loading, just that an approach that works across multiple tools would be nice
if change is warranted.

> But both QEMU and the boot-wrapper should deal with zImage. That's the 
> only image format with documented load offset is guaranteed not to 
> change i.e. it can be loaded at about any offset as zImage knows how to 
> relocate itself as needed.  There is nowhere a guarantee that 
> TEXT_OFFSET can't change.

QEMU definitely does support zImage and I believe it's promoted as the main
boot method. I would expect the bootwrapper to work with zImages, as it's (in
the non-semihosting case) basically just packing the kernel, device tree and
initramfs up into an ELF file that's loaded into memory by a simulator's
built-in JTAG-like loader.

> And if you think booting zImage on ARM models is too slow, then simply 
> try out CONFIG_KERNEL_LZO.

Thanks for the tip.

>> My current thinking is that even if we temporarily removed variance (the
>> jumping about) by maybe building every image with the maximum offset that any
>> image could have, there would still be variance between images built before
>> and after that change, and maybe also when some new platform gets added with
>> an even higher offset. So if there's going to be variance, could we maybe make
>> it no longer a problem?
> 
> There is already no problem with zImage.
> 
>> It seems to me that if external/uncompressed image loaders could query the
>> text offset in a straightforward manner, variance between images could be
>> easily dealt with. Would reading it out of ELF metadata be a reasonable
>> mechanism? Could the ELF format vmlinux be a suitable general replacement for
>> the raw Image?
> 
> The ELF image only has virtual addresses in it.  And the virtual address 
> of the kernel may be changed independently of TEXT_OFFSET with 
> CONFIG_VMSPLIT_*.

Do you know why this is the case? The ELF format is capable of storing
physical addresses as mentioned below.

>> Now at least with my current .config, the vmlinux only has virtual addresses.
>> Documentation/arm/Booting says the MMU has to be off at boot time so this
>> still might not be the ideal input for image loaders. Tools could hard-code
>> the physical-to-virtual offset instead of the TEXT_OFFSET. Is that less likely
>> to vary?
> 
> Physical offset does vary from one platform to another, so this 
> particular physical-to-virtual offset is actually determined at run time 
> and the kernel runtime patched during early boot -- see __fixup_pv_table 
> in arch/arm/kernel/head.S.

What I meant to ask about was variance from one kernel version and build to
the next, given a single platform. Platform-to-platform variation can probably
be abstracted where needed by the scripts controlling the external load. In
any case, CONFIG_VMSPLIT_* that you mentioned above would be an example where
it would vary in an inconvenient manner, so this approach wouldn't be an
improvement.

>> Or could we patch up the linker script to set zero-based ELF load
>> memory addresses (LMAs) [4] so that the physical addresses are almost right,
>> you just might have to add a system-specific RAM offset, perhaps pulled out of

I don't think I made this very clear, but adding the offset would happen at
load/run-time, controlled by JTAG scripts or simulator equivalent.

>> the device tree? If that won't work, we could generate some kind of
>> vmlinux-phys with physical addresses. The latter two options might also
>> simplify external debugging before __turn_mmu_on(). I like the sound of the
>> LMA approach best, assuming it doesn't break existing stuff (I notice a few AT
>> directives in vmlinux.lds.S). Some of this might transfer to arm64 as well.
>> What do you all think?
>
> If you really really want to get at the TEXT_OFFSET value in the 
> uncompressed image, the simplest way would be:
> 
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index f8c08839ed..de84d0635a 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -78,6 +78,11 @@
>  
>  	__HEAD
>  ENTRY(stext)
> +
> +	b	1f
> +	.word	TEXT_OFFSET		@ located at a 4-byte offset in Image
> +1:
> +
>   ARM_BE8(setend	be )			@ ensure we are in BE8 mode
>  
>   THUMB(	adr	r9, BSYM(1f)	)	@ Kernel is always entered in ARM.
> 
> This way the first word for Image would always be 0xea000000 and the 
> second one would be TEXT_OFFSET.  No other kernel Image binaries ever 
> had 0xea000000 as their first word so that also let you validate whether 
> or not the TEXT_OFFSET value is there.

Thank you for the suggestion. This approach also came to mind, but it would
require new documentation and tooling in the JTAG scripts or simulator
equivalent. That's another aspect of the ELF-based approaches that I
like--hopefully existing documentation and tool support could be reused.

Thanks,
Christopher
Peter Maydell April 16, 2014, 9:36 p.m. UTC | #2
On 16 April 2014 22:08, Christopher Covington <cov@codeaurora.org> wrote:
> On 04/16/2014 03:14 PM, Nicolas Pitre wrote:
>> But both QEMU and the boot-wrapper should deal with zImage. That's the
>> only image format with documented load offset is guaranteed not to
>> change i.e. it can be loaded at about any offset as zImage knows how to
>> relocate itself as needed.  There is nowhere a guarantee that
>> TEXT_OFFSET can't change.
>
> QEMU definitely does support zImage and I believe it's promoted as the main
> boot method.

Yes; we also support uImage. The code nominally handling Image
actually currently loads at 0x10000, so the set of people who actually
try to use it is obviously not very large :-)

thanks
-- PMM
Russell King - ARM Linux April 16, 2014, 10:33 p.m. UTC | #3
On Wed, Apr 16, 2014 at 05:08:35PM -0400, Christopher Covington wrote:
> What I meant to ask about was variance from one kernel version and build to
> the next, given a single platform. Platform-to-platform variation can probably
> be abstracted where needed by the scripts controlling the external load. In
> any case, CONFIG_VMSPLIT_* that you mentioned above would be an example where
> it would vary in an inconvenient manner, so this approach wouldn't be an
> improvement.

No it wouldn't.  CONFIG_VMSPLIT_* has nothing to do with where the kernel
is loaded in physical memory.  That's all about how the kernel sets up the
page tables, and where the kernel eventually expects to run in the virtual
address space.

As far as the debugger goes, it still loads the kernel at the exact same
address irrespective of what CONFIG_VMSPLIT_* setting is chosen.

The issue here is with arm-soc's single zImage project sucking in existing
platforms where there's a requirement to keep the first N kB of memory
free from use - eg, because a boot loader likes to scribble over it, or
it's in use by a DSP processor, or something similar.

Once one of those platforms is merged and enabled in the single zImage,
the offset into RAM that the kernel must be loaded has to change to
avoid clashing on those platforms.

So, there really isn't one single Kconfig option you can point at to tell
what physical RAM offset the kernel needs to be loaded at... it depends
what platforms are enabled in the kernel you're trying to boot.
Russell King - ARM Linux April 16, 2014, 10:34 p.m. UTC | #4
On Wed, Apr 16, 2014 at 10:36:11PM +0100, Peter Maydell wrote:
> On 16 April 2014 22:08, Christopher Covington <cov@codeaurora.org> wrote:
> > On 04/16/2014 03:14 PM, Nicolas Pitre wrote:
> >> But both QEMU and the boot-wrapper should deal with zImage. That's the
> >> only image format with documented load offset is guaranteed not to
> >> change i.e. it can be loaded at about any offset as zImage knows how to
> >> relocate itself as needed.  There is nowhere a guarantee that
> >> TEXT_OFFSET can't change.
> >
> > QEMU definitely does support zImage and I believe it's promoted as the main
> > boot method.
> 
> Yes; we also support uImage. The code nominally handling Image
> actually currently loads at 0x10000, so the set of people who actually
> try to use it is obviously not very large :-)

For ARM, that means precisely zero users without modification of that.
We've never supported an offset of 0x10000 in mainline kernels.
Nicolas Pitre April 16, 2014, 11:21 p.m. UTC | #5
On Wed, 16 Apr 2014, Christopher Covington wrote:

> On 04/16/2014 03:14 PM, Nicolas Pitre wrote:
> > On Wed, 16 Apr 2014, Christopher Covington wrote:
> > 
> >> It seems to me that if external/uncompressed image loaders could query the
> >> text offset in a straightforward manner, variance between images could be
> >> easily dealt with. Would reading it out of ELF metadata be a reasonable
> >> mechanism? Could the ELF format vmlinux be a suitable general replacement for
> >> the raw Image?
> > 
> > The ELF image only has virtual addresses in it.  And the virtual address 
> > of the kernel may be changed independently of TEXT_OFFSET with 
> > CONFIG_VMSPLIT_*.
> 
> Do you know why this is the case? The ELF format is capable of storing
> physical addresses as mentioned below.

We don't know at build time what the physical address of the kernel will 
be.  Back when the kernel could only boot on a single SoC family, the 
physical address was hardcoded in the kernel binary.  And with some 
special configs (running a XIP kernel is the most obvious example) it is 
still the case.  But in general we don't want to hardcode any physical 
address in order to boot the same kernel binary on as many different 
machines as possible.  Therefore the physical address is determined at 
run time in most cases and having it in the ELF file would be 
meaningless.

The virtual address for the kernel, including TEXT_OFFSET, is determined 
at build time.  And that's what the ELF file carries.
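
As a rough illustration of the relationship between the ELF virtual address and TEXT_OFFSET, here is a minimal Python sketch. It assumes a non-XIP kernel whose text starts at PAGE_OFFSET + TEXT_OFFSET and that the caller already knows which CONFIG_VMSPLIT_* option was used. The function name is hypothetical.

```python
# PAGE_OFFSET values selected by the CONFIG_VMSPLIT_* options on 32-bit ARM.
PAGE_OFFSETS = {
    "VMSPLIT_3G": 0xC0000000,
    "VMSPLIT_2G": 0x80000000,
    "VMSPLIT_1G": 0x40000000,
}


def text_offset_from_vaddr(stext_vaddr, vmsplit="VMSPLIT_3G"):
    """Recover TEXT_OFFSET from the virtual address of the kernel text start.

    Assumes the text starts at PAGE_OFFSET + TEXT_OFFSET (non-XIP kernel).
    """
    page_offset = PAGE_OFFSETS[vmsplit]
    if stext_vaddr < page_offset:
        raise ValueError("address lies below PAGE_OFFSET for this split")
    return stext_vaddr - page_offset
```

The result says nothing about the physical start of RAM, which still has to come from outside the kernel image.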

> What I meant to ask about was variance from one kernel version and build to
> the next, given a single platform. Platform-to-platform variation can probably
> be abstracted where needed by the scripts controlling the external load.

Right.  The platform script must know where physical RAM is and load the 
kernel accordingly.  In case of zImage you need to load it anywhere in 
the first 128MB of RAM.  In the uncompressed Image it is at TEXT_OFFSET 
from start of physical RAM.  
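
The two placement rules above can be sketched in Python as follows. This is only an illustration under the stated rules; the function names and the example RAM base are assumptions, not values taken from any particular platform script.

```python
ZIMAGE_WINDOW = 128 * 1024 * 1024  # zImage: anywhere in the first 128MB of RAM


def image_load_address(phys_ram_start, text_offset):
    """Physical load address for the uncompressed Image."""
    return phys_ram_start + text_offset


def zimage_address_ok(phys_ram_start, load_addr):
    """True if load_addr falls within the first 128MB of RAM."""
    return phys_ram_start <= load_addr < phys_ram_start + ZIMAGE_WINDOW
```

For a hypothetical platform with RAM at 0x80000000 and a TEXT_OFFSET of 0x208000, the uncompressed Image would go at 0x80208000.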

> In any case, CONFIG_VMSPLIT_* that you mentioned above would be an 
> example where it would vary in an inconvenient manner, so this 
> approach wouldn't be an improvement.

Well.  This config option changes the virtual address the kernel is 
going to set up for itself.  And that is going to be reflected in the 
ELF file as well.  That part is pretty machine independent and can be 
obtained easily with tools like debuggers.  You pretty much need the ELF 
file if you want to have symbolic debugging anyway.  So no issue there.

> >> Or could we patch up the linker script to set zero-based ELF load
> >> memory addresses (LMAs) [4] so that the physical addresses are almost right,
> >> you just might have to add a system-specific RAM offset, perhaps pulled out of
> 
> I don't think I made this very clear, but adding the offset would happen at
> load/run-time, controlled by JTAG scripts or simulator equivalent.

Sure.  To load and debug the code before the MMU is turned on, you need 
to know the physical address of RAM (the kernel binary doesn't carry it) 
and the value of TEXT_OFFSET (this is hardcoded in the kernel binary at 
build time).

> > If you really really want to get at the TEXT_OFFSET value in the 
> > uncompressed image, the simplest way would be:
> > 
> > diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> > index f8c08839ed..de84d0635a 100644
> > --- a/arch/arm/kernel/head.S
> > +++ b/arch/arm/kernel/head.S
> > @@ -78,6 +78,11 @@
> >  
> >  	__HEAD
> >  ENTRY(stext)
> > +
> > +	b	1f
> > +	.word	TEXT_OFFSET		@ located at a 4-byte offset in Image
> > +1:
> > +
> >   ARM_BE8(setend	be )			@ ensure we are in BE8 mode
> >  
> >   THUMB(	adr	r9, BSYM(1f)	)	@ Kernel is always entered in ARM.
> > 
> > This way the first word for Image would always be 0xea000000 and the 
> > second one would be TEXT_OFFSET.  No other kernel Image binaries ever 
> > had 0xea000000 as their first word so that also let you validate whether 
> > or not the TEXT_OFFSET value is there.
> 
> Thank you for the suggestion. This approach also came to mind, but it would
> require new documentation and tooling in the JTAG scripts or simulator
> equivalent. That's another aspect of the ELF-based approaches that I
> like--hopefully existing documentation and tool support could be reused.

The above is useful for loading the raw uncompressed Image without 
carrying the full ELF baggage.  With the ELF file we could simply store 
a symbol whose value would be TEXT_OFFSET.  But the physical start of 
RAM has to come from outside the kernel image.

But whatever you do, new documentation and tooling is required anyway if 
you want to automate this into your debugging script.


Nicolas
Rob Herring April 17, 2014, 5:11 p.m. UTC | #6
On Wed, Apr 16, 2014 at 2:14 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Wed, 16 Apr 2014, Christopher Covington wrote:
>
>> On 04/15/2014 06:44 AM, Daniel Thompson wrote:
>> > Hi Folks

[snip]

>> Or could we patch up the linker script to set zero-based ELF load
>> memory addresses (LMAs) [4] so that the physical addresses are almost right,
>> you just might have to add a system-specific RAM offset, perhaps pulled out of
>> the device tree? If that won't work, we could generate some kind of
>> vmlinux-phys with physical addresses. The latter two options might also
>> simplify external debugging before __turn_mmu_on(). I like the sound of the
>> LMA approach best, assuming it doesn't break existing stuff (I notice a few AT
>> directives in vmlinux.lds.S). Some of this might transfer to arm64 as well.
>> What do you all think?
>
> If you really really want to get at the TEXT_OFFSET value in the
> uncompressed image, the simplest way would be:
>
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index f8c08839ed..de84d0635a 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -78,6 +78,11 @@
>
>         __HEAD
>  ENTRY(stext)
> +
> +       b       1f
> +       .word   TEXT_OFFSET             @ located at a 4-byte offset in Image
> +1:
> +
>   ARM_BE8(setend        be )                    @ ensure we are in BE8 mode
>
>   THUMB(        adr     r9, BSYM(1f)    )       @ Kernel is always entered in ARM.
>
> This way the first word for Image would always be 0xea000000 and the
> second one would be TEXT_OFFSET.  No other kernel Image binaries ever
> had 0xea000000 as their first word so that also let you validate whether
> or not the TEXT_OFFSET value is there.

Better yet, we should adopt the arm64 Image header which has this and
other fields for arm Image files. We're going to have to deal with raw
Image (and Image.gz) in bootloaders for arm64, so we might as well
align things.

Rob
Christopher Covington April 17, 2014, 6:33 p.m. UTC | #7
On 04/16/2014 07:21 PM, Nicolas Pitre wrote:
> On Wed, 16 Apr 2014, Christopher Covington wrote:

>> Thank you for the suggestion. This approach also came to mind, but it would
>> require new documentation and tooling in the JTAG scripts or simulator
>> equivalent. That's another aspect of the ELF-based approaches that I
>> like--hopefully existing documentation and tool support could be reused.
> 
> The above is useful for loading the raw uncompressed Image without 
> carrying the full ELF baggage.

What exactly is the full ELF baggage? Aren't there existing mechanisms to omit
debugging symbols, for example, if size is of concern?

Thanks,
Christopher
Nicolas Pitre April 17, 2014, 7:48 p.m. UTC | #8
On Thu, 17 Apr 2014, Christopher Covington wrote:

> On 04/16/2014 07:21 PM, Nicolas Pitre wrote:
> > On Wed, 16 Apr 2014, Christopher Covington wrote:
> 
> >> Thank you for the suggestion. This approach also came to mind, but it would
> >> require new documentation and tooling in the JTAG scripts or simulator
> >> equivalent. That's another aspect of the ELF-based approaches that I
> >> like--hopefully existing documentation and tool support could be reused.
> > 
> > The above is useful for loading the raw uncompressed Image without 
> > carrying the full ELF baggage.
> 
> What exactly is the full ELF baggage? Aren't there existing mechanisms to omit
> debugging symbols, for example, if size is of concern?

Most existing bootloaders don't have the ability to parse ELF files.  
This is therefore not the typical kernel image format.  The uncompressed 
kernel image is not very typical either, but like zImage it doesn't rely 
on any parser in the bootloader.


Nicolas
Nicolas Pitre April 17, 2014, 8:06 p.m. UTC | #9
On Thu, 17 Apr 2014, Rob Herring wrote:

> On Wed, Apr 16, 2014 at 2:14 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> > On Wed, 16 Apr 2014, Christopher Covington wrote:
> >
> >> On 04/15/2014 06:44 AM, Daniel Thompson wrote:
> >> > Hi Folks
> 
> [snip]
> 
> >> Or could we patch up the linker script to set zero-based ELF load
> >> memory addresses (LMAs) [4] so that the physical addresses are almost right,
> >> you just might have to add a system-specific RAM offset, perhaps pulled out of
> >> the device tree? If that won't work, we could generate some kind of
> >> vmlinux-phys with physical addresses. The latter two options might also
> >> simplify external debugging before __turn_mmu_on(). I like the sound of the
> >> LMA approach best, assuming it doesn't break existing stuff (I notice a few AT
> >> directives in vmlinux.lds.S). Some of this might transfer to arm64 as well.
> >> What do you all think?
> >
> > If you really really want to get at the TEXT_OFFSET value in the
> > uncompressed image, the simplest way would be:
> >
> > diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> > index f8c08839ed..de84d0635a 100644
> > --- a/arch/arm/kernel/head.S
> > +++ b/arch/arm/kernel/head.S
> > @@ -78,6 +78,11 @@
> >
> >         __HEAD
> >  ENTRY(stext)
> > +
> > +       b       1f
> > +       .word   TEXT_OFFSET             @ located at a 4-byte offset in Image
> > +1:
> > +
> >   ARM_BE8(setend        be )                    @ ensure we are in BE8 mode
> >
> >   THUMB(        adr     r9, BSYM(1f)    )       @ Kernel is always entered in ARM.
> >
> > This way the first word for Image would always be 0xea000000 and the
> > second one would be TEXT_OFFSET.  No other kernel Image binaries ever
> > had 0xea000000 as their first word so that also let you validate whether
> > or not the TEXT_OFFSET value is there.
> 
> Better yet, we should adopt the arm64 Image header which has this and
> other fields for arm Image files. We're going to have to deal with raw
> Image (and Image.gz) in bootloaders for arm64, so we might as well
> align things.

We could use the same header as ARM64 if we want to add more information 
to the uncompressed kernel image.

However I really don't want to encourage the proliferation of yet 
another kernel image format on ARM32.  We've had zImage for the last 20 
years and that's what ARM32 bootloaders should support.  The 
introduction of the uImage format caused enough pain already.

Booting an uncompressed kernel image on ARM32 may be useful for some 
debugging setups.  I don't see other cases where it would be legitimate 
to break existing practices.


Nicolas
Russell King - ARM Linux April 17, 2014, 8:16 p.m. UTC | #10
On Thu, Apr 17, 2014 at 04:06:16PM -0400, Nicolas Pitre wrote:
> On Thu, 17 Apr 2014, Rob Herring wrote:
> > Better yet, we should adopt the arm64 Image header which has this and
> > other fields for arm Image files. We're going to have to deal with raw
> > Image (and Image.gz) in bootloaders for arm64, so we might as well
> > align things.
> 
> We could use the same header as ARM64 if we want to add more information 
> to the uncompressed kernel image.
> 
> However I really don't want to encourage the proliferation of yet 
> another kernel image formats on ARM32.  We've had zImage for the last 20 
> years and that's what ARM32 bootloaders should support.  The 
> introduction of the uImage format caused enough pain already.
> 
> Booting uncompressed kernel image on ARM32 may be useful for some 
> debugging setups.  I don't see other cases where it would be legitimate 
> to break existing practices.

Me neither.  We even have good enough reasons (such as the issue in this
thread to do with where the image should be placed) to no longer support
uncompressed images.  (Yes, they'll still be generated because
we need the input to compress them, but we should stop advertising them
as a make target.)
Christopher Covington April 17, 2014, 8:49 p.m. UTC | #11
On 04/17/2014 03:48 PM, Nicolas Pitre wrote:
> On Thu, 17 Apr 2014, Christopher Covington wrote:
> 
>> On 04/16/2014 07:21 PM, Nicolas Pitre wrote:
>>> On Wed, 16 Apr 2014, Christopher Covington wrote:
>>
>>>> Thank you for the suggestion. This approach also came to mind, but it would
>>>> require new documentation and tooling in the JTAG scripts or simulator
>>>> equivalent. That's another aspect of the ELF-based approaches that I
>>>> like--hopefully existing documentation and tool support could be reused.
>>>
>>> The above is useful for loading the raw uncompressed Image without 
>>> carrying the full ELF baggage.
>>
>> What exactly is the full ELF baggage? Aren't there existing mechanisms to omit
>> debugging symbols, for example, if size is of concern?
> 
> Most existing bootloaders don't have the ability to parse ELF files.  
> This is therefore not the typical kernel image format.  The uncompressed 
> kernel image is not very typical either, but like zImage it doesn't rely 
> on any parser in the bootloader.

It's not obvious to me how you reached that conclusion.

http://en.wikipedia.org/wiki/Comparison_of_boot_loaders#Technical_information

(It looks like syslinux now supports ELF as well:
http://git.kernel.org/cgit/boot/syslinux/syslinux.git/plain/com32/modules/elf.c?id=HEAD)

In any case, when performing boot debugging I'm not as interested in
traditional self-hosted bootloaders as I am external loaders, like those built
into software models (QEMU, Fast Models, etc.) or available to JTAG scripts
(OpenOCD, Trace32, etc.). These seem to generally have ELF support.

I'll play around with Jason's patch (thanks!) and see how things look in practice.

Christopher
Peter Maydell April 17, 2014, 8:54 p.m. UTC | #12
On 17 April 2014 21:49, Christopher Covington <cov@codeaurora.org> wrote:
> In any case, when performing boot debugging I'm not as interested in
> traditional self-hosted bootloaders as I am external loaders, like those built
> into software models (QEMU, Fast Models, etc.) or available to JTAG scripts
> (OpenOCD, Trace32, etc.). These seem to generally have ELF support.

FWIW ARM QEMU won't boot an ELF kernel -- the assumption is
that an ELF file is not a kernel and should just be put into RAM
and run, whereas a non-ELF file is a kernel and gets the special
setup/handling of secondary CPUs/etc that the Booting document
requires.
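
The distinction described above can be illustrated with a small Python sketch. It is not QEMU's implementation; it only shows the kind of magic-number test that separates ELF files from everything else, and the names and return strings are hypothetical.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def looks_like_elf(data):
    """True if the buffer starts with the ELF magic number."""
    return data[:4] == ELF_MAGIC


def boot_handling(data):
    """Pick a handling mode following the assumption described above."""
    if looks_like_elf(data):
        # Treated as a bare-metal program: put it into RAM and run it.
        return "plain-elf"
    # Treated as a Linux kernel: apply the Booting document's setup.
    return "linux-kernel"
```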

thanks
-- PMM
Rob Herring April 17, 2014, 9:18 p.m. UTC | #13
On Thu, Apr 17, 2014 at 3:16 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Apr 17, 2014 at 04:06:16PM -0400, Nicolas Pitre wrote:
>> On Thu, 17 Apr 2014, Rob Herring wrote:
>> > Better yet, we should adopt the arm64 Image header which has this and
>> > other fields for arm Image files. We're going to have to deal with raw
>> > Image (and Image.gz) in bootloaders for arm64, so we might as well
>> > align things.
>>
>> We could use the same header as ARM64 if we want to add more information
>> to the uncompressed kernel image.
>>
>> However I really don't want to encourage the proliferation of yet
>> another kernel image formats on ARM32.  We've had zImage for the last 20
>> years and that's what ARM32 bootloaders should support.  The
>> introduction of the uImage format caused enough pain already.
>>
>> Booting uncompressed kernel image on ARM32 may be useful for some
>> debugging setups.  I don't see other cases where it would be legitimate
>> to break existing practices.
>
> Me neither.  We even have good enough reasons (such as the issue in this
> thread to do with where the image should be placed) no longer support
> uncompressed images anymore.  (Yes, they'll still be generated because
> we need the input to compress them, but we should stop advertising them
> as a make target.)

The problem here is more than just TEXT_OFFSET changing. From what
I've heard, there are some QC chips which need much more reserved RAM
than the 2MB discussed here. Changing the TEXT_OFFSET is a hack that
doesn't scale.

A simple issue is you are now wasting 2MB of low memory on every
platform. Not such a big deal I guess. But what if more is needed?

The zImage requires that the kernel be placed at a 128M aligned
address plus TEXT_OFFSET. The v2p patching then requires the kernel to
be located within the first 16MB of RAM. So the Image can only ever be
placed at 0x8000 - 15.?MB from a 128MB aligned address. You can never
have the first 16-127MB of RAM reserved. The only way to have reserved
memory (in chunks of 16MB) is by loading an Image file directly
instead. The bootloaders will know the start of RAM and any reserved
memory size because they can simply parse DT.
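
The 128MB-alignment rule stated above can be sketched in Python as follows. It assumes the decompressor guesses the start of RAM by rounding its own address down to a 128MB boundary; the function name and example addresses are hypothetical.

```python
ALIGN_128MB_MASK = 0xF8000000  # clears the low 27 bits (128MB = 2**27)


def guessed_kernel_address(zimage_pc, text_offset):
    """Where the decompressed kernel ends up under the 128MB rule.

    zimage_pc is the address the zImage is executing from.
    """
    assumed_ram_start = zimage_pc & ALIGN_128MB_MASK
    return assumed_ram_start + text_offset
```

With a zImage running from 0x82000000 and a TEXT_OFFSET of 0x208000, the sketch places the kernel at 0x80208000, regardless of any reserved region a platform might want below that address.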

Bootloaders are going to have to change for arm64 Image support
anyway, so we should have an aligned solution here.

Rob
Russell King - ARM Linux April 17, 2014, 9:35 p.m. UTC | #14
On Thu, Apr 17, 2014 at 04:18:45PM -0500, Rob Herring wrote:
> The problem here is more than just the TEXT_OFFSET changed. From what
> I've heard, there are some QC chips which need much more reserved RAM
> than the 2MB discussed here. Changing the TEXT_OFFSET is a hack that
> doesn't scale.

You may think it's a hack, but we really can't get around this.  There
really are platforms out there where we must do this kind of stuff.  I
invite you next time you meet up to talk to Michal Simek.  There's no
way they can load the kernel at 32K into RAM.

> A simple issue is you are now wasting 2MB of low memory on every
> platform. Not such a big deal I guess. But what if more is needed?

Why do you think it's wasted in the general case?  Do you think the
first 16K is ignored by Linux?  All memory will be freed to the Linux
page allocator unless it has an explicit reservation in memblock.  So
the 2MB won't be wasted - it will be freed as before to the page
allocator.

> The zImage requires that the kernel be placed at a 128M aligned
> address plus TEXT_OFFSET. The v2p patching then requires the kernel to
> be located within the first 16MB of RAM. So the Image can only ever be
> placed at 0x8000 - 15.?MB from a 128MB aligned address. You can never
> have the first 16-127MB of RAM reserved.

Wrong.  You can have as much RAM as you want reserved, you just can't
manage it with Linux memory allocators if you go over 16MB.

Remember that the virtual address space PAGE_OFFSET...kernel corresponds
with PHYS_OFFSET...kernel.  So, if you have 16MB between PHYS_OFFSET and
the kernel, then you have 16MB between PAGE_OFFSET and the kernel.  Your
modules are looking very distant, and PCREL24 relocations become
troublesome.
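
The PCREL24 concern can be sketched numerically. This is only an illustration, assuming that modules are placed 16MB below PAGE_OFFSET and that an ARM-state branch encodes a signed 24-bit word offset (roughly +/-32MB) relative to the branch address plus 8; the addresses and the helper name are hypothetical.

```python
# Sketch: does a 24-bit PC-relative ARM branch from a module reach the kernel?
# Assumptions (not stated in the thread): modules live 16MB below PAGE_OFFSET,
# and in ARM state the PC reads as the branch address + 8.

SZ_16M = 16 << 20
BRANCH_RANGE = 32 << 20  # signed 24-bit word offset, i.e. about +/-32MB


def branch_reaches(src, dst):
    """Return True if a branch at address src can encode a jump to dst."""
    offset = dst - (src + 8)
    return -BRANCH_RANGE <= offset < BRANCH_RANGE


PAGE_OFFSET = 0xC0000000
module_addr = PAGE_OFFSET - SZ_16M  # hypothetical module location

# Kernel at PAGE_OFFSET + 0x8000, target function 4MB into the image.
near_target = PAGE_OFFSET + 0x8000 + (4 << 20)
# Kernel pushed up by a 16MB gap, target function 16MB into the image.
far_target = PAGE_OFFSET + SZ_16M + 0x8000 + SZ_16M

print(branch_reaches(module_addr, near_target))
print(branch_reaches(module_addr, far_target))
```

With the small gap the call stays within range; once the gap between PAGE_OFFSET and the kernel grows to 16MB, a call into the later part of a large image no longer fits in the branch encoding.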

> The only way to have reserved
> memory (in chunks of 16MB) is by loading an Image file directly
> instead. The bootloaders will know the start of RAM and any reserved
> memory size because they can simply parse DT.
> 
> Bootloaders are going to have to change for arm64 Image support
> anyway, so we should have an aligned solution here.

No.  You simply can't eliminate any of the above - each one has been
negotiated through quite an amount of discussion with relevant parties
and/or due to technical requirements and they just can't be magic'd
away.

Plus the ARM64 image format is different from our zImage format.  It
would make far *more* sense to align our Image format with our zImage
format so existing boot loaders which look for the zImage magic numbers
can boot plain Image files too.

Moreover, since we could *never* align zImage with the ARM64 format,
why on earth would we want to start using the ARM64 format for the
Image format?

If you say, we should just break the existing zImage format, my response
will be: who the hell are you to decide to break 20 odd years of boot
ABI in a way which *stops* platforms from booting on such a pathetic
whim.

No, this is *not* going to happen.  It is either the zImage format or
no special format at all.
Rob Herring April 18, 2014, 2:53 a.m. UTC | #15
On Thu, Apr 17, 2014 at 4:35 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Apr 17, 2014 at 04:18:45PM -0500, Rob Herring wrote:
>> The problem here is more than just the TEXT_OFFSET changed. From what
>> I've heard, there are some QC chips which need much more reserved RAM
>> than the 2MB discussed here. Changing the TEXT_OFFSET is a hack that
>> doesn't scale.
>
> You may think it's a hack, but we really can't get around this.  There
> really are platforms out there where we must do this kind of stuff.  I
> invite you next time you meet up to talk to Michal Simek.  There's no
> way they can load the kernel at 32K into RAM.

In fact, I have discussed this with him. If we're having discussions
about it, then obviously some problems remain.

>> A simple issue is you are now wasting 2MB of low memory on every
>> platform. Not such a big deal I guess. But what if more is needed?
>
> Why do you think it's wasted in the general case?  Do you think the
> first 16K is ignored by Linux?  All memory will be freed to the Linux
> page allocator unless it has an explicit reservation in memblock.  So
> the 2MB won't be wasted - it will be freed as before to the page
> allocator.

Okay, my mistake.

>> The zImage requires that the kernel be placed at a 128M aligned
>> address plus TEXT_OFFSET. The v2p patching then requires the kernel to
>> be located within the first 16MB of RAM. So the Image can only ever be
>> placed at 0x8000 - 15.?MB from a 128MB aligned address. You can never
>> have the first 16-127MB of RAM reserved.
>
> Wrong.  You can have as much RAM as you want reserved, you just can't
> manage it with Linux memory allocators if you go over 16MB.
>
> Remember that the virtual address space PAGE_OFFSET...kernel corresponds
> with PHYS_OFFSET...kernel.  So, if you have 16MB between PHYS_OFFSET and
> the kernel, then you have 16MB between PAGE_OFFSET and the kernel.  Your
> modules are looking very distant, and PCREL24 relocations become
> troublesome.

For the reasons you give here, doesn't that mean you want to have
TEXT_OFFSET be as small as possible? And (ab)using TEXT_OFFSET to
reserve 16, 32, 64MB, etc, would be a bad idea. Also, that only gives
us a compile time memory reservation.

Here's a simple test of what I was trying to point out. I took a
working kernel with TEXT_OFFSET of 0x8000 and booted it on QEMU using
the "virt" machine, whose RAM normally starts at 0x40000000. Then
varying the RAM base, I get these results:

0x40000000 - boots
0x41000000 - no output because the decompressor will still put the
Image at 0x40008000.
0x48000000 - boots

So without changing TEXT_OFFSET, you can only vary the PHYS_OFFSET in
128MB steps. For anything in between you have to use TEXT_OFFSET. Is
that really the right solution?
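
The three results above are consistent with the decompressor deriving the Image address by masking its own PC down to a 128MB boundary and adding TEXT_OFFSET, which is what it does when CONFIG_AUTO_ZRELADDR is enabled. A minimal sketch of that arithmetic, where the 0x10000 load offset of the zImage is an arbitrary, hypothetical choice:

```python
# Sketch of the address the decompressor picks for the decompressed Image
# when it masks its own PC to a 128MB boundary and adds TEXT_OFFSET.

TEXT_OFFSET = 0x8000
ZRELADDR_MASK = 0xF8000000  # clears the low 27 bits: 128MB alignment


def auto_zreladdr(pc, text_offset=TEXT_OFFSET):
    """Return the destination address computed from the running PC."""
    return (pc & ZRELADDR_MASK) + text_offset


for ram_base in (0x40000000, 0x41000000, 0x48000000):
    pc = ram_base + 0x10000  # hypothetical address the zImage runs from
    dest = auto_zreladdr(pc)
    status = "inside RAM" if dest >= ram_base else "below RAM base"
    print(hex(ram_base), hex(dest), status)
```

For a RAM base of 0x41000000 the computed destination is 0x40008000, which lies below the start of RAM, matching the silent failure reported above.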

BTW, a TEXT_OFFSET of 0x408000 or more doesn't work either due to the
limits in immediate values, but that problem could be easily fixed.

>> The only way to have reserved
>> memory (in chunks of 16MB) is by loading an Image file directly
>> instead. The bootloaders will know the start of RAM and any reserved
>> memory size because they can simply parse DT.
>>
>> Bootloaders are going to have to change for arm64 Image support
>> anyway, so we should have an aligned solution here.
>
> No.  You simply can't eliminate any of the above - each one has been
> negotiated through quite an amount of discussion with relevant parties
> and/or due to technical requirements and they just can't be magic'd
> away.
>
> Plus the ARM64 image format is different from our zImage format.  It
> would make far *more* sense to align our Image format with our zImage
> format so existing boot loaders which look for the zImage magic numbers
> can boot plain Image files too.
>
> Moreover, since we could *never* align zImage with the ARM64 format,
> why on earth would we want to start using the ARM64 format for the
> Image format?

I'm not talking about zImage. I'm talking about Image files only. The
arm64 Image header could be added to ARM Image files and that would
not hurt or change a thing for existing users. The cost is 64 bytes.

> If you say, we should just break the existing zImage format, my response
> will be: who the hell are you to decide to break 20 odd years of boot
> ABI in a way which *stops* platforms from booting on such a pathetic
> whim.

I'm not suggesting to break anything or changing existing platforms,
but how do we improve the Image format in a compatible way. If
bootloaders want to support booting Image files or vmlinux directly,
then we should support that including any compatible changes to make
things work better.

Rob
Nicolas Pitre April 18, 2014, 4:34 a.m. UTC | #16
On Thu, 17 Apr 2014, Rob Herring wrote:

> Here's a simple test of what I was trying to point out. I took a
> working kernel with TEXT_OFFSET of 0x8000 and booted it on QEMU using
> the "virt" machine, whose RAM normally starts at 0x40000000. Then
> varying the RAM base, I get these results:
> 
> 0x40000000 - boots
> 0x41000000 - no output because the decompressor will still put the
> Image at 0x40008000.

If you want this to work, you have to disable CONFIG_AUTO_ZRELADDR.

In practice there is no actual hardware with physical RAM not aligned to 
a 128MB boundary.  That's why this particular alignment was selected.

> 0x48000000 - boots
> 
> So without changing TEXT_OFFSET, you can only vary the PHYS_OFFSET in
> 128MB steps. For anything in between you have to use TEXT_OFFSET. Is
> that really the right solution?

What "solution" are you looking for?  I'm under the impression you're 
getting confused about what TEXT_OFFSET is for.

> I'm not suggesting to break anything or changing existing platforms,
> but how do we improve the Image format in a compatible way. If
> bootloaders want to support booting Image files or vmlinux directly,
> then we should support that including any compatible changes to make
> things work better.

And why would bootloaders want that?  Just to create confusion with 
the established boot protocol?

There is really not much to share between ARM32 and ARM64 bootloader 
implementation wise given the major platform initialization differences, 
so trying to consolidate the very little code to actually boot the image 
is rather futile.


Nicolas
Russell King - ARM Linux April 18, 2014, 8:41 a.m. UTC | #17
On Thu, Apr 17, 2014 at 09:53:23PM -0500, Rob Herring wrote:
> On Thu, Apr 17, 2014 at 4:35 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > No.  You simply can't eliminate any of the above - each one has been
> > negotiated through quite an amount of discussion with relevant parties
> > and/or due to technical requirements and they just can't be magic'd
> > away.
> >
> > Plus the ARM64 image format is different from our zImage format.  It
> > would make far *more* sense to align our Image format with our zImage
> > format so existing boot loaders which look for the zImage magic numbers
> > can boot plain Image files too.
> >
> > Moreover, since we could *never* align zImage with the ARM64 format,
> > why on earth would we want to start using the ARM64 format for the
> > Image format?
> 
> I'm not talking about zImage. I'm talking about Image files only. The
> arm64 Image header could be added to ARM Image files and that would
> not hurt or change a thing for existing users. The cost is 64 bytes.

No it isn't.  The cost is 64 bytes *and* user confusion with two
completely different "headers" for no reason whatsoever.

Why use the ARM64 version and then have it *block* existing boot
loaders which look for the zImage magic from being able to boot an
Image?

It's a much saner idea to use the ARM32 zImage header than to use the
ARM64 version - or nothing at all.
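
For reference, the zImage check that boot loaders perform is small. The sketch below assumes the commonly documented layout (magic 0x016f2818 at byte offset 0x24, followed by the start and end addresses) and a little-endian image; the function name and the synthetic buffer are illustrative only.

```python
import struct

# Assumed zImage header layout: magic at 0x24, then start and end addresses.
ZIMAGE_MAGIC = 0x016F2818
ZIMAGE_MAGIC_OFFSET = 0x24


def parse_zimage_header(image):
    """Return start/end from a zImage-style header, or None if no magic."""
    if len(image) < ZIMAGE_MAGIC_OFFSET + 12:
        return None
    magic, start, end = struct.unpack_from("<3I", image, ZIMAGE_MAGIC_OFFSET)
    if magic != ZIMAGE_MAGIC:
        return None
    return {"start": start, "end": end}


# Synthetic buffer standing in for the first bytes of an image file.
fake = bytearray(0x30)
struct.pack_into("<3I", fake, ZIMAGE_MAGIC_OFFSET, ZIMAGE_MAGIC, 0, 0x372244)

print(parse_zimage_header(bytes(fake)))
print(parse_zimage_header(bytes(0x30)))
```

A plain Image carrying the same three words at the same offset would pass this check unchanged, which is the compatibility argument being made here.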
Daniel Thompson April 22, 2014, 9:44 a.m. UTC | #18
On 17/04/14 21:35, Jason Gunthorpe wrote:
>>> The above is useful for loading the raw uncompressed Image without 
>>> carrying the full ELF baggage.
>>
>> What exactly is the full ELF baggage? Aren't there existing mechanisms to omit
>> debugging symbols, for example, if size is of concern?
> 
> FWIW, it is a small non-intrusive change to produce ELFs with the
> proper LMA, if it is useful for specialized tooling, here is the 3.14
> version of the patch I created (I see it needs a bit of cleanup..)
> You must also force PATCH_PHYS_VIRT off.
> 
> The ELF also has the correct entry point address, so ELF tooling can
> just jump into it, after setting the proper register values according
> to the boot protocol.

That might be a useful approach for single platform kernels but I don't
think such an approach can work for multi-arch kernels since, because
the RAM can be located anywhere in the address map, the physical load
address is platform dependent.


Daniel.
Daniel Thompson April 22, 2014, 9:53 a.m. UTC | #19
On 17/04/14 22:35, Russell King - ARM Linux wrote:
> On Thu, Apr 17, 2014 at 04:18:45PM -0500, Rob Herring wrote:
>> The problem here is more than just the TEXT_OFFSET changed. From what
>> I've heard, there are some QC chips which need much more reserved RAM
>> than the 2MB discussed here. Changing the TEXT_OFFSET is a hack that
>> doesn't scale.
> 
> You may think it's a hack, but we really can't get around this.  There
> really are platforms out there where we must do this kind of stuff.  I
> invite you next time you meet up to talk to Michal Simek.  There's no
> way they can load the kernel at 32K into RAM.

After reading this thread I have noticed that the sort order for the
textofs part of this makefile is numeric (based on textofs) rather than
alphabetic.

Is this an intentional mechanism? Certainly the result of numeric
sorting is that the kernel will rise to the highest point in memory that
suits all enabled platforms (based on the assumption that platforms are
much more likely to reserve memory right at the start of RAM than
slightly offset into it).

If so would you welcome a comment only patch explaining this?
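
To illustrate why the ordering matters: each textofs line is a conditional assignment, and the last one whose config symbol is enabled wins. The sketch below models that under assumptions; the config names are hypothetical, and only 0x8000 and 0x208000 are values taken from this thread.

```python
# Model of Makefile lines of the form "textofs-$(CONFIG_X) := value".
# A later assignment overrides an earlier one, so with the list sorted
# numerically the largest offset among the enabled platforms is kept.

TEXTOFS_RULES = [
    ("y", 0x00008000),              # default, always applied
    ("CONFIG_PLAT_A", 0x00108000),  # hypothetical platform, made-up value
    ("CONFIG_PLAT_B", 0x00208000),  # hypothetical platform
]


def resolve_textofs(rules, enabled):
    """Return the offset left in place after all assignments have run."""
    textofs = None
    for symbol, value in rules:
        if symbol == "y" or symbol in enabled:
            textofs = value  # a later ":=" replaces the earlier value
    return textofs


print(hex(resolve_textofs(TEXTOFS_RULES, set())))
print(hex(resolve_textofs(TEXTOFS_RULES, {"CONFIG_PLAT_A"})))
print(hex(resolve_textofs(TEXTOFS_RULES, {"CONFIG_PLAT_A", "CONFIG_PLAT_B"})))
```

If the list were sorted alphabetically instead, a platform with a small offset could come last and override a larger requirement from another enabled platform.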
Russell King - ARM Linux April 22, 2014, 10:07 a.m. UTC | #20
On Tue, Apr 22, 2014 at 10:53:07AM +0100, Daniel Thompson wrote:
> On 17/04/14 22:35, Russell King - ARM Linux wrote:
> > On Thu, Apr 17, 2014 at 04:18:45PM -0500, Rob Herring wrote:
> >> The problem here is more than just the TEXT_OFFSET changed. From what
> >> I've heard, there are some QC chips which need much more reserved RAM
> >> than the 2MB discussed here. Changing the TEXT_OFFSET is a hack that
> >> doesn't scale.
> > 
> > You may think it's a hack, but we really can't get around this.  There
> > really are platforms out there where we must do this kind of stuff.  I
> > invite you next time you meet up to talk to Michal Simek.  There's no
> > way they can load the kernel at 32K into RAM.
> 
> After reading this thread I have noticed that the sort order for the
> textofs part of this makefile is numeric (based on textofs) rather than
> alphabetic.

Correct.

> Is this an intentional mechanism? Certainly the result of numeric
> sorting is that the kernel will rise to the highest point in memory that
> suits all enabled platforms (based on the assumption that platforms are
> much more likely to reserve memory right at the start of RAM than
> slightly offset into it).

Also correct.

> If so would you welcome a comment only patch explaining this?

If you're willing to provide one.
Daniel Thompson April 22, 2014, 10:26 a.m. UTC | #21
On 18/04/14 05:34, Nicolas Pitre wrote:
>> I'm not suggesting to break anything or changing existing platforms,
>> but how do we improve the Image format in a compatible way. If
>> bootloaders want to support booting Image files or vmlinux directly,
>> then we should support that including any compatible changes to make
>> things work better.
> And why would bootloaders want that?  Just to create confusion with 
> the established boot protocol?

I'd say that they don't. My original concern was how the different
architectures negotiate if more than one arch wants a special text
offset, not how to write a correct bootloader.

The existing uImage files already provide sufficient information to load
the kernel regardless of the TEXT_OFFSET chosen by negotiation among the
enabled architectures.

The entry point is PAGE_OFFSET + TEXT_OFFSET and, although only
implicitly defined, the entry point cannot be set to any other value
without making a backward incompatible change to arm/Booting:
    "The boot loader is expected to call the kernel image by jumping
    directly to the first instruction of the kernel image."

Therefore providing PAGE_OFFSET remains 1G aligned and the hardware
meets the not-unreasonably-stupid test (i.e. TEXT_OFFSET < 1G) then
deriving the right value for TEXT_OFFSET is a trivial mask operation on
the entry point.
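
The mask operation can be sketched as follows. It holds only under the two assumptions stated above (PAGE_OFFSET is 1G aligned and TEXT_OFFSET is below 1G); the function name is illustrative.

```python
# Sketch: split an ELF entry point into PAGE_OFFSET and TEXT_OFFSET.
# Valid only if PAGE_OFFSET is 1G aligned and TEXT_OFFSET < 1G.

SZ_1G = 1 << 30


def split_entry(entry):
    """Return (page_offset, text_offset) derived from the entry address."""
    text_offset = entry & (SZ_1G - 1)   # low 30 bits
    page_offset = entry - text_offset   # remaining high bits
    return page_offset, text_offset


for entry in (0xC0008000, 0xC0208000):
    page_offset, text_offset = split_entry(entry)
    print(hex(entry), hex(page_offset), hex(text_offset))
```

If either assumption does not hold, the split silently yields wrong values, so a loader relying on it has no way to detect the error.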


Daniel.
Russell King - ARM Linux April 22, 2014, 10:40 a.m. UTC | #22
On Tue, Apr 22, 2014 at 11:26:53AM +0100, Daniel Thompson wrote:
> On 18/04/14 05:34, Nicolas Pitre wrote:
> >> I'm not suggesting to break anything or changing existing platforms,
> >> but how do we improve the Image format in a compatible way. If
> >> bootloaders want to support booting Image files or vmlinux directly,
> >> then we should support that including any compatible changes to make
> >> things work better.
> > And why would bootloaders want that?  Just to create confusion with 
> > the established boot protocol?
> 
> I'd say that they don't. My original concern was how the different
> architectures negotiate if more than one arch wants a special text
> offset, not how to write a correct bootloader.
> 
> The existing uImage files already provide sufficient information to load
> the kernel regardless of the TEXT_OFFSET chosen by negotiation among the
> enabled architectures.

No.  uImage merely specifies the address at which to load/execute the
zImage, and more often than not this is a step which has to be done
after kernel build as the kernel build does not have the information
to be able to generate a uImage on its own.  Also, a uImage generated
for one platform will not necessarily boot on a different platform
even though the contents of the uImage may be 100% identical apart
from the header.

This is why we're moving away from uImage, and booting zImage directly
on ARM hardware.  File formats which encode the load and execution
addresses are cumbersome when you positively don't want that information
in the file (because that information is irrelevant.)
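
As an illustration of how the addresses end up baked into the file, here is a sketch that packs and parses a legacy U-Boot image header. The layout used (64 bytes, big-endian, magic 0x27051956) is the commonly documented one; the helper names, the zeroed CRC fields, and the two RAM bases are assumptions for the example, not a working mkimage replacement.

```python
import struct

# Legacy uImage header: magic, header CRC, timestamp, data size, load address,
# entry point, data CRC, then os/arch/type/compression bytes and a 32-byte name.
UIMAGE_MAGIC = 0x27051956
UIMAGE_HEADER = struct.Struct(">7I4B32s")  # 64 bytes, big-endian


def make_header(load, entry, size=0, name=b"Linux"):
    """Build a header for illustration; CRC fields are left as zero."""
    return UIMAGE_HEADER.pack(UIMAGE_MAGIC, 0, 0, size, load, entry,
                              0, 0, 0, 0, 0, name)


def parse_uimage_header(blob):
    """Return the load and entry addresses stored in a uImage header."""
    fields = UIMAGE_HEADER.unpack_from(blob)
    if fields[0] != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage")
    return {"load": fields[4], "entry": fields[5], "size": fields[3]}


# The same payload wrapped for two hypothetical platforms whose RAM starts
# at different physical addresses needs two different headers.
for ram_base in (0x00000000, 0x80000000):
    header = make_header(ram_base + 0x8000, ram_base + 0x8000)
    print(parse_uimage_header(header))
```

The payload is identical in both cases; only the header differs, which is why the wrapping step cannot be done once by a platform-agnostic kernel build.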

> The entry point is PAGE_OFFSET + TEXT_OFFSET and, although only
> implicitly defined, the entry point cannot be set to any other value
> without making a backward incompatible change to arm/Booting:
>     "The boot loader is expected to call the kernel image by jumping
>     directly to the first instruction of the kernel image."
> 
> Therefore providing PAGE_OFFSET remains 1G aligned and the hardware
> meets the not-unreasonably-stupid test (i.e. TEXT_OFFSET < 1G) then
> deriving the right value for TEXT_OFFSET is a trivial mask operation on
> the entry point.

PAGE_OFFSET doesn't have to be 1G aligned.  As I've already pointed out
in previous replies, PAGE_OFFSET is totally irrelevant in this discussion.
PAGE_OFFSET is the *virtual* address of the RAM, and has no bearing
whatsoever on where you load the kernel image.

Even if you did mean PHYS_OFFSET and the above was a typo, the above
statement still remains false.
Daniel Thompson April 22, 2014, 11:41 a.m. UTC | #23
On 22/04/14 11:40, Russell King - ARM Linux wrote:
> On Tue, Apr 22, 2014 at 11:26:53AM +0100, Daniel Thompson wrote:
>> On 18/04/14 05:34, Nicolas Pitre wrote:
>>>> I'm not suggesting to break anything or changing existing platforms,
>>>> but how do we improve the Image format in a compatible way. If
>>>> bootloaders want to support booting Image files or vmlinux directly,
>>>> then we should support that including any compatible changes to make
>>>> things work better.
>>> And why would bootloaders want that?  Just to create confusion with 
>>> the established boot protocol?
>>
>> I'd say that they don't. My original concern was how the different
>> architectures negotiate if more than one arch wants a special text
>> offset, not how to write a correct bootloader.
>>
>> The existing uImage files already provide sufficient information to load
>> the kernel regardless of the TEXT_OFFSET chosen by negotiation among the
>> enabled architectures.
> 
> No.  uImage merely specifies the address at which to load/execute the
> zImage, and more often than not this is a step which has to be done
> after kernel build as the kernel build does not have the information
> to be able to generate a uImage on its own.  Also, a uImage generated
> for one platform will not necessarily boot on a different platform
> even though the contents of the uImage may be 100% identical apart
> from the header.

You were right about the typo but I'm afraid the location was much
earlier. Sorry! Replace uImage with the vmlinux ELF image and my last
post is not quite such nonsense.


>> The entry point is PAGE_OFFSET + TEXT_OFFSET and, although only
>> implicitly defined, the entry point cannot be set to any other value
>> without making a backward incompatible change to arm/Booting:
>>     "The boot loader is expected to call the kernel image by jumping
>>     directly to the first instruction of the kernel image."

Although this bit will probably always be nonsense.


>> Therefore providing PAGE_OFFSET remains 1G aligned and the hardware
>> meets the not-unreasonably-stupid test (i.e. TEXT_OFFSET < 1G) then
>> deriving the right value for TEXT_OFFSET is a trivial mask operation on
>> the entry point.
> 
> PAGE_OFFSET doesn't have to be 1G aligned.  As I've already pointed out
> in previous replies, PAGE_OFFSET is totally irrelevant in this discussion.
> PAGE_OFFSET is the *virtual* address of the RAM, and has no bearing
> whatsoever on where you load the kernel image.

When trying to directly load an ELF image, where by default vaddr ==
paddr, it's actually PAGE_OFFSET we're looking for. Combining that with
some platform specific knowledge about RAM (i.e. typical PHYS_OFFSET for
the platform) and we can derive sensible paddr values.

At the start of last week the loader I used assumed TEXT_OFFSET would be
0x8000 and used it to calculate PAGE_OFFSET from the ELF entry point.

So I guess what I have now is still broken, just not quite as obviously...
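
The loader arithmetic described above can be sketched like this. The PHYS_OFFSET value is a hypothetical piece of platform knowledge, and the function names are illustrative; the point is only to show how a stale TEXT_OFFSET assumption shifts the computed load address.

```python
# Sketch: derive a physical load address from an ELF virtual address,
# given an assumed TEXT_OFFSET and platform knowledge of PHYS_OFFSET.


def page_offset_from_entry(entry, assumed_text_offset):
    """PAGE_OFFSET as a loader would infer it from the ELF entry point."""
    return entry - assumed_text_offset


def load_address(vaddr, page_offset, phys_offset):
    """Translate a virtual address to the physical address to load at."""
    return vaddr - page_offset + phys_offset


PHYS_OFFSET = 0x40000000  # hypothetical start of RAM on the target
entry = 0xC0208000        # entry point of a kernel built with 0x208000

# The loader knows the real TEXT_OFFSET.
good = load_address(entry, page_offset_from_entry(entry, 0x208000), PHYS_OFFSET)
# The loader still assumes the old hardcoded 0x8000.
stale = load_address(entry, page_offset_from_entry(entry, 0x8000), PHYS_OFFSET)

print(hex(good), hex(stale))
```

With the stale assumption the image lands 2MB lower than the kernel expects, which matches the boot regression that started this thread.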
Michal Simek April 22, 2014, 2:50 p.m. UTC | #24
On 04/17/2014 10:35 PM, Jason Gunthorpe wrote:
> On Thu, Apr 17, 2014 at 02:33:43PM -0400, Christopher Covington wrote:
>> On 04/16/2014 07:21 PM, Nicolas Pitre wrote:
>>> On Wed, 16 Apr 2014, Christopher Covington wrote:
>>
>>>> Thank you for the suggestion. This approach also came to mind, but it would
>>>> require new documentation and tooling in the JTAG scripts or simulator
>>>> equivalent. That's another aspect of the ELF-based approaches that I
>>>> like--hopefully existing documentation and tool support could be reused.
>>>
>>> The above is useful for loading the raw uncompressed Image without 
>>> carrying the full ELF baggage.
>>
>> What exactly is the full ELF baggage? Aren't there existing mechanisms to omit
>> debugging symbols, for example, if size is of concern?
> 
> FWIW, it is a small non-intrusive change to produce ELFs with the
> proper LMA, if it is useful for specialized tooling, here is the 3.14
> version of the patch I created (I see it needs a bit of cleanup..)
> You must also force PATCH_PHYS_VIRT off.
> 
> The ELF also has the correct entry point address, so ELF tooling can
> just jump into it, after setting the proper register values according
> to the boot protocol.
> 
> From ca9763668eed2eaaf0c0c2640f1502c22b68a739 Mon Sep 17 00:00:00 2001
> From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
> Date: Fri, 14 Sep 2012 11:27:17 -0600
> Subject: [PATCH] [ARM] Use AT() in the linker script to create correct program
>  headers
> 
> The standard linux asm-generic/vmlinux.lds.h already supports this,
> and it seems other architectures do as well.
> 
> The goal is to create an ELF file that has correct program headers. We
> want to see the VirtAddr be the runtime address of the kernel with the
> MMU turned on, and PhysAddr be the physical load address for the section
> with no MMU.
> 
> This allows ELF based boot loaders to properly load vmlinux:
> 
> $ readelf -l vmlinux
> Entry point 0x8000
>   Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>   LOAD           0x008000 0xc0008000 0x00008000 0x372244 0x3a4310 RWE 0x8000
> 
> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
> ---
>  arch/arm/include/asm/memory.h |  2 +-
>  arch/arm/kernel/vmlinux.lds.S | 51 +++++++++++++++++++++++++------------------
>  2 files changed, 31 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 8756e4b..551e971 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -350,7 +350,7 @@ static inline __deprecated void *bus_to_virt(unsigned long x)
>  #define virt_addr_valid(kaddr)	(((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory) \
>  					&& pfn_valid(__pa(kaddr) >> PAGE_SHIFT) )
>  
> -#endif
> +#endif /* __ASSEMBLY__ */


This is an unrelated change.

>  
>  #include <asm-generic/memory_model.h>
>  
> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index 7bcee5c..15353d2 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -3,6 +3,13 @@
>   * Written by Martin Mares <mj@atrey.karlin.mff.cuni.cz>
>   */
>  
> +/* If we have a known, fixed physical load address then set LOAD_OFFSET
> +   and generate an ELF that has the physical load address in the program
> +   headers. */
> +#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
> +#define LOAD_OFFSET (PAGE_OFFSET - PLAT_PHYS_OFFSET)
> +#endif
> +
>  #include <asm-generic/vmlinux.lds.h>
>  #include <asm/cache.h>
>  #include <asm/thread_info.h>
> @@ -43,7 +50,7 @@
>  #endif
>  
>  OUTPUT_ARCH(arm)
> -ENTRY(stext)
> +ENTRY(phys_start)
>  
>  #ifndef __ARMEB__
>  jiffies = jiffies_64;
> @@ -86,11 +93,13 @@ SECTIONS
>  #else
>  	. = PAGE_OFFSET + TEXT_OFFSET;
>  #endif
> -	.head.text : {
> +	.head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
>  		_text = .;
> +		phys_start = . - LOAD_OFFSET;
>  		HEAD_TEXT
>  	}

I am not quite sure about these changes above, but Russell can comment on them.

> -	.text : {			/* Real text segment		*/
> +	/* Real text segment */
> +	.text :  AT(ADDR(.text) - LOAD_OFFSET) {

The rest is just fine. It is exactly what I wrote some months
ago when I wanted to get an ELF with correct addresses for QEMU.
It is the same as what is written in asm-generic/vmlinux.lds.h,
and ARM should use it too.

Thanks,
Michal
Jason Gunthorpe April 22, 2014, 5 p.m. UTC | #25
> > index 8756e4b..551e971 100644
> > +++ b/arch/arm/include/asm/memory.h
> > @@ -350,7 +350,7 @@ static inline __deprecated void *bus_to_virt(unsigned long x)
> >  #define virt_addr_valid(kaddr)	(((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory) \
> >  					&& pfn_valid(__pa(kaddr) >> PAGE_SHIFT) )
> >  
> > -#endif
> > +#endif /* __ASSEMBLY__ */

> This is an unrelated change.

Right, as I said it needs some cleanup :) This is a leftover from
rebasing to 3.14 - the original had to carry some small changes to
memory.h as well, but now that we have PLAT_PHYS_OFFSET that isn't
necessary.
 
> > @@ -43,7 +50,7 @@
> >  #endif
> >  
> >  OUTPUT_ARCH(arm)
> > -ENTRY(stext)
> > +ENTRY(phys_start)
> >  
> >  #ifndef __ARMEB__
> >  jiffies = jiffies_64;
> > @@ -86,11 +93,13 @@ SECTIONS
> >  #else
> >  	. = PAGE_OFFSET + TEXT_OFFSET;
> >  #endif
> > -	.head.text : {
> > +	.head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
> >  		_text = .;
> > +		phys_start = . - LOAD_OFFSET;
> >  		HEAD_TEXT
> >  	}
> 
> I am not quite sure about these changes above, but Russell can comment on them.

This is adjusting the entry point address in the ELF.

I have copied what other arches are doing and used the physical
address as the entry address (see x86, ia64).

Jason
Jason Gunthorpe April 22, 2014, 5:05 p.m. UTC | #26
On Tue, Apr 22, 2014 at 10:44:14AM +0100, Daniel Thompson wrote:
> On 17/04/14 21:35, Jason Gunthorpe wrote:
> >>> The above is useful for loading the raw uncompressed Image without 
> >>> carrying the full ELF baggage.
> >>
> >> What exactly is the full ELF baggage? Aren't there existing mechanisms to omit
> >> debugging symbols, for example, if size is of concern?
> > 
> > FWIW, it is a small non-intrusive change to produce ELFs with the
> > proper LMA, if it is useful for specialized tooling, here is the 3.14
> > version of the patch I created (I see it needs a bit of cleanup..)
> > You must also force PATCH_PHYS_VIRT off.
> > 
> > The ELF also has the correct entry point address, so ELF tooling can
> > just jump into it, after setting the proper register values according
> > to the boot protocol.
> 
> That might be a useful approach for single platform kernels but I don't
> think such an approach can work for multi-arch kernels since, because
> the RAM can be located anywhere in the address map, the physical load
> address is platform dependent.

Right, I think everyone realizes that.

What this patch does is make kernels that are built with
PLAT_PHYS_OFFSET set and CONFIG_ARM_PATCH_PHYS_VIRT disabled produce
an ELF that reflects the build configuration.
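
As a rough illustration of what the AT() expressions in the patch compute, the sketch below reproduces the readelf output quoted earlier in the thread. It assumes PAGE_OFFSET is 0xc0000000 and PLAT_PHYS_OFFSET is 0, which is what that output implies; the function name is illustrative.

```python
# Sketch: the load address (LMA) the linker script assigns to a section is
# its virtual address (VMA) minus LOAD_OFFSET, where
# LOAD_OFFSET = PAGE_OFFSET - PLAT_PHYS_OFFSET.


def lma(vma, page_offset, plat_phys_offset):
    """Return the physical load address for a given virtual address."""
    load_offset = page_offset - plat_phys_offset
    return vma - load_offset


PAGE_OFFSET = 0xC0000000       # assumed, implied by the readelf output
PLAT_PHYS_OFFSET = 0x00000000  # assumed, implied by the readelf output

virt_addr = 0xC0008000  # VirtAddr column of the LOAD segment
print(hex(lma(virt_addr, PAGE_OFFSET, PLAT_PHYS_OFFSET)))

# The same kernel on a hypothetical platform whose RAM starts at 0x80000000.
print(hex(lma(virt_addr, PAGE_OFFSET, 0x80000000)))
```

The second call shows why the result is only meaningful for a fixed PLAT_PHYS_OFFSET: a different RAM base gives a different PhysAddr for the same binary.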

Based on comments from others in the thread this is looking useful for
a variety of cases.

I'm not sure who would need to ack this patch? I can clean it up of course.

Thanks,
Jason
Russell King - ARM Linux April 22, 2014, 5:11 p.m. UTC | #27
On Tue, Apr 22, 2014 at 04:50:12PM +0200, Michal Simek wrote:
> On 04/17/2014 10:35 PM, Jason Gunthorpe wrote:
> > +/* If we have a known, fixed physical load address then set LOAD_OFFSET
> > +   and generate an ELF that has the physical load address in the program
> > +   headers. */
> > +#ifndef CONFIG_ARM_PATCH_PHYS_VIRT
> > +#define LOAD_OFFSET (PAGE_OFFSET - PLAT_PHYS_OFFSET)
> > +#endif
> > +
...
> > -	.head.text : {
> > +	.head.text : AT(ADDR(.head.text) - LOAD_OFFSET) {
> >  		_text = .;
> > +		phys_start = . - LOAD_OFFSET;
> >  		HEAD_TEXT
> >  	}
> 
> I am not quite sure about these changes above, but Russell can comment on them.

I don't think /anyone/ is seriously proposing that we should merge this
patch... because that won't happen.  It should be clear enough from the
discussion why that is, but in case it isn't, take a look above.

What is that ifdef saying?  It's saying that if you enable
ARM_PATCH_PHYS_VIRT, which is an absolute requirement for multi-platform
kernels, then you don't get the proper LMA addresses.  If you disable
it, then you do.

Put another way, if your platform is part of the multi-platform kernel
then you are *excluded* from being able to use this... unless you hack
the Kconfig, and then also provide a constant value for PHYS_OFFSET,
thereby _tying_ the kernel you built to a _single_ platform.

You can't do this _and_ have a multi-platform kernel.
Jason Gunthorpe April 22, 2014, 5:53 p.m. UTC | #28
On Tue, Apr 22, 2014 at 06:11:42PM +0100, Russell King - ARM Linux wrote:

> Put another way, if your platform is part of the multi-platform kernel
> then you are *excluded* from being able to use this... unless you hack
> the Kconfig, and then also provide a constant value for PHYS_OFFSET,
> thereby _tying_ the kernel you built to a _single_ platform.

That is exactly right. To get a fixed LMA you must commit to a
non-relocatable kernel image.

Realistically this patch would need to be accompanied by something
that makes ARM_PATCH_PHYS_VIRT optional for multiplatform based on
EXPERT or similar.

The best use case seems to be to support ELF tooling for low level
debugging activities, a non-relocatable image isn't a blocker for that
case.

Since the patch is a no-op if LOAD_OFFSET isn't set, is there a
downside I don't see?

Thanks,
Jason
Nicolas Pitre April 22, 2014, 5:55 p.m. UTC | #29
On Tue, 22 Apr 2014, Jason Gunthorpe wrote:

> On Tue, Apr 22, 2014 at 10:44:14AM +0100, Daniel Thompson wrote:
> > On 17/04/14 21:35, Jason Gunthorpe wrote:
> > >>> The above is useful for loading the raw uncompressed Image without 
> > >>> carrying the full ELF baggage.
> > >>
> > >> What exactly is the full ELF baggage? Aren't there existing mechanisms to omit
> > >> debugging symbols, for example, if size is of concern?
> > > 
> > > FWIW, it is a small non-intrusive change to produce ELFs with the
> > > proper LMA, if it is useful for specialized tooling, here is the 3.14
> > > version of the patch I created (I see it needs a bit of cleanup..)
> > > You must also force PATCH_PHYS_VIRT off.
> > > 
> > > The ELF also has the correct entry point address, so ELF tooling can
> > > just jump into it, after setting the proper register values according
> > > to the boot protocol.
> > 
> > That might be a useful approach for single platform kernels but I don't
> > think such an approach can work for multi-arch kernels since, because
> > the RAM can be located anywhere in the address map, the physical load
> > > address is platform dependent.
> 
> Right, I think everyone realizes that.
> 
> What this patch does is make kernels that are built with
> PLAT_PHYS_OFFSET set and CONFIG_ARM_PATCH_PHYS_VIRT disabled produce
> an ELF that reflects the build configuration.
> 
> Based on comments from others in the thread this is looking useful for
> a variety of cases.

Well...

We do not want people in general to have PLAT_PHYS_OFFSET defined and
CONFIG_ARM_PATCH_PHYS_VIRT disabled.  In fact a huge effort has been 
deployed to go the exact opposite way over the last few years.

There are special cases where CONFIG_ARM_PATCH_PHYS_VIRT needs to be 
turned off.  But those are specialized configurations and they should 
be the exception, not the norm.  And you should know what you're doing 
in those cases.

So I doubt it is worth complexifying the linker script for something 
that is meant to be the exception, _especially_ if this is for some 
debugging environment purposes.  You may just adjust some setting in 
your environment or do a quick kernel modification locally instead.  
And if you don't know what to modify then you're probably lacking the 
necessary qualifications to perform that kind of kernel debugging in the 
first place.

Making the patch available on a mailing list is fine.  If it is useful 
to someone else then it'll be found.  But I don't think this is useful 
upstream.


Nicolas
Russell King - ARM Linux April 22, 2014, 6:12 p.m. UTC | #30
On Tue, Apr 22, 2014 at 11:53:25AM -0600, Jason Gunthorpe wrote:
> On Tue, Apr 22, 2014 at 06:11:42PM +0100, Russell King - ARM Linux wrote:
> 
> > Put another way, if your platform is part of the multi-platform kernel
> > then you are *excluded* from being able to use this... unless you hack
> > the Kconfig, and then also provide a constant value for PHYS_OFFSET,
> > thereby _tying_ the kernel you built to a _single_ platform.
> 
> That is exactly right. To get a fixed LMA you must commit to a
> non-relocatable kernel image.
> 
> Realistically this patch would need to be accompanied by something
> that makes ARM_PATCH_PHYS_VIRT optional for multiplatform based on
> EXPERT or similar.
> 
> The best use case seems to be supporting ELF tooling for low-level
> debugging activities; a non-relocatable image isn't a blocker for that
> case.

Let's not forget that if you want to debug, it's because you've hit a
problem in the kernel you're running.  To get an ELF image you can debug,
you have to turn ARM_PATCH_PHYS_VIRT off, which changes the generated
code - which can in itself cause bugs to hide themselves.

Unfortunately, those kinds of bugs are not as rare as you might think.

> Since the patch is a no-op if LOAD_OFFSET isn't set, is there a
> downside I don't see?

It leads people into thinking that we support booting an ELF file.
We don't.
Russell King - ARM Linux April 22, 2014, 6:36 p.m. UTC | #31
On Tue, Apr 22, 2014 at 01:55:16PM -0400, Nicolas Pitre wrote:
> We do not want people in general to have PLAT_PHYS_OFFSET defined and
> CONFIG_ARM_PATCH_PHYS_VIRT disabled.  In fact a huge effort has been 
> deployed to go the exact opposite way over the last few years.
> 
> There are special cases where CONFIG_ARM_PATCH_PHYS_VIRT needs to be 
> turned off.  But those are specialized configurations and they should 
> be the exception, not the norm.  And you should know what you're doing 
> in those cases.
> 
> So I doubt it is worth complexifying the linker script for something 
> that is meant to be the exception, _especially_ if this is for some 
> debugging environment purposes.  You may just adjust some setting in 
> your environment or do a quick kernel modification locally instead.  
> And if you don't know what to modify then you're probably lacking the 
> necessary qualifications to perform that kind of kernel debugging in the 
> first place.
> 
> Making the patch available on a mailing list is fine.  If it is useful 
> to someone else then it'll be found.  But I don't think this is useful 
> upstream.

Also, let's not forget that the ELF file can be modified after the
kernel build:

$ vmlinux=your-vmlinux-file
$ newlma=lma-for-your-platform
$ arm-linux-objcopy $(
  arm-linux-objdump -h ${vmlinux} |
  grep -B1 'LOAD' |
  sed -nr "s/^[ 0-9]*[0-9] ([^ ]*).*/--change-section-lma \1+${newlma}/p") \
  ${vmlinux} ${vmlinux}-${newlma}

(It would be nice if objcopy could be told "change any section with _this_
attribute".)

The nice thing about this is that you can keep ARM_PATCH_PHYS_VIRT enabled
and not have to change the code in any way - you just fix up the headers on
the ELF file.
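
A minimal sketch of the same idea, split out so the option generation can
be checked without a toolchain. The function name lma_flags and the sample
section table are made up for illustration; the only objcopy behaviour
relied on is that --change-section-lma accepts name+val or name-val to
adjust a section's existing LMA by val.

```shell
#!/bin/sh
# Sketch: turn `objdump -h` output (on stdin) into --change-section-lma options.
# lma_flags and the sample table below are illustrative assumptions.
lma_flags() {
    # $1 = signed adjustment for every loadable section, e.g. "-0x40000000"
    grep -B1 'LOAD' |
    sed -nE "s/^[ 0-9]*[0-9] ([^ ]*).*/--change-section-lma \1$1/p"
}

# Abbreviated, made-up table in the layout printed by `objdump -h`.
sample='  0 .head.text    00000200  c0008000  c0008000  00008000  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .text         00400000  c0008200  c0008200  00008200  2**6
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .comment      00000030  00000000  00000000  00500000  2**0
                  CONTENTS, READONLY'

# Only the two LOAD sections produce an option; .comment is skipped.
printf '%s\n' "$sample" | lma_flags -0x40000000
```

With a real toolchain the output would be passed straight to objcopy, for
example: arm-linux-objcopy $(arm-linux-objdump -h vmlinux | lma_flags "$delta")
vmlinux vmlinux-out, where $delta is whatever offset suits your platform.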
Russell King - ARM Linux April 22, 2014, 6:38 p.m. UTC | #32
On Tue, Apr 22, 2014 at 08:32:10PM +0200, Arnd Bergmann wrote:
> @@ -1943,6 +1953,7 @@ config DEPRECATED_PARAM_STRUCT
>  # TEXT and BSS so we preserve their values in the config files.
>  config ZBOOT_ROM_TEXT
>  	hex "Compressed ROM boot loader base address"
> +	depends on BROKEN_MULTIPLATFORM
>  	default "0"
>  	help
>  	  The physical address at which the ROM-able zImage is to be
> @@ -1954,6 +1965,7 @@ config ZBOOT_ROM_TEXT
>  
>  config ZBOOT_ROM_BSS
>  	hex "Compressed ROM boot loader BSS address"
> +	depends on BROKEN_MULTIPLATFORM
>  	default "0"
>  	help
>  	  The base address of an area of read/write memory in the target

Please get rid of the above changes.

Patch

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index f8c08839ed..de84d0635a 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -78,6 +78,11 @@ 
 
 	__HEAD
 ENTRY(stext)
+
+	b	1f
+	.word	TEXT_OFFSET		@ located at a 4-byte offset in Image
+1:
+
  ARM_BE8(setend	be )			@ ensure we are in BE8 mode
 
  THUMB(	adr	r9, BSYM(1f)	)	@ Kernel is always entered in ARM.