ARM: force linker to use PIC veneers

Message ID	1427192184-31261-1-git-send-email-ard.biesheuvel@linaro.org
State	New
Headers	show Return-Path: <patchwork-forward+bncBDYNJBOFRECBBAPUYSUAKGQE3CHBHIQ@linaro.org> Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.50 as permitted sender) client-ip=209.85.215.50; Received-SPF: none (google.com: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org does not designate permitted sender hosts) client-ip=2001:1868:205::9; From: Ard Biesheuvel <ard.biesheuvel@linaro.org> To: linux@arm.linux.org.uk, arnd@arndb.de, linux-arm-kernel@lists.infradead.org, nico@linaro.org Subject: [PATCH] ARM: force linker to use PIC veneers Date: Tue, 24 Mar 2015 11:16:24 +0100 Message-Id: <1427192184-31261-1-git-send-email-ard.biesheuvel@linaro.org> summary: Content analysis details: (-0.7 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [209.85.212.177 listed in list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record -0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3) [209.85.212.177 listed in wl.mailspike.net] -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org

Ard Biesheuvel March 24, 2015, 10:16 a.m. UTC

When building a very large kernel, it is up to the linker to decide
when and where to insert stubs to allow calls to functions that are
out of range for the ordinary b/bl instructions.

However, since the kernel is built as a position dependent binary,
these stubs (aka veneers) may contain absolute addresses, which will
break such veneer assisted far calls performed with the MMU off.

For instance, the call from __enable_mmu() in the .head.text section
to __turn_mmu_on() in the .idmap.text section may be turned into
something like this:

c0008168 <__enable_mmu>:
c0008168:       f020 0002       bic.w   r0, r0, #2
c000816c:       f420 5080       bic.w   r0, r0, #4096
c0008170:       f000 b846       b.w     c0008200 <____turn_mmu_on_veneer>
[...]
c0008200 <____turn_mmu_on_veneer>:
c0008200:       4778            bx      pc
c0008202:       46c0            nop
c0008204:       e59fc000        ldr     ip, [pc]
c0008208:       e12fff1c        bx      ip
c000820c:       c13dfae1        teqgt   sp, r1, ror #21
[...]
c13dfae0 <__turn_mmu_on>:
c13dfae0:       4600            mov     r0, r0
[...]

After adding --pic-veneer to the LDFLAGS, the veneer is emitted like
this instead:

c0008200 <____turn_mmu_on_veneer>:
c0008200:       4778            bx      pc
c0008202:       46c0            nop
c0008204:       e59fc004        ldr     ip, [pc, #4]
c0008208:       e08fc00c        add     ip, pc, ip
c000820c:       e12fff1c        bx      ip
c0008210:       013d7d31        teqeq   sp, r1, lsr sp
c0008214:       00000000        andeq   r0, r0, r0

Note that this particular example is best addressed by moving
.head.text and .idmap.text closer together, but this issue could
potentially affect any code that needs to execute with the
MMU off.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Ard Biesheuvel March 24, 2015, 12:50 p.m. UTC | #1

On 24 March 2015 at 13:22, Dave Martin <Dave.Martin@arm.com> wrote:
> On Tue, Mar 24, 2015 at 11:16:24AM +0100, Ard Biesheuvel wrote:
>> When building a very large kernel, it is up to the linker to decide
>> when and where to insert stubs to allow calls to functions that are
>> out of range for the ordinary b/bl instructions.
>>
>> However, since the kernel is built as a position dependent binary,
>> these stubs (aka veneers) may contain absolute addresses, which will
>> break such veneer assisted far calls performed with the MMU off.
>>
>> For instance, the call from __enable_mmu() in the .head.text section
>> to __turn_mmu_on() in the .idmap.text section may be turned into
>> something like this:
>>
>> c0008168 <__enable_mmu>:
>> c0008168:       f020 0002       bic.w   r0, r0, #2
>> c000816c:       f420 5080       bic.w   r0, r0, #4096
>> c0008170:       f000 b846       b.w     c0008200 <____turn_mmu_on_veneer>
>> [...]
>> c0008200 <____turn_mmu_on_veneer>:
>> c0008200:       4778            bx      pc
>> c0008202:       46c0            nop
>> c0008204:       e59fc000        ldr     ip, [pc]
>> c0008208:       e12fff1c        bx      ip
>> c000820c:       c13dfae1        teqgt   sp, r1, ror #21
>> [...]
>> c13dfae0 <__turn_mmu_on>:
>> c13dfae0:       4600            mov     r0, r0
>> [...]
>>
>> After adding --pic-veneer to the LDFLAGS, the veneer is emitted like
>> this instead:
>>
>> c0008200 <____turn_mmu_on_veneer>:
>> c0008200:       4778            bx      pc
>> c0008202:       46c0            nop
>> c0008204:       e59fc004        ldr     ip, [pc, #4]
>> c0008208:       e08fc00c        add     ip, pc, ip
>> c000820c:       e12fff1c        bx      ip
>> c0008210:       013d7d31        teqeq   sp, r1, lsr sp
>> c0008214:       00000000        andeq   r0, r0, r0
>>
>> Note that this particular example is best addressed by moving
>> .head.text and .idmap.text closer together, but this issue could
>> potentially affect any code that needs to execute with the
>> MMU off.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>
> Although that fixes the problem, wouldn't this introduce extra potential
> overhead for every call in the kernel?
>

It does not change whether a veneer is emitted or not, it only affects
the PIC nature of it.
So the overhead is 1 additional word for the add instruction, which I
think is a small price to pay for correctness, especially considering
that someone building such a big kernel obviously does not optimize
for size.

> How many such veneers get added in the your kernel configuration, and
> how many are actually necessary (i.e., calls between MMU-off code and
> elsewhere)?
>

Very few. In addition to the example (which will be addressed in
another way regardless) there are some resume functions that get
allocated in .data, and those would need it as well. I have also
proposed b_far/bl_far macros that could be used there as well.

The primary concern is that you can't really check whether any
problematic veneers have been emitted, unless all code that may run
with the MMU off is moved to the idmap.text section.

Ard Biesheuvel March 24, 2015, 2:04 p.m. UTC | #2

On 24 March 2015 at 14:54, Dave Martin <Dave.Martin@arm.com> wrote:
> On Tue, Mar 24, 2015 at 01:50:40PM +0100, Ard Biesheuvel wrote:
>> On 24 March 2015 at 13:22, Dave Martin <Dave.Martin@arm.com> wrote:
>> > On Tue, Mar 24, 2015 at 11:16:24AM +0100, Ard Biesheuvel wrote:
>> >> When building a very large kernel, it is up to the linker to decide
>> >> when and where to insert stubs to allow calls to functions that are
>> >> out of range for the ordinary b/bl instructions.
>> >>
>> >> However, since the kernel is built as a position dependent binary,
>> >> these stubs (aka veneers) may contain absolute addresses, which will
>> >> break such veneer assisted far calls performed with the MMU off.
>> >>
>> >> For instance, the call from __enable_mmu() in the .head.text section
>> >> to __turn_mmu_on() in the .idmap.text section may be turned into
>> >> something like this:
>> >>
>> >> c0008168 <__enable_mmu>:
>> >> c0008168:       f020 0002       bic.w   r0, r0, #2
>> >> c000816c:       f420 5080       bic.w   r0, r0, #4096
>> >> c0008170:       f000 b846       b.w     c0008200 <____turn_mmu_on_veneer>
>> >> [...]
>> >> c0008200 <____turn_mmu_on_veneer>:
>> >> c0008200:       4778            bx      pc
>> >> c0008202:       46c0            nop
>> >> c0008204:       e59fc000        ldr     ip, [pc]
>> >> c0008208:       e12fff1c        bx      ip
>> >> c000820c:       c13dfae1        teqgt   sp, r1, ror #21
>> >> [...]
>> >> c13dfae0 <__turn_mmu_on>:
>> >> c13dfae0:       4600            mov     r0, r0
>> >> [...]
>> >>
>> >> After adding --pic-veneer to the LDFLAGS, the veneer is emitted like
>> >> this instead:
>> >>
>> >> c0008200 <____turn_mmu_on_veneer>:
>> >> c0008200:       4778            bx      pc
>> >> c0008202:       46c0            nop
>> >> c0008204:       e59fc004        ldr     ip, [pc, #4]
>> >> c0008208:       e08fc00c        add     ip, pc, ip
>> >> c000820c:       e12fff1c        bx      ip
>> >> c0008210:       013d7d31        teqeq   sp, r1, lsr sp
>> >> c0008214:       00000000        andeq   r0, r0, r0
>> >>
>> >> Note that this particular example is best addressed by moving
>> >> .head.text and .idmap.text closer together, but this issue could
>> >> potentially affect any code that needs to execute with the
>> >> MMU off.
>> >>
>> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> >
>> > Although that fixes the problem, wouldn't this introduce extra potential
>> > overhead for every call in the kernel?
>> >
>>
>> It does not change whether a veneer is emitted or not, it only affects
>> the PIC nature of it.
>> So the overhead is 1 additional word for the add instruction, which I
>
> You're right, I misunderstood lightly what is going on there.
>
>> think is a small price to pay for correctness, especially considering
>> that someone building such a big kernel obviously does not optimize
>> for size.
>>
>> > How many such veneers get added in the your kernel configuration, and
>> > how many are actually necessary (i.e., calls between MMU-off code and
>> > elsewhere)?
>> >
>>
>> Very few. In addition to the example (which will be addressed in
>> another way regardless) there are some resume functions that get
>> allocated in .data, and those would need it as well. I have also
>> proposed b_far/bl_far macros that could be used there as well.
>>
>> The primary concern is that you can't really check whether any
>> problematic veneers have been emitted, unless all code that may run
>> with the MMU off is moved to the idmap.text section.
>
> That's a valid argument.
>
> Come to think of it, I can't think of a good reason why we don't
> pass --use-blx to the linker for THUMB2_KERNEL.  I think that would
> at least make these sequences a bit less painful by getting rid of
> the "bx pc" stuff.
>

Yes, that would be an improvement.

> How big is your kernel?  It would be good to compare the veneer
> count with a more normal-sized kernel.
>

With all my patches applied, I can actually build allyesconfig which
produces a 74 MB zImage. But this is obviously never going to be able
to execute with the usual policy of placing zImage around 32 MB into
DRAM, and expecting it to be able to decompress. It's also not such a
meaningful benchmark since the majority of the drivers is not
appropriate for ARM.

But since I am mainly helping out Arnd with this stuff, perhaps he can
propose a suitable dotconfig for comparison?

Nicolas Pitre March 24, 2015, 3:16 p.m. UTC | #3

On Tue, 24 Mar 2015, Ard Biesheuvel wrote:

> On 24 March 2015 at 13:22, Dave Martin <Dave.Martin@arm.com> wrote:
> > How many such veneers get added in the your kernel configuration, and
> > how many are actually necessary (i.e., calls between MMU-off code and
> > elsewhere)?
> >
> 
> Very few. In addition to the example (which will be addressed in
> another way regardless) there are some resume functions that get
> allocated in .data, and those would need it as well.

What are they? I thought we removed all instances of those already.

> I have also proposed b_far/bl_far macros that could be used there as 
> well.

Could the automatic veneer insertion replace the unconditional 
b_far/bl_far usage?  The former would be preferable to the later.

Nicolas

Ard Biesheuvel March 24, 2015, 3:23 p.m. UTC | #4

On 24 March 2015 at 16:16, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Tue, 24 Mar 2015, Ard Biesheuvel wrote:
>
>> On 24 March 2015 at 13:22, Dave Martin <Dave.Martin@arm.com> wrote:
>> > How many such veneers get added in the your kernel configuration, and
>> > how many are actually necessary (i.e., calls between MMU-off code and
>> > elsewhere)?
>> >
>>
>> Very few. In addition to the example (which will be addressed in
>> another way regardless) there are some resume functions that get
>> allocated in .data, and those would need it as well.
>
>
> What are they? I thought we removed all instances of those already.
>
>> I have also proposed b_far/bl_far macros that could be used there as
>> well.
>
> Could the automatic veneer insertion replace the unconditional
> b_far/bl_far usage?  The former would be preferable to the later.
>

Agreed. I am not entirely sure why those functions don't get a veneer.
Perhaps simply because .data is not annotated as executable?

Frankly, I don't really understand the purpose of putting those in
.data in the first place. but if they need to remain there, I can try
to figure out how to get the linker to emit veneers for those as well.

Nicolas Pitre March 24, 2015, 3:23 p.m. UTC | #5

On Tue, 24 Mar 2015, Ard Biesheuvel wrote:

> On 24 March 2015 at 14:54, Dave Martin <Dave.Martin@arm.com> wrote:
> > How big is your kernel?  It would be good to compare the veneer
> > count with a more normal-sized kernel.
> >
> 
> With all my patches applied, I can actually build allyesconfig which
> produces a 74 MB zImage. But this is obviously never going to be able
> to execute with the usual policy of placing zImage around 32 MB into
> DRAM, and expecting it to be able to decompress.

It should work nevertheless.  zImage knows the size of the decompressed 
kernel and will figure out that it needs to relocate itself above the 
final kernel area first.  It might overwrite your DTB and initrd in the 
process if they're not sufficiently high in memory though.

Nicolas

Ard Biesheuvel March 24, 2015, 3:41 p.m. UTC | #6

On 24 March 2015 at 16:16, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Tue, 24 Mar 2015, Ard Biesheuvel wrote:
>
>> On 24 March 2015 at 13:22, Dave Martin <Dave.Martin@arm.com> wrote:
>> > How many such veneers get added in the your kernel configuration, and
>> > how many are actually necessary (i.e., calls between MMU-off code and
>> > elsewhere)?
>> >
>>
>> Very few. In addition to the example (which will be addressed in
>> another way regardless) there are some resume functions that get
>> allocated in .data, and those would need it as well.
>
>
> What are they? I thought we removed all instances of those already.
>

It is cpu_resume() and then some mach specific sleep.S files that do a
relative branch to cpu_resume().
But it seems cpu_resume() could easily be moved to .text, it's just
one PC relative load that needs to be rewritten

>> I have also proposed b_far/bl_far macros that could be used there as
>> well.
>
> Could the automatic veneer insertion replace the unconditional
> b_far/bl_far usage?  The former would be preferable to the later.
>
>
> Nicolas

Nicolas Pitre March 24, 2015, 3:49 p.m. UTC | #7

On Tue, 24 Mar 2015, Ard Biesheuvel wrote:

> On 24 March 2015 at 16:16, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> > On Tue, 24 Mar 2015, Ard Biesheuvel wrote:
> >
> >> On 24 March 2015 at 13:22, Dave Martin <Dave.Martin@arm.com> wrote:
> >> > How many such veneers get added in the your kernel configuration, and
> >> > how many are actually necessary (i.e., calls between MMU-off code and
> >> > elsewhere)?
> >> >
> >>
> >> Very few. In addition to the example (which will be addressed in
> >> another way regardless) there are some resume functions that get
> >> allocated in .data, and those would need it as well.
> >
> >
> > What are they? I thought we removed all instances of those already.
> >
> >> I have also proposed b_far/bl_far macros that could be used there as
> >> well.
> >
> > Could the automatic veneer insertion replace the unconditional
> > b_far/bl_far usage?  The former would be preferable to the later.
> >
> 
> Agreed. I am not entirely sure why those functions don't get a veneer.
> Perhaps simply because .data is not annotated as executable?
> 
> Frankly, I don't really understand the purpose of putting those in
> .data in the first place. but if they need to remain there, I can try
> to figure out how to get the linker to emit veneers for those as well.

I'm guilty of introducing the first instance of code in .data back in 
... hrm ... 1998 or so.  I wasn't as experienced in ARM assembly back 
then and the resume code needed to fetch its context data while the MMU 
was off. So the easy way was simply to put the code next to the data 
block and get its address using adr.

These days we know how to write code to obtain position independent 
memory addresses at run time. One such example is commit b4e6153704 
where the bl relocation also exceeded its range.  But I thought I had 
converted all those instances already.

Nicolas

Ard Biesheuvel March 24, 2015, 3:51 p.m. UTC | #8

On 24 March 2015 at 16:49, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Tue, 24 Mar 2015, Ard Biesheuvel wrote:
>
>> On 24 March 2015 at 16:16, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
>> > On Tue, 24 Mar 2015, Ard Biesheuvel wrote:
>> >
>> >> On 24 March 2015 at 13:22, Dave Martin <Dave.Martin@arm.com> wrote:
>> >> > How many such veneers get added in the your kernel configuration, and
>> >> > how many are actually necessary (i.e., calls between MMU-off code and
>> >> > elsewhere)?
>> >> >
>> >>
>> >> Very few. In addition to the example (which will be addressed in
>> >> another way regardless) there are some resume functions that get
>> >> allocated in .data, and those would need it as well.
>> >
>> >
>> > What are they? I thought we removed all instances of those already.
>> >
>> >> I have also proposed b_far/bl_far macros that could be used there as
>> >> well.
>> >
>> > Could the automatic veneer insertion replace the unconditional
>> > b_far/bl_far usage?  The former would be preferable to the later.
>> >
>>
>> Agreed. I am not entirely sure why those functions don't get a veneer.
>> Perhaps simply because .data is not annotated as executable?
>>
>> Frankly, I don't really understand the purpose of putting those in
>> .data in the first place. but if they need to remain there, I can try
>> to figure out how to get the linker to emit veneers for those as well.
>
> I'm guilty of introducing the first instance of code in .data back in
> ... hrm ... 1998 or so.  I wasn't as experienced in ARM assembly back
> then and the resume code needed to fetch its context data while the MMU
> was off. So the easy way was simply to put the code next to the data
> block and get its address using adr.
>
> These days we know how to write code to obtain position independent
> memory addresses at run time. One such example is commit b4e6153704
> where the bl relocation also exceeded its range.  But I thought I had
> converted all those instances already.
>

Yes, there are still some instances left of that, including the core
cpu_resume() itself.
(and they all copy the same comment :-))

Nicolas Pitre March 24, 2015, 4:42 p.m. UTC | #9

On Tue, 24 Mar 2015, Ard Biesheuvel wrote:

> When building a very large kernel, it is up to the linker to decide
> when and where to insert stubs to allow calls to functions that are
> out of range for the ordinary b/bl instructions.
> 
> However, since the kernel is built as a position dependent binary,
> these stubs (aka veneers) may contain absolute addresses, which will
> break such veneer assisted far calls performed with the MMU off.
> 
> For instance, the call from __enable_mmu() in the .head.text section
> to __turn_mmu_on() in the .idmap.text section may be turned into
> something like this:
> 
> c0008168 <__enable_mmu>:
> c0008168:       f020 0002       bic.w   r0, r0, #2
> c000816c:       f420 5080       bic.w   r0, r0, #4096
> c0008170:       f000 b846       b.w     c0008200 <____turn_mmu_on_veneer>
> [...]
> c0008200 <____turn_mmu_on_veneer>:
> c0008200:       4778            bx      pc
> c0008202:       46c0            nop
> c0008204:       e59fc000        ldr     ip, [pc]
> c0008208:       e12fff1c        bx      ip
> c000820c:       c13dfae1        teqgt   sp, r1, ror #21
> [...]
> c13dfae0 <__turn_mmu_on>:
> c13dfae0:       4600            mov     r0, r0
> [...]
> 
> After adding --pic-veneer to the LDFLAGS, the veneer is emitted like
> this instead:
> 
> c0008200 <____turn_mmu_on_veneer>:
> c0008200:       4778            bx      pc
> c0008202:       46c0            nop
> c0008204:       e59fc004        ldr     ip, [pc, #4]
> c0008208:       e08fc00c        add     ip, pc, ip
> c000820c:       e12fff1c        bx      ip
> c0008210:       013d7d31        teqeq   sp, r1, lsr sp
> c0008214:       00000000        andeq   r0, r0, r0
> 
> Note that this particular example is best addressed by moving
> .head.text and .idmap.text closer together, but this issue could
> potentially affect any code that needs to execute with the
> MMU off.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Acked-by: Nicolas Pitre <nico@linaro.org>

This is of no consequence when the kernel is still small enough not to 
require any veneers i.e. 99.99% of the time.  And, as I said, I prefer 
relying on the linker to insert those veneers as needed rather than 
using some unconditional far call macros.


> ---
>  arch/arm/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> index eb7bb511f853..ae5b33527f32 100644
> --- a/arch/arm/Makefile
> +++ b/arch/arm/Makefile
> @@ -13,7 +13,7 @@
>  # Ensure linker flags are correct
>  LDFLAGS		:=
>  
> -LDFLAGS_vmlinux	:=-p --no-undefined -X
> +LDFLAGS_vmlinux	:=-p --no-undefined -X --pic-veneer
>  ifeq ($(CONFIG_CPU_ENDIAN_BE8),y)
>  LDFLAGS_vmlinux	+= --be8
>  LDFLAGS_MODULE	+= --be8
> -- 
> 1.8.3.2
> 
>

Ard Biesheuvel March 24, 2015, 5:35 p.m. UTC | #10

On 24 March 2015 at 14:54, Dave Martin <Dave.Martin@arm.com> wrote:
> On Tue, Mar 24, 2015 at 01:50:40PM +0100, Ard Biesheuvel wrote:
>> On 24 March 2015 at 13:22, Dave Martin <Dave.Martin@arm.com> wrote:
>> > On Tue, Mar 24, 2015 at 11:16:24AM +0100, Ard Biesheuvel wrote:
>> >> When building a very large kernel, it is up to the linker to decide
>> >> when and where to insert stubs to allow calls to functions that are
>> >> out of range for the ordinary b/bl instructions.
>> >>
>> >> However, since the kernel is built as a position dependent binary,
>> >> these stubs (aka veneers) may contain absolute addresses, which will
>> >> break such veneer assisted far calls performed with the MMU off.
>> >>
>> >> For instance, the call from __enable_mmu() in the .head.text section
>> >> to __turn_mmu_on() in the .idmap.text section may be turned into
>> >> something like this:
>> >>
>> >> c0008168 <__enable_mmu>:
>> >> c0008168:       f020 0002       bic.w   r0, r0, #2
>> >> c000816c:       f420 5080       bic.w   r0, r0, #4096
>> >> c0008170:       f000 b846       b.w     c0008200 <____turn_mmu_on_veneer>
>> >> [...]
>> >> c0008200 <____turn_mmu_on_veneer>:
>> >> c0008200:       4778            bx      pc
>> >> c0008202:       46c0            nop
>> >> c0008204:       e59fc000        ldr     ip, [pc]
>> >> c0008208:       e12fff1c        bx      ip
>> >> c000820c:       c13dfae1        teqgt   sp, r1, ror #21
>> >> [...]
>> >> c13dfae0 <__turn_mmu_on>:
>> >> c13dfae0:       4600            mov     r0, r0
>> >> [...]
>> >>
>> >> After adding --pic-veneer to the LDFLAGS, the veneer is emitted like
>> >> this instead:
>> >>
>> >> c0008200 <____turn_mmu_on_veneer>:
>> >> c0008200:       4778            bx      pc
>> >> c0008202:       46c0            nop
>> >> c0008204:       e59fc004        ldr     ip, [pc, #4]
>> >> c0008208:       e08fc00c        add     ip, pc, ip
>> >> c000820c:       e12fff1c        bx      ip
>> >> c0008210:       013d7d31        teqeq   sp, r1, lsr sp
>> >> c0008214:       00000000        andeq   r0, r0, r0
>> >>
>> >> Note that this particular example is best addressed by moving
>> >> .head.text and .idmap.text closer together, but this issue could
>> >> potentially affect any code that needs to execute with the
>> >> MMU off.
>> >>
>> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> >
>> > Although that fixes the problem, wouldn't this introduce extra potential
>> > overhead for every call in the kernel?
>> >
>>
>> It does not change whether a veneer is emitted or not, it only affects
>> the PIC nature of it.
>> So the overhead is 1 additional word for the add instruction, which I
>
> You're right, I misunderstood lightly what is going on there.
>
>> think is a small price to pay for correctness, especially considering
>> that someone building such a big kernel obviously does not optimize
>> for size.
>>
>> > How many such veneers get added in the your kernel configuration, and
>> > how many are actually necessary (i.e., calls between MMU-off code and
>> > elsewhere)?
>> >
>>
>> Very few. In addition to the example (which will be addressed in
>> another way regardless) there are some resume functions that get
>> allocated in .data, and those would need it as well. I have also
>> proposed b_far/bl_far macros that could be used there as well.
>>
>> The primary concern is that you can't really check whether any
>> problematic veneers have been emitted, unless all code that may run
>> with the MMU off is moved to the idmap.text section.
>
> That's a valid argument.
>
> Come to think of it, I can't think of a good reason why we don't
> pass --use-blx to the linker for THUMB2_KERNEL.  I think that would
> at least make these sequences a bit less painful by getting rid of
> the "bx pc" stuff.
>

Well, passing --use-blx doesn't seem to have the desired effect. I
still get these

c0181ba8 <___raw_spin_lock_veneer>:
c0181ba8:       4778            bx      pc
c0181baa:       46c0            nop                     ; (mov r8, r8)
[...]

$ size vmlinux
   text   data    bss    dec    hex filename
30038344 13868020 9613876 53520240 330a770 vmlinux

$ grep veneer System.map |wc -l
2211

Note that this is a Thumb2 kernel, and we may have some diminishing
returns here due to the reduced reach of the Thumb2 b/bl instructions.
Also, loading modules is going to be difficult without my PLT patch

Ard Biesheuvel March 24, 2015, 11:30 p.m. UTC | #11

On 25 March 2015 at 00:25, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Mar 24, 2015 at 04:51:07PM +0100, Ard Biesheuvel wrote:
>> Yes, there are still some instances left of that, including the core
>> cpu_resume() itself.
>> (and they all copy the same comment :-))
>
> That's because I follow the ethos of "why write complex code when
> simple and obvious code will do" and we'd done this for the original
> SA11x0 CPU resume code which formed (to some extent) the basis of the
> consolidated version we have today.
>
> At the time cpu_resume() was consolidated (which is actually only in
> the last five years or so) there wasn't the requirement to have large
> kernels link, so rewriting it as relatively complex implementation
> would've been damned stupid and a needless change in the consolidation
> effort.
>

Oh I totally agree. The simplest code that does the job correctly
should be preferred in almost all cases. But some of the copies are
rather shallow, e.g., the s5pv210 code sits in .data, but does nothing
except branch to cpu_resume(). Here the pattern is copied for no
obvious reason.

Ard Biesheuvel March 26, 2015, 12:20 p.m. UTC | #12

On 26 March 2015 at 12:36, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Wed, Mar 25, 2015 at 10:46:14AM +0000, Dave Martin wrote:
>> Hmmm, you seem to be right.
>>
>> Thumb has no bx <label> instruction, and veneers introduced by ld are
>> always ARM code, so this looks tricky to avoid without patching ld.
>>
>> As you observe, this only impacts large kernels anyway.
>>
>> So,
>>
>> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
>
> Someone needs to check what the result is like for older CPUs too.
>

Any specific concerns? The --pic-veneer switch dates back to 2009, so
it's been around for a while

Ard Biesheuvel March 26, 2015, 12:24 p.m. UTC | #13

On 26 March 2015 at 13:22, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Thu, Mar 26, 2015 at 01:20:16PM +0100, Ard Biesheuvel wrote:
>> On 26 March 2015 at 12:36, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Wed, Mar 25, 2015 at 10:46:14AM +0000, Dave Martin wrote:
>> >> Hmmm, you seem to be right.
>> >>
>> >> Thumb has no bx <label> instruction, and veneers introduced by ld are
>> >> always ARM code, so this looks tricky to avoid without patching ld.
>> >>
>> >> As you observe, this only impacts large kernels anyway.
>> >>
>> >> So,
>> >>
>> >> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
>> >
>> > Someone needs to check what the result is like for older CPUs too.
>> >
>>
>> Any specific concerns? The --pic-veneer switch dates back to 2009, so
>> it's been around for a while
>
> BX.  Not every CPU supports BX.
>

The bx is used for the Thumb -> ARM switch inside the veneer only, so
if you build for ARM these won't appear in the output

Ard Biesheuvel March 26, 2015, 1:05 p.m. UTC | #14

On 26 March 2015 at 13:53, Dave P Martin <Dave.Martin@arm.com> wrote:
> On Thu, Mar 26, 2015 at 12:24:01PM +0000, Ard Biesheuvel wrote:
>> On 26 March 2015 at 13:22, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > On Thu, Mar 26, 2015 at 01:20:16PM +0100, Ard Biesheuvel wrote:
>
> [...]
>
>> >> Any specific concerns? The --pic-veneer switch dates back to 2009, so
>> >> it's been around for a while
>> >
>> > BX.  Not every CPU supports BX.
>> >
>>
>> The bx is used for the Thumb -> ARM switch inside the veneer only, so
>> if you build for ARM these won't appear in the output
>
> Agreed that's what's _supposed_ to happen.
>
> Note that --use-blx is an override and might introduce BXs into a kernel
> that otherwise doesn't have them.  But in any case, this option doesn't
> show a benefit for the scenario under discussion, hence my dropping the
> request (conditional on THUMB2_KERNEL in any case).
>

Actually, i am not quite sure. I hadn't realised that there is no
branch-without-link counterpart of 'blx <label>' so even if I still
potted a number of bx pc/nop sequences, I can't really say if there
were quite as many as before.

> Ard, can you should build a couple of !THUMB2_KERNEL/pre-v5
> configurations with your change and check, just to be sure?
>

Will do.

Ard Biesheuvel March 27, 2015, 12:01 a.m. UTC | #15

On 26 March 2015 at 14:19, Dave P Martin <Dave.Martin@arm.com> wrote:
> On Thu, Mar 26, 2015 at 01:05:09PM +0000, Ard Biesheuvel wrote:
>> On 26 March 2015 at 13:53, Dave P Martin <Dave.Martin@arm.com> wrote:
>> > On Thu, Mar 26, 2015 at 12:24:01PM +0000, Ard Biesheuvel wrote:
>> >> On 26 March 2015 at 13:22, Russell King - ARM Linux
>> >> <linux@arm.linux.org.uk> wrote:
>> >> > On Thu, Mar 26, 2015 at 01:20:16PM +0100, Ard Biesheuvel wrote:
>> >
>> > [...]
>> >
>> >> >> Any specific concerns? The --pic-veneer switch dates back to 2009, so
>> >> >> it's been around for a while
>> >> >
>> >> > BX.  Not every CPU supports BX.
>> >> >
>> >>
>> >> The bx is used for the Thumb -> ARM switch inside the veneer only, so
>> >> if you build for ARM these won't appear in the output
>> >
>> > Agreed that's what's _supposed_ to happen.
>> >
>> > Note that --use-blx is an override and might introduce BXs into a kernel
>> > that otherwise doesn't have them.  But in any case, this option doesn't
>> > show a benefit for the scenario under discussion, hence my dropping the
>> > request (conditional on THUMB2_KERNEL in any case).
>> >
>>
>> Actually, i am not quite sure. I hadn't realised that there is no
>> branch-without-link counterpart of 'blx <label>' so even if I still
>> potted a number of bx pc/nop sequences, I can't really say if there
>> were quite as many as before.
>
> Where not out of range, the compiler should just branch to the
> destination, with no veneer.  This would apply to optimised tail-
> calls, for example -- but with an ordinary branch, not BX.  The
> linker would only ever fix it up as B{L}X for one of two reasons:
> either the destination is in a different instruction set, or
> Thumb code is branching to a veneer (veneers are always ARM).
>
> THUMB2_KERNEL generates a kernel which is very nearly 100% Thumb
> code, so BX should be very rare when veneers are not involved.
>
>> > Ard, can you should build a couple of !THUMB2_KERNEL/pre-v5
>> > configurations with your change and check, just to be sure?
>> >
>>
>> Will do.
>>
>

ARM kernel with --pic-veneer:
=============================
$ readelf -A vmlinux
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "4T"
  Tag_CPU_arch: v4T
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-1
  [...]

$ size vmlinux
   text   data    bss    dec    hex filename
38287123 13496224 9589540 61372887 3a879d7 vmlinux

$ grep -c veneer System.map
493

Veneers look like this:

c247d53c <__sys_write_veneer>:
c247d53c:       e59fc000        ldr     ip, [pc]
c247d540:       e08ff00c        add     pc, pc, ip
c247d544:       fdd3aa8c        .word   0xfdd3aa8c

c247d548 <__init_rt_rq_veneer>:
c247d548:       e59fc000        ldr     ip, [pc]
c247d54c:       e08ff00c        add     pc, pc, ip
c247d550:       fdbf9600        .word   0xfdbf9600

c247d554 <__read_boot_clock_veneer>:
c247d554:       e59fc000        ldr     ip, [pc]
c247d558:       e08ff00c        add     pc, pc, ip
c247d55c:       fdb91310        .word   0xfdb91310

Thumb2 kernel with --pic-veneer
===============================
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "7-A"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Application
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  [...]

$ size vmlinux
   text   data    bss    dec    hex filename
29353740 13644128 9614052 52611920 322cb50 vmlinux

$ grep -c veneer System.map
1928

Veneers look like this:

c014f9d8 <____down_interruptible_veneer>:
c014f9d8:       e59fc004        ldr     ip, [pc, #4]
c014f9dc:       e08fc00c        add     ip, pc, ip
c014f9e0:       e12fff1c        bx      ip
c014f9e4:       0128de45        .word   0x0128de45

c014f9e8 <___raw_write_unlock_veneer>:
c014f9e8:       4778            bx      pc
c014f9ea:       46c0            nop
c014f9ec:       e59fc004        ldr     ip, [pc, #4]
c014f9f0:       e08fc00c        add     ip, pc, ip
c014f9f4:       e12fff1c        bx      ip
c014f9f8:       012901e1        .word   0x012901e1

c014f9fc <____mutex_unlock_slowpath_veneer>:
c014f9fc:       e59fc004        ldr     ip, [pc, #4]
c014fa00:       e08fc00c        add     ip, pc, ip
c014fa04:       e12fff1c        bx      ip
c014fa08:       0128d8e1        .word   0x0128d8e1

$ arm-linux-gnueabihf-objdump -d vmlinux |grep -E bx\\s+pc -c
319

This number does not change when adding --use-blx, so there is probably not
much point in adding it.

ARM: force linker to use PIC veneers

Commit Message

Comments

Patch