Message ID | 20200914095706.3985-1-ardb@kernel.org |
---|---|
Headers | show |
Series | ARM: use adr_l/ldr_l macros for PC-relative references | expand |
On Mon, 14 Sep 2020, Ard Biesheuvel wrote: > This is a respin of the adr_l/ldr_l code I wrote some years ago in the > context of my KASLR proof of concept for 32-bit ARM. > > A new use case came up, in the form of Clang, which does not implement > the 'adrl' pseudo-instruction in its assembler, and so for PC-relative > references that don't fit into a ARM adr instruction, we need something > else. Patch #2 addresses an actual Clang build issue of this nature, by > replacing an occurrence of adrl with adr_l. > > I have included my existing cleanup patches that were built on top of the > adr_l macro, which replace several occurrences of open coded arithmetic to > calculate runtime addresses based on link time virtual addresses stored > in literals. > > Note that all of these patches with the exception of #2 were reviewed or > acked by Nico before, but given that this was a while ago (and the fact Certainly it must have been, as I didn't remember much of it. > that neither of us work for Linaro anymore), I have dropped these. Note > that only patch #1 deviates significantly from the last version that I > sent out, the remaining ones were just freshened up (and their commit > logs slightly expanded). Reviewed-by: Nicolas Pitre <nico@fluxnic.net> > Cc: Russell King <linux@armlinux.org.uk> > Cc: Linus Walleij <linus.walleij@linaro.org> > Cc: Nicolas Pitre <nico@fluxnic.net> > Cc: Nick Desaulniers <ndesaulniers@google.com> > Cc: Stefan Agner <stefan@agner.ch> > Cc: Peter Smith <Peter.Smith@arm.com> > Cc: Marc Zyngier <maz@kernel.org> > Cc: Will Deacon <will@kernel.org> > > Ard Biesheuvel (12): > ARM: assembler: introduce adr_l, ldr_l and str_l macros > ARM: efistub: replace adrl pseudo-op with adr_l macro invocation > ARM: module: add support for place relative relocations > ARM: head-common.S: use PC-relative insn sequence for __proc_info > ARM: head-common.S: use PC-relative insn sequence for idmap creation > ARM: head.S: use PC-relative insn sequence for secondary_data > ARM: kernel: use relative references for UP/SMP alternatives > ARM: head: use PC-relative insn sequence for __smp_alt > ARM: sleep.S: use PC-relative insn sequence for > sleep_save_sp/mpidr_hash > ARM: head.S: use PC-relative insn sequences for __fixup_pv_table > ARM: head.S: use PC relative insn sequence to calculate PHYS_OFFSET > ARM: kvm: replace open coded VA->PA calculations with adr_l call > > arch/arm/boot/compressed/head.S | 18 +--- > arch/arm/include/asm/assembler.h | 88 ++++++++++++++++++- > arch/arm/include/asm/elf.h | 5 ++ > arch/arm/include/asm/processor.h | 2 +- > arch/arm/kernel/head-common.S | 22 ++--- > arch/arm/kernel/head.S | 90 +++++--------------- > arch/arm/kernel/hyp-stub.S | 27 +++--- > arch/arm/kernel/module.c | 20 ++++- > arch/arm/kernel/sleep.S | 19 ++--- > 9 files changed, 159 insertions(+), 132 deletions(-) > > -- > 2.17.1 > >
On Mon, Sep 14, 2020 at 09:35:41AM -0400, Nicolas Pitre wrote: > On Mon, 14 Sep 2020, Ard Biesheuvel wrote: > > > When using the new adr_l/ldr_l/str_l macros to refer to external symbols > > from modules, the linker may emit place relative ELF relocations that > > need to be fixed up by the module loader. So add support for these. > > Just wondering if that capability requirement should be added to the > module signature somehow...? > > Maybe not. The MODULE_ARCH_VERMAGIC definition only contains things that > are configurable for a given build. It doesn't need to be. If a module contains a relocation we don't know how to handle, it will fail to load.
On Mon, 14 Sep 2020 at 12:57, Ard Biesheuvel <ardb@kernel.org> wrote: > > Like arm64, ARM supports position independent code sequences that > produce symbol references with a greater reach than the ordinary > adr/ldr instructions. Since on ARM, the adrl pseudo-instruction is > only supported in ARM mode (and not at all when using Clang), having > a adr_l macro like we do on arm64 is useful, and increases symmetry > as well. > > Currently, we use open coded instruction sequences involving literals > and arithmetic operations. Instead, we can use movw/movt pairs on v7 > CPUs, circumventing the D-cache entirely. > > E.g., on v7+ CPUs, we can emit a PC-relative reference as follows: > > movw <reg>, #:lower16:<sym> - (1f + 8) > movt <reg>, #:upper16:<sym> - (1f + 8) > 1: add <reg>, <reg>, pc > > For older CPUs, we can emit the literal into a subsection, allowing it > to be emitted out of line while retaining the ability to perform > arithmetic on label offsets. > > E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows: > > ldr <reg>, 2f > 1: add <reg>, <reg>, pc > .subsection 1 > 2: .long <sym> - (1b + 8) > .previous > > This is allowed by the assembler because, unlike ordinary sections, > subsections are combined into a single section in the object file, and > so the label references are not true cross-section references that are > visible as relocations. (Subsections have been available in binutils > since 2004 at least, so they should not cause any issues with older > toolchains.) > > So use the above to implement the macros mov_l, adr_l, ldr_l and str_l, > all of which will use movw/movt pairs on v7 and later CPUs, and use > PC-relative literals otherwise. > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org> > --- > arch/arm/include/asm/assembler.h | 84 ++++++++++++++++++++ > 1 file changed, 84 insertions(+) > > diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h > index feac2c8b86f2..39e972eaaa3f 100644 > --- a/arch/arm/include/asm/assembler.h > +++ b/arch/arm/include/asm/assembler.h > @@ -494,4 +494,88 @@ THUMB( orr \reg , \reg , #PSR_T_BIT ) > #define _ASM_NOKPROBE(entry) > #endif > > + .macro __adldst_l, op, reg, sym, tmp, c > + .if __LINUX_ARM_ARCH__ < 7 > + ldr\c \tmp, .La\@ > + .subsection 1 > + .align 2 > +.La\@: .long \sym - .Lpc\@ > + .previous > + .else > + .ifnb \c > + THUMB( ittt \c ) > + .endif > + movw\c \tmp, #:lower16:\sym - .Lpc\@ > + movt\c \tmp, #:upper16:\sym - .Lpc\@ > + .endif > + > +#ifndef CONFIG_THUMB2_KERNEL > + .set .Lpc\@, . + 8 // PC bias > + .ifc \op, add > + add\c \reg, \tmp, pc > + .else > + \op\c \reg, [\tmp, pc] This should be \op\c \reg, [pc, \tmp] > + .endif > +#else > +.Lb\@: add\c \tmp, \tmp, pc > + /* > + * In Thumb-2 builds, the PC bias depends on whether we are currently > + * emitting into a .arm or a .thumb section. The size of the add opcode > + * above will be 2 bytes when emitting in Thumb mode and 4 bytes when > + * emitting in ARM mode, so let's use this to account for the bias. > + */ > + .set .Lpc\@, . + (. - .Lb\@) > + > + .ifnc \op, add > + \op\c \reg, [\tmp] > + .endif > +#endif > + .endm > + > + /* > + * mov_l - move a constant value or [relocated] address into a register > + */ > + .macro mov_l, dst:req, imm:req > + .if __LINUX_ARM_ARCH__ < 7 > + ldr \dst, =\imm > + .else > + movw \dst, #:lower16:\imm > + movt \dst, #:upper16:\imm > + .endif > + .endm > + > + /* > + * adr_l - adr pseudo-op with unlimited range > + * > + * @dst: destination register > + * @sym: name of the symbol > + * @cond: conditional opcode suffix > + */ > + .macro adr_l, dst:req, sym:req, cond > + __adldst_l add, \dst, \sym, \dst, \cond > + .endm > + > + /* > + * ldr_l - ldr <literal> pseudo-op with unlimited range > + * > + * @dst: destination register > + * @sym: name of the symbol > + * @cond: conditional opcode suffix > + */ > + .macro ldr_l, dst:req, sym:req, cond > + __adldst_l ldr, \dst, \sym, \dst, \cond > + .endm > + > + /* > + * str_l - str <literal> pseudo-op with unlimited range > + * > + * @src: source register > + * @sym: name of the symbol > + * @tmp: mandatory scratch register > + * @cond: conditional opcode suffix > + */ > + .macro str_l, src:req, sym:req, tmp:req, cond > + __adldst_l str, \src, \sym, \tmp, \cond > + .endm > + > #endif /* __ASM_ASSEMBLER_H__ */ > -- > 2.17.1 >
On Tue, Sep 15, 2020 at 12:35 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Mon, 14 Sep 2020 at 12:57, Ard Biesheuvel <ardb@kernel.org> wrote: > > > > diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h > > index feac2c8b86f2..39e972eaaa3f 100644 > > --- a/arch/arm/include/asm/assembler.h > > +++ b/arch/arm/include/asm/assembler.h > > @@ -494,4 +494,88 @@ THUMB( orr \reg , \reg , #PSR_T_BIT ) > > #define _ASM_NOKPROBE(entry) > > #endif > > > > + .macro __adldst_l, op, reg, sym, tmp, c > > + .if __LINUX_ARM_ARCH__ < 7 > > + ldr\c \tmp, .La\@ > > + .subsection 1 > > + .align 2 > > +.La\@: .long \sym - .Lpc\@ > > + .previous > > + .else > > + .ifnb \c > > + THUMB( ittt \c ) > > + .endif > > + movw\c \tmp, #:lower16:\sym - .Lpc\@ > > + movt\c \tmp, #:upper16:\sym - .Lpc\@ > > + .endif > > + > > +#ifndef CONFIG_THUMB2_KERNEL > > + .set .Lpc\@, . + 8 // PC bias > > + .ifc \op, add > > + add\c \reg, \tmp, pc > > + .else > > + \op\c \reg, [\tmp, pc] > > This should be > > \op\c \reg, [pc, \tmp] That looks better. I had tested this series yesterday and wasn't able to build with either toolchains. Was going to ask if I was using the wrong base or if there was an issue with my version binutils. All good now, I'll try to hammer on this a bit.
On Mon, Sep 14, 2020 at 2:57 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > This is a respin of the adr_l/ldr_l code I wrote some years ago in the > context of my KASLR proof of concept for 32-bit ARM. > > A new use case came up, in the form of Clang, which does not implement > the 'adrl' pseudo-instruction in its assembler, and so for PC-relative > references that don't fit into a ARM adr instruction, we need something > else. Patch #2 addresses an actual Clang build issue of this nature, by > replacing an occurrence of adrl with adr_l. > > I have included my existing cleanup patches that were built on top of the > adr_l macro, which replace several occurrences of open coded arithmetic to > calculate runtime addresses based on link time virtual addresses stored > in literals. > > Note that all of these patches with the exception of #2 were reviewed or > acked by Nico before, but given that this was a while ago (and the fact > that neither of us work for Linaro anymore), I have dropped these. Note > that only patch #1 deviates significantly from the last version that I > sent out, the remaining ones were just freshened up (and their commit > logs slightly expanded). Tested-by: Nick Desaulniers <ndesaulniers@google.com> Thanks for the series, Ard. I was able to compile and boot the following with this series (and the fixup to 01/12 applied): $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make defconfig $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 defconfig $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 aspeed_g5_defconfig $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 multi_v5_defconfig (So ARM v7/GCC, ARM v5,6,7/LLVM). (Technically, the v6 is not booting, but it's not related to this series and fails to boot without the series as well. Our CI on -next is red for an unrelated issue which is masking the regression). I was also able to build+boot the defconfig with v5 and v7 with some configs disabled and a few hacks with LLVM_IAS=1. This series allowed me to get further in the build/testing, and I have a few new bugs to go chase. If anyone's interested: https://github.com/ClangBuiltLinux/linux/issues/1154 https://github.com/ClangBuiltLinux/linux/issues/1155 so we're still a handful of bugs away from LLVM_IAS=1 with ARCH=arm, but we're steadily chipping away at it, and this series is a big help. Lest it look like there's only kernel fixes in this area, Jian's https://reviews.llvm.org/D69411 recently was a big help, specifically for ARCH=arm LLVM_IAS=1. -- Thanks, ~Nick Desaulniers
On Tue, 15 Sep 2020 at 22:32, Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Mon, Sep 14, 2020 at 2:57 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > This is a respin of the adr_l/ldr_l code I wrote some years ago in the > > context of my KASLR proof of concept for 32-bit ARM. > > > > A new use case came up, in the form of Clang, which does not implement > > the 'adrl' pseudo-instruction in its assembler, and so for PC-relative > > references that don't fit into a ARM adr instruction, we need something > > else. Patch #2 addresses an actual Clang build issue of this nature, by > > replacing an occurrence of adrl with adr_l. > > > > I have included my existing cleanup patches that were built on top of the > > adr_l macro, which replace several occurrences of open coded arithmetic to > > calculate runtime addresses based on link time virtual addresses stored > > in literals. > > > > Note that all of these patches with the exception of #2 were reviewed or > > acked by Nico before, but given that this was a while ago (and the fact > > that neither of us work for Linaro anymore), I have dropped these. Note > > that only patch #1 deviates significantly from the last version that I > > sent out, the remaining ones were just freshened up (and their commit > > logs slightly expanded). > > Tested-by: Nick Desaulniers <ndesaulniers@google.com> > > Thanks for the series, Ard. I was able to compile and boot the > following with this series (and the fixup to 01/12 applied): > > $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make defconfig > $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 defconfig > $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 aspeed_g5_defconfig > $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=1 multi_v5_defconfig > > (So ARM v7/GCC, ARM v5,6,7/LLVM). (Technically, the v6 is not > booting, but it's not related to this series and fails to boot without > the series as well. Our CI on -next is red for an unrelated issue > which is masking the regression). > Excellent, thanks for testing. Do you have any coverage for Thumb2 builds as well? (CONFIG_THUMB2_KERNEL=y) > I was also able to build+boot the defconfig with v5 and v7 with some > configs disabled and a few hacks with LLVM_IAS=1. This series allowed > me to get further in the build/testing, and I have a few new bugs to > go chase. If anyone's interested: > https://github.com/ClangBuiltLinux/linux/issues/1154 > https://github.com/ClangBuiltLinux/linux/issues/1155 > so we're still a handful of bugs away from LLVM_IAS=1 with ARCH=arm, > but we're steadily chipping away at it, and this series is a big help. > Lest it look like there's only kernel fixes in this area, Jian's > https://reviews.llvm.org/D69411 recently was a big help, specifically > for ARCH=arm LLVM_IAS=1. > -- > Thanks, > ~Nick Desaulniers
On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > Excellent, thanks for testing. Do you have any coverage for Thumb2 > builds as well? (CONFIG_THUMB2_KERNEL=y) Ah, right, manually testing ARCH=arm defconfig with CONFIG_THUMB2_KERNEL enabled via menuconfig: Builds and boots for clang. (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u for an unrelated issue). For GCC, I observe: arch/arm/vfp/vfphw.o: in function `vfp_support_entry': (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against symbol `vfp_kmode_exception' defined in .text.unlikely section in arch/arm/vfp/vfpmodule.o There doesn't seem to be a config option to work around that for now. Though it's not caused by your series; if I drop your series, I can still reproduce. Is there a different config I should be testing rather than defconfig+manual testing, like one of the existing configs? Looks like only milbeaut_m10v_defconfig enable THUMB2 by default. I should add that to my CI and kernelCI for coverage with clang. https://github.com/ClangBuiltLinux/continuous-integration/issues/94#issuecomment-693030376 milbeaut_m10v_defconfig builds with your series for clang. Looks like that config may be currently broken for GCC? Looks like it doesn't boot with Clang, so I'll need to debug that. Again, both of the two past sentences are regardless of your series. So your series still LGTM. -- Thanks, ~Nick Desaulniers
On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2 > > builds as well? (CONFIG_THUMB2_KERNEL=y) > > Ah, right, manually testing ARCH=arm defconfig with > CONFIG_THUMB2_KERNEL enabled via menuconfig: > > Builds and boots for clang. > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u > for an unrelated issue). > > For GCC, I observe: > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry': > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against > symbol `vfp_kmode_exception' defined in .text.unlikely section in > arch/arm/vfp/vfpmodule.o > Interesting. And this is using ld.bfd ? > There doesn't seem to be a config option to work around that for now. > Though it's not caused by your series; if I drop your series, I can > still reproduce. > > Is there a different config I should be testing rather than > defconfig+manual testing, like one of the existing configs? Looks like > only milbeaut_m10v_defconfig enable THUMB2 by default. I should add > that to my CI and kernelCI for coverage with clang. > https://github.com/ClangBuiltLinux/continuous-integration/issues/94#issuecomment-693030376 > > milbeaut_m10v_defconfig > builds with your series for clang. Looks like that config may be > currently broken for GCC? > Looks like it doesn't boot with Clang, so I'll need to debug that. > Again, both of the two past sentences are regardless of your series. > So your series still LGTM. Cheers. I usually build multi_v7_defconfig with/wihout CONFIG_LPAE x with/without CONFIG_THUMB2_KERNEL (LPAE affects the size of dma_addr_t, and uses a different page table format, so it is a useful bit to flick in testing)
On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2 > > > builds as well? (CONFIG_THUMB2_KERNEL=y) > > > > Ah, right, manually testing ARCH=arm defconfig with > > CONFIG_THUMB2_KERNEL enabled via menuconfig: > > > > Builds and boots for clang. > > > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u > > for an unrelated issue). > > > > For GCC, I observe: > > > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry': > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against > > symbol `vfp_kmode_exception' defined in .text.unlikely section in > > arch/arm/vfp/vfpmodule.o > > > > Interesting. And this is using ld.bfd ? $ arm-linux-gnueabihf-ld --version GNU ld (GNU Binutils for Debian) 2.34 .text.unlikely reminds me of the sections created when profiling data is present. That's obviously not the case here, so maybe there's other ways this section can be created by GCC? I may have added a patch recently for placing this section explicitly, which was a prerequisite for Kees' series explicitly placing all sections. https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/ Maybe that can be improved? IIRC, LLD is able to emit range extension thunks for these cases, but BFD does not.
On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > > > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2 > > > > builds as well? (CONFIG_THUMB2_KERNEL=y) > > > > > > Ah, right, manually testing ARCH=arm defconfig with > > > CONFIG_THUMB2_KERNEL enabled via menuconfig: > > > > > > Builds and boots for clang. > > > > > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u > > > for an unrelated issue). > > > > > > For GCC, I observe: > > > > > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry': > > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against > > > symbol `vfp_kmode_exception' defined in .text.unlikely section in > > > arch/arm/vfp/vfpmodule.o > > > > > > > Interesting. And this is using ld.bfd ? > > $ arm-linux-gnueabihf-ld --version > GNU ld (GNU Binutils for Debian) 2.34 > > .text.unlikely reminds me of the sections created when profiling data > is present. That's obviously not the case here, so maybe there's > other ways this section can be created by GCC? I may have added a > patch recently for placing this section explicitly, which was a > prerequisite for Kees' series explicitly placing all sections. > https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/ > Maybe that can be improved? > > IIRC, LLD is able to emit range extension thunks for these cases, but > BFD does not. ld.bfd will usually emit veneers for branches that turn out to be out of range in the final link stage. Does the following help? diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 4fcff9f59947..f1468702fbc9 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -82,6 +82,7 @@ ENTRY(vfp_support_entry) ldr r3, [sp, #S_PSR] @ Neither lazy restore nor FP exceptions and r3, r3, #MODE_MASK @ are supported in kernel mode teq r3, #USR_MODE +THUMB( it ne ) bne vfp_kmode_exception @ Returns through lr VFPFMRX r1, FPEXC @ Is the VFP enabled?
On Wed, Sep 16, 2020 at 1:45 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > > On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > > > > > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2 > > > > > builds as well? (CONFIG_THUMB2_KERNEL=y) > > > > > > > > Ah, right, manually testing ARCH=arm defconfig with > > > > CONFIG_THUMB2_KERNEL enabled via menuconfig: > > > > > > > > Builds and boots for clang. > > > > > > > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u > > > > for an unrelated issue). > > > > > > > > For GCC, I observe: > > > > > > > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry': > > > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against > > > > symbol `vfp_kmode_exception' defined in .text.unlikely section in > > > > arch/arm/vfp/vfpmodule.o > > > > > > > > > > Interesting. And this is using ld.bfd ? > > > > $ arm-linux-gnueabihf-ld --version > > GNU ld (GNU Binutils for Debian) 2.34 > > > > .text.unlikely reminds me of the sections created when profiling data > > is present. That's obviously not the case here, so maybe there's > > other ways this section can be created by GCC? I may have added a > > patch recently for placing this section explicitly, which was a > > prerequisite for Kees' series explicitly placing all sections. > > https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/ > > Maybe that can be improved? > > > > IIRC, LLD is able to emit range extension thunks for these cases, but > > BFD does not. > > ld.bfd will usually emit veneers for branches that turn out to be out > of range in the final link stage. > > Does the following help? > > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S > index 4fcff9f59947..f1468702fbc9 100644 > --- a/arch/arm/vfp/vfphw.S > +++ b/arch/arm/vfp/vfphw.S > @@ -82,6 +82,7 @@ ENTRY(vfp_support_entry) > ldr r3, [sp, #S_PSR] @ Neither lazy restore nor FP exceptions > and r3, r3, #MODE_MASK @ are supported in kernel mode > teq r3, #USR_MODE > +THUMB( it ne ) > bne vfp_kmode_exception @ Returns through lr > > VFPFMRX r1, FPEXC @ Is the VFP enabled? Yes, it does! I'm curious why though? I've seen the "it prefixes" before (is that what they're called?), but I don't recall what they're for. How come that solves this issue? (Feel free to use my tested by tag on a subsequent resulting patch that uses that). That fix is irrespective of this series, so you should send it maybe separately? Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to boot for me in QEMU; so maybe not just a Clang bug (or maybe, more than one bug). (or maybe one of my command line params to QEMU is wrong). Stepping through start_kernel(), the call to setup_arch() seems to hang in qemu. For both GCC and Clang builds. A breakpoint in panic() never gets hit. Looks like the deepest I can get is: Looks like: #0 memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1601 #1 0xc060f0b4 in memblock_alloc (size=<optimized out>, align=<optimized out>) at ./include/linux/memblock.h:409 #2 cma_init_reserved_mem (base=1543503872, size=67108864, order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc <dma_contiguous_default_area>) at mm/cma.c:190 #3 0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872, size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0, fixed=false, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352 #4 0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>, order_per_bit=<optimized out>, name=<optimized out>, res_cma=<optimized out>, fixed=<optimized out>, limit=<optimized out>, size=<optimized out>, base=<optimized out>) at ./include/linux/cma.h:28 #5 dma_contiguous_reserve_area (size=<optimized out>, base=<optimized out>, limit=<optimized out>, res_cma=0xc07ccbdc <dma_contiguous_default_area>, fixed=false) at kernel/dma/contiguous.c:201 #6 0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at kernel/dma/contiguous.c:171 #7 0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8 <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230 #8 0xc060302c in setup_arch (cmdline_p=<optimized out>) at arch/arm/kernel/setup.c:1132 #9 0xc06007d2 in start_kernel () at init/main.c:857 there's a call to memset that seems to hang. I wonder if memset() was defined in terms of __builtin_memset, which oft can result in infinite loops? (IIRC there's an AEABI related memset; this kind of thing has hit android userspace before). (gdb) layout asm shows that the source call to memset is transformed into a call to mmioset. (gdb) bt #0 mmioset () at arch/arm/lib/memset.S:19 #1 0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602 #2 0xc060f0b4 in memblock_alloc (size=<optimized out>, align=<optimized out>) at ./include/linux/memblock.h:409 #3 cma_init_reserved_mem (base=1543503872, size=67108864, order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc <dma_contiguous_default_area>) at mm/cma.c:190 #4 0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872, size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0, fixed=false, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352 #5 0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>, order_per_bit=<optimized out>, name=<optimized out>, res_cma=<optimized out>, fixed=<optimized out>, limit=<optimized out>, size=<optimized out>, base=<optimized out>) at ./include/linux/cma.h:28 #6 dma_contiguous_reserve_area (size=<optimized out>, base=<optimized out>, limit=<optimized out>, res_cma=0xc07ccbdc <dma_contiguous_default_area>, fixed=false) at kernel/dma/contiguous.c:201 #7 0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at kernel/dma/contiguous.c:171 #8 0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8 <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230 #9 0xc060302c in setup_arch (cmdline_p=<optimized out>) at arch/arm/kernel/setup.c:1132 #10 0xc06007d2 in start_kernel () at init/main.c:857 Using `si` in gdb, it looks like we maybe hit an exception vector? x 1202 .section .vectors, "ax", %progbits x x 1203 .L__vectors_start: x x 1204 W(b) vector_rst x x 1205 W(b) vector_und x x 1206 W(ldr) pc, .L__vectors_start + 0x1000 x x >1207 W(b) vector_pabt Is the last thing I see, then `si` stops working. Not sure if `pabt` is a recognizable acronym to anyone more familiar with ARM? Happens regardless of your series, FWIW.
On Wed, Sep 16, 2020 at 2:25 PM Nick Desaulniers <ndesaulniers@google.com> wrote: > > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more > than one bug). (or maybe one of my command line params to QEMU is > wrong). > > Stepping through start_kernel(), the call to setup_arch() seems to > hang in qemu. For both GCC and Clang builds. A breakpoint in panic() > never gets hit. Looks like the deepest I can get is: > > Looks like: > #0 memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0, > nid=-1) at mm/memblock.c:1601 > #1 0xc060f0b4 in memblock_alloc (size=<optimized out>, > align=<optimized out>) at ./include/linux/memblock.h:409 > #2 cma_init_reserved_mem (base=1543503872, size=67108864, > order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc > <dma_contiguous_default_area>) at mm/cma.c:190 > #3 0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872, > size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0, > fixed=false, name=0xc04b9240 "reserved", > res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352 > #4 0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>, > order_per_bit=<optimized out>, name=<optimized out>, > res_cma=<optimized out>, fixed=<optimized out>, > limit=<optimized out>, size=<optimized out>, base=<optimized out>) > at ./include/linux/cma.h:28 > #5 dma_contiguous_reserve_area (size=<optimized out>, base=<optimized > out>, limit=<optimized out>, res_cma=0xc07ccbdc > <dma_contiguous_default_area>, fixed=false) > at kernel/dma/contiguous.c:201 > #6 0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at > kernel/dma/contiguous.c:171 > #7 0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8 > <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230 > #8 0xc060302c in setup_arch (cmdline_p=<optimized out>) at > arch/arm/kernel/setup.c:1132 > #9 0xc06007d2 in start_kernel () at init/main.c:857 > > there's a call to memset that seems to hang. I wonder if memset() was > defined in terms of __builtin_memset, which oft can result in infinite > loops? (IIRC there's an AEABI related memset; this kind of thing has > hit android userspace before). > > (gdb) layout asm > > shows that the source call to memset is transformed into a call to mmioset. > > (gdb) bt > #0 mmioset () at arch/arm/lib/memset.S:19 > #1 0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized > out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602 > #2 0xc060f0b4 in memblock_alloc (size=<optimized out>, > align=<optimized out>) at ./include/linux/memblock.h:409 > #3 cma_init_reserved_mem (base=1543503872, size=67108864, > order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc > <dma_contiguous_default_area>) at mm/cma.c:190 > #4 0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872, > size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0, > fixed=false, name=0xc04b9240 "reserved", > res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352 > #5 0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>, > order_per_bit=<optimized out>, name=<optimized out>, > res_cma=<optimized out>, fixed=<optimized out>, > limit=<optimized out>, size=<optimized out>, base=<optimized out>) > at ./include/linux/cma.h:28 > #6 dma_contiguous_reserve_area (size=<optimized out>, base=<optimized > out>, limit=<optimized out>, res_cma=0xc07ccbdc > <dma_contiguous_default_area>, fixed=false) > at kernel/dma/contiguous.c:201 > #7 0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at > kernel/dma/contiguous.c:171 > #8 0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8 > <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230 > #9 0xc060302c in setup_arch (cmdline_p=<optimized out>) at > arch/arm/kernel/setup.c:1132 > #10 0xc06007d2 in start_kernel () at init/main.c:857 > > Using `si` in gdb, it looks like we maybe hit an exception vector? > x 1202 .section .vectors, "ax", %progbits > > x > x 1203 .L__vectors_start: > > x > x 1204 W(b) vector_rst > > x > x 1205 W(b) vector_und > > x > x 1206 W(ldr) pc, .L__vectors_start + 0x1000 > > x > x >1207 W(b) vector_pabt > > Is the last thing I see, then `si` stops working. Not sure if `pabt` > is a recognizable acronym to anyone more familiar with ARM? > > Happens regardless of your series, FWIW. It seems this is affecting the ARCH=arm defconfig (regardless of toolchain) of linux-next today. I know you've warned me about testing on -next before... Maybe this is: https://lore.kernel.org/linux-next/20200916140437.GL2142832@kernel.org/ ? That looks arm64 specific though. Maybe 32b ARM needs the same or a similar fix? Oh man, this boots, total shot in the dark: diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 45f9d5ec2360..7118b98c1f5f 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -226,9 +226,6 @@ void __init arm_memblock_init(const struct machine_desc *mdesc) early_init_fdt_reserve_self(); early_init_fdt_scan_reserved_mem(); - /* reserve memory for DMA contiguous allocations */ - dma_contiguous_reserve(arm_dma_limit); - arm_memblock_steal_permitted = false; memblock_dump_all(); } @@ -248,6 +245,9 @@ void __init bootmem_init(void) */ sparse_init(); + /* reserve memory for DMA contiguous allocations */ + dma_contiguous_reserve(arm_dma_limit); + /* * Now free the memory - free_area_init needs * the sparse mem_map arrays initialized by sparse_init() -- Thanks, ~Nick Desaulniers
(added Mike K.) On Wed, Sep 16, 2020 at 05:16:32PM -0700, Nick Desaulniers wrote: > On Wed, Sep 16, 2020 at 2:25 PM Nick Desaulniers > <ndesaulniers@google.com> wrote: ... > Maybe this is: https://lore.kernel.org/linux-next/20200916140437.GL2142832@kernel.org/ > ? That looks arm64 specific though. Maybe 32b ARM needs the same or a > similar fix? Oh man, this boots, total shot in the dark: The CMA change is the problem IMO and it's now removed from -mm and -next trees. > diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c > index 45f9d5ec2360..7118b98c1f5f 100644 > --- a/arch/arm/mm/init.c > +++ b/arch/arm/mm/init.c > @@ -226,9 +226,6 @@ void __init arm_memblock_init(const struct > machine_desc *mdesc) > early_init_fdt_reserve_self(); > early_init_fdt_scan_reserved_mem(); > > - /* reserve memory for DMA contiguous allocations */ > - dma_contiguous_reserve(arm_dma_limit); > - > arm_memblock_steal_permitted = false; > memblock_dump_all(); > } > @@ -248,6 +245,9 @@ void __init bootmem_init(void) > */ > sparse_init(); > > + /* reserve memory for DMA contiguous allocations */ > + dma_contiguous_reserve(arm_dma_limit); > + This might be too late for ARM because in paging_init() it calls dma_contiguous_remap() which presumes that the CMA area is already reserved. dma_contiguous_remap() might be NOP, so your fix will boot until it fails eventually :) > /* > * Now free the memory - free_area_init needs > * the sparse mem_map arrays initialized by sparse_init()
On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Wed, Sep 16, 2020 at 1:45 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Wed, 16 Sep 2020 at 22:53, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > > > > On Tue, Sep 15, 2020 at 10:55 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Wed, 16 Sep 2020 at 02:31, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > > > > > > > > On Tue, Sep 15, 2020 at 2:30 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > > > Excellent, thanks for testing. Do you have any coverage for Thumb2 > > > > > > builds as well? (CONFIG_THUMB2_KERNEL=y) > > > > > > > > > > Ah, right, manually testing ARCH=arm defconfig with > > > > > CONFIG_THUMB2_KERNEL enabled via menuconfig: > > > > > > > > > > Builds and boots for clang. > > > > > > > > > > (Also needed https://lore.kernel.org/lkml/20200915225751.274531-1-ndesaulniers@google.com/T/#u > > > > > for an unrelated issue). > > > > > > > > > > For GCC, I observe: > > > > > > > > > > arch/arm/vfp/vfphw.o: in function `vfp_support_entry': > > > > > (.text+0xa): relocation truncated to fit: R_ARM_THM_JUMP19 against > > > > > symbol `vfp_kmode_exception' defined in .text.unlikely section in > > > > > arch/arm/vfp/vfpmodule.o > > > > > > > > > > > > > Interesting. And this is using ld.bfd ? > > > > > > $ arm-linux-gnueabihf-ld --version > > > GNU ld (GNU Binutils for Debian) 2.34 > > > > > > .text.unlikely reminds me of the sections created when profiling data > > > is present. That's obviously not the case here, so maybe there's > > > other ways this section can be created by GCC? I may have added a > > > patch recently for placing this section explicitly, which was a > > > prerequisite for Kees' series explicitly placing all sections. > > > https://lore.kernel.org/lkml/159896087937.20229.4955362311782724603.tip-bot2@tip-bot2/ > > > Maybe that can be improved? > > > > > > IIRC, LLD is able to emit range extension thunks for these cases, but > > > BFD does not. > > > > ld.bfd will usually emit veneers for branches that turn out to be out > > of range in the final link stage. > > > > Does the following help? > > > > diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S > > index 4fcff9f59947..f1468702fbc9 100644 > > --- a/arch/arm/vfp/vfphw.S > > +++ b/arch/arm/vfp/vfphw.S > > @@ -82,6 +82,7 @@ ENTRY(vfp_support_entry) > > ldr r3, [sp, #S_PSR] @ Neither lazy restore nor FP exceptions > > and r3, r3, #MODE_MASK @ are supported in kernel mode > > teq r3, #USR_MODE > > +THUMB( it ne ) > > bne vfp_kmode_exception @ Returns through lr > > > > VFPFMRX r1, FPEXC @ Is the VFP enabled? > > Yes, it does! I'm curious why though? I've seen the "it prefixes" > before (is that what they're called?), but I don't recall what they're > for. How come that solves this issue? > It forces the use of an instruction encoding that has more space for an immediate. > (Feel free to use my tested by tag on a subsequent resulting patch > that uses that). That fix is irrespective of this series, so you > should send it maybe separately? > I will, thanks. > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more > than one bug). (or maybe one of my command line params to QEMU is > wrong). > I understand that this is actually an existing issue in -next, but in general, why would you expect to be able to boot milbeaut_m10v_defconfig on anything other than a Milbeaut MV10 machine? (whatever it is) Or does QEMU emulate a milbeaut machine? If not, better to stick with configs that are intended to boot on the QEMU machine emulation that you are using. Also, while I see the point of regression testing of -next, using it as a base to test arbitrary series and then report failures against it produces a lot of noise. -next is *not* a good base for development, because you get everybody else's half baked crap as well. When you test my stuff, please use a known good base so we're not off on a goose chase every time. > Stepping through start_kernel(), the call to setup_arch() seems to > hang in qemu. For both GCC and Clang builds. A breakpoint in panic() > never gets hit. Looks like the deepest I can get is: > > Looks like: > #0 memblock_alloc_try_nid (size=108, align=4, min_addr=0, max_addr=0, > nid=-1) at mm/memblock.c:1601 > #1 0xc060f0b4 in memblock_alloc (size=<optimized out>, > align=<optimized out>) at ./include/linux/memblock.h:409 > #2 cma_init_reserved_mem (base=1543503872, size=67108864, > order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc > <dma_contiguous_default_area>) at mm/cma.c:190 > #3 0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872, > size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0, > fixed=false, name=0xc04b9240 "reserved", > res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352 > #4 0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>, > order_per_bit=<optimized out>, name=<optimized out>, > res_cma=<optimized out>, fixed=<optimized out>, > limit=<optimized out>, size=<optimized out>, base=<optimized out>) > at ./include/linux/cma.h:28 > #5 dma_contiguous_reserve_area (size=<optimized out>, base=<optimized > out>, limit=<optimized out>, res_cma=0xc07ccbdc > <dma_contiguous_default_area>, fixed=false) > at kernel/dma/contiguous.c:201 > #6 0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at > kernel/dma/contiguous.c:171 > #7 0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8 > <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230 > #8 0xc060302c in setup_arch (cmdline_p=<optimized out>) at > arch/arm/kernel/setup.c:1132 > #9 0xc06007d2 in start_kernel () at init/main.c:857 > > there's a call to memset that seems to hang. I wonder if memset() was > defined in terms of __builtin_memset, which oft can result in infinite > loops? (IIRC there's an AEABI related memset; this kind of thing has > hit android userspace before). > > (gdb) layout asm > > shows that the source call to memset is transformed into a call to mmioset. > > (gdb) bt > #0 mmioset () at arch/arm/lib/memset.S:19 > #1 0xc060e2dc in memblock_alloc_try_nid (size=108, align=<optimized > out>, min_addr=0, max_addr=0, nid=-1) at mm/memblock.c:1602 > #2 0xc060f0b4 in memblock_alloc (size=<optimized out>, > align=<optimized out>) at ./include/linux/memblock.h:409 > #3 cma_init_reserved_mem (base=1543503872, size=67108864, > order_per_bit=0, name=0xc04b9240 "reserved", res_cma=0xc07ccbdc > <dma_contiguous_default_area>) at mm/cma.c:190 > #4 0xc060f2c0 in cma_declare_contiguous_nid (base=1543503872, > size=67108864, limit=1610612736, alignment=8388608, order_per_bit=0, > fixed=false, name=0xc04b9240 "reserved", > res_cma=0xc07ccbdc <dma_contiguous_default_area>, nid=-1) at mm/cma.c:352 > #5 0xc0608cb6 in cma_declare_contiguous (alignment=<optimized out>, > order_per_bit=<optimized out>, name=<optimized out>, > res_cma=<optimized out>, fixed=<optimized out>, > limit=<optimized out>, size=<optimized out>, base=<optimized out>) > at ./include/linux/cma.h:28 > #6 dma_contiguous_reserve_area (size=<optimized out>, base=<optimized > out>, limit=<optimized out>, res_cma=0xc07ccbdc > <dma_contiguous_default_area>, fixed=false) > at kernel/dma/contiguous.c:201 > #7 0xc0608d22 in dma_contiguous_reserve (limit=<optimized out>) at > kernel/dma/contiguous.c:171 > #8 0xc0604584 in arm_memblock_init (mdesc=0xc061bfe8 > <__mach_desc_GENERIC_DT.35641>) at arch/arm/mm/init.c:230 > #9 0xc060302c in setup_arch (cmdline_p=<optimized out>) at > arch/arm/kernel/setup.c:1132 > #10 0xc06007d2 in start_kernel () at init/main.c:857 > > Using `si` in gdb, it looks like we maybe hit an exception vector? > x 1202 .section .vectors, "ax", %progbits > > x > x 1203 .L__vectors_start: > > x > x 1204 W(b) vector_rst > > x > x 1205 W(b) vector_und > > x > x 1206 W(ldr) pc, .L__vectors_start + 0x1000 > > x > x >1207 W(b) vector_pabt > > Is the last thing I see, then `si` stops working. Not sure if `pabt` > is a recognizable acronym to anyone more familiar with ARM? > > Happens regardless of your series, FWIW. > -- > Thanks, > ~Nick Desaulniers
On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote: > > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to > > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more > > than one bug). (or maybe one of my command line params to QEMU is > > wrong). > > > > I understand that this is actually an existing issue in -next, but in > general, why would you expect to be able to boot > milbeaut_m10v_defconfig on anything other than a Milbeaut MV10 > machine? We've been booting other configs in QEMU for a few years now, so I don't see why yet another config would hurt. Maybe there's some hardware dependency, but I guess we'd find that out trying to boot it in QEMU. If it boots in QEMU, I guess not booting on metal wasn't so bad? Maybe this is considered an antipattern, but you can see how if we've been getting away with it for years then that would lead to such expectations. > (whatever it is) Or does QEMU emulate a milbeaut machine? $ qemu-system-arm -machine help doesn't print anything that looks like it, on initial glance. Looks like a socionext part: https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf > If > not, better to stick with configs that are intended to boot on the > QEMU machine emulation that you are using. I can see in our CI that we've been building+boot testing multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a while now without specifying any machine. Is there a preferred machine we should be using for those? (It looks like qemu supports ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!) > Also, while I see the point of regression testing of -next, using it > as a base to test arbitrary series and then report failures against it > produces a lot of noise. -next is *not* a good base for development, > because you get everybody else's half baked crap as well. Ack. > When you > test my stuff, please use a known good base so we're not off on a > goose chase every time. Goose Chase?! gOoSe ChAsE?! *gestures broadly at...everything* Monsieur, here at the zoo, chasing the geese is not out of our purview. It's preferable to cleaning up after the elephants. -- Thanks, ~Nick Desaulniers
On Fri, 18 Sep 2020 at 22:03, Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to > > > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more > > > than one bug). (or maybe one of my command line params to QEMU is > > > wrong). > > > > > > > I understand that this is actually an existing issue in -next, but in > > general, why would you expect to be able to boot > > milbeaut_m10v_defconfig on anything other than a Milbeaut MV10 > > machine? > > We've been booting other configs in QEMU for a few years now, so I > don't see why yet another config would hurt. Maybe there's some > hardware dependency, but I guess we'd find that out trying to boot it > in QEMU. If it boots in QEMU, I guess not booting on metal wasn't so > bad? Maybe this is considered an antipattern, but you can see how if > we've been getting away with it for years then that would lead to such > expectations. > > > (whatever it is) Or does QEMU emulate a milbeaut machine? > > $ qemu-system-arm -machine help > > doesn't print anything that looks like it, on initial glance. Looks > like a socionext part: > https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf > > > If > > not, better to stick with configs that are intended to boot on the > > QEMU machine emulation that you are using. > > I can see in our CI that we've been building+boot testing > multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a > while now without specifying any machine. Is there a preferred > machine we should be using for those? (It looks like qemu supports > ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what > aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!) > Milbeaut's serial output is on a "socionext,milbeaut-usio-uart" UART, and its defconfig does not include drivers for the PL011 or 8250/16550 UARTs that you typically find on other boards. So how on earth would you expect to get any output at all if QEMU does not emulate this exact machine? In general, if you use QEMU/mach-virt, the only defconfigs you should reasonably be testing are the ones that contain CONFIG_ARCH_VIRT=y. > > Also, while I see the point of regression testing of -next, using it > > as a base to test arbitrary series and then report failures against it > > produces a lot of noise. -next is *not* a good base for development, > > because you get everybody else's half baked crap as well. > > Ack. > > > When you > > test my stuff, please use a known good base so we're not off on a > > goose chase every time. > > Goose Chase?! gOoSe ChAsE?! *gestures broadly at...everything* > Monsieur, here at the zoo, chasing the geese is not out of our > purview. It's preferable to cleaning up after the elephants. :-)
On Fri, Sep 18, 2020 at 1:44 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Fri, 18 Sep 2020 at 22:03, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > > On Wed, Sep 16, 2020 at 11:01 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Thu, 17 Sep 2020 at 00:25, Nick Desaulniers <ndesaulniers@google.com> wrote: > > > > Also, it looks like the GCC build of milbeaut_m10v_defconfig fails to > > > > boot for me in QEMU; so maybe not just a Clang bug (or maybe, more > > > > than one bug). (or maybe one of my command line params to QEMU is > > > > wrong). > > > > > > > > > > I understand that this is actually an existing issue in -next, but in > > > general, why would you expect to be able to boot > > > milbeaut_m10v_defconfig on anything other than a Milbeaut MV10 > > > machine? > > > > We've been booting other configs in QEMU for a few years now, so I > > don't see why yet another config would hurt. Maybe there's some > > hardware dependency, but I guess we'd find that out trying to boot it > > in QEMU. If it boots in QEMU, I guess not booting on metal wasn't so > > bad? Maybe this is considered an antipattern, but you can see how if > > we've been getting away with it for years then that would lead to such > > expectations. > > > > > (whatever it is) Or does QEMU emulate a milbeaut machine? > > > > $ qemu-system-arm -machine help > > > > doesn't print anything that looks like it, on initial glance. Looks > > like a socionext part: > > https://www.socionext.com/en/pr/sn_pr20170105_01e.pdf > > > > > If > > > not, better to stick with configs that are intended to boot on the > > > QEMU machine emulation that you are using. > > > > I can see in our CI that we've been building+boot testing > > multi_v5_defconfig, aspeed_g5_defconfig, and multi_v7_defconfig for a > > while now without specifying any machine. Is there a preferred > > machine we should be using for those? (It looks like qemu supports > > ast2500-evb and ast2600-evb; is ARM1176 and ARMv6 core? Is that what > > aspeed_g5 uses? Why is virt versioned? Ahhhh!!!!) > > > > Milbeaut's serial output is on a "socionext,milbeaut-usio-uart" UART, > and its defconfig does not include drivers for the PL011 or 8250/16550 > UARTs that you typically find on other boards. So how on earth would > you expect to get any output at all if QEMU does not emulate this > exact machine? breakpoints in panic()/printk(), lx-dmesg in GDB via CONFIG_GDB_SCRIPTS=y generally works. :^) I guess one thing I don't understand is how to check what UART or what the name of the tty device would be. I can grep for "socionext,milbeaut-usio-uart" and see where it's defined, but I never would have/still don't know how to find that. Please teach me how to fish. I understand the point of DT, and see arch/arm/boot/dts/milbeaut-m10v.dtsi 76: compatible = "socionext,milbeaut-usio-uart"; but the comment about ttyUSI0 is confusing to me; that identifier doesn't appear anywhere else in the kernel sources. I generally have the same problem trying to run Pixel kernels in QEMU; I don't understand how to determine which serial driver is being used, and what the tty device would be named. So I just enable the PLO11 driver. Only last week I found that trying to use a shell as init without any serial output can result in the shell exiting, panicking the kernel, which doesn't print over serial...goose chase... > In general, if you use QEMU/mach-virt, the only defconfigs you should > reasonably be testing are the ones that contain CONFIG_ARCH_VIRT=y. Looks like then that would only be multi_v7_defconfig. How do we continue to verify we can boot ARMv6 or ARMv5 under virtualization? There are such machines in QEMU, but then no defconfigs in the kernel. -- Thanks, ~Nick Desaulniers