Message ID | 20200915172658.1432732-1-rkir@google.com |
---|---|
State | New |
Headers | show |
Series | arch: x86: power: cpu: init %gs before __restore_processor_state (clang) | expand |
On Tue, Sep 15, 2020 at 10:26:58AM -0700, rkir@google.com wrote: > From: Haitao Shan <hshan@google.com> > > This is a workaround which fixes triple fault > in __restore_processor_state on clang when > built with LTO. > > When load_TR_desc and load_mm_ldt are inlined into > fix_processor_context due to LTO, they cause > fix_processor_context (or in this case __restore_processor_state, > as fix_processor_context was inlined into __restore_processor_state) > to access the stack canary through %gs, but before > __restore_processor_state has restored the previous value > of %gs properly. LLVM appears to be inlining functions with stack > protectors into functions compiled with -fno-stack-protector, > which is likely a bug in LLVM's inliner that needs to be fixed. > > The LLVM bug is here: https://bugs.llvm.org/show_bug.cgi?id=47479 > > Signed-off-by: Haitao Shan <hshan@google.com> > Signed-off-by: Roman Kiryanov <rkir@google.com> Ok, google guys, pls make sure you Cc LKML too as this is where *all* patches and discussions are archived. Adding it now to Cc. > --- > arch/x86/power/cpu.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c > index db1378c6ff26..e5677adb2d28 100644 > --- a/arch/x86/power/cpu.c > +++ b/arch/x86/power/cpu.c > @@ -274,6 +274,16 @@ static void notrace __restore_processor_state(struct saved_context *ctxt) > /* Needed by apm.c */ > void notrace restore_processor_state(void) > { > +#ifdef __clang__ > + // The following code snippet is copied from __restore_processor_state. > + // Its purpose is to prepare GS segment before the function is called. > +#ifdef CONFIG_X86_64 > + wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base); > +#else > + loadsegment(fs, __KERNEL_PERCPU); > + loadsegment(gs, __KERNEL_STACK_CANARY); > +#endif > +#endif Ok, so why is the kernel supposed to take yet another ugly workaround because there's a bug in the compiler? If it is too late to fix it there, then maybe disable LTO builds for the buggy version only. We had a similar discussion this week and we already have one buggy compiler to deal with and this second one is not making it any easier...
On Tue, Sep 15, 2020 at 10:46 AM Borislav Petkov <bp@alien8.de> wrote: Hi Borislav, thank you for a quick response. > Ok, google guys, pls make sure you Cc LKML too as this is where *all* > patches and discussions are archived. Adding it now to Cc. Thank you, I did not know this. > Ok, so why is the kernel supposed to take yet another ugly workaround > because there's a bug in the compiler? I believe the kernel makes a questionable assumption on how clang uses registers (gs will not be used if stack protection is disabled). Both kernel and clang behaves unfortunate here. > disable LTO CFI depends on LTO.
On Tue, Sep 15, 2020 at 10:46 AM Borislav Petkov <bp@alien8.de> wrote: > > On Tue, Sep 15, 2020 at 10:26:58AM -0700, rkir@google.com wrote: > > From: Haitao Shan <hshan@google.com> > > > > This is a workaround which fixes triple fault > > in __restore_processor_state on clang when > > built with LTO. > > > > When load_TR_desc and load_mm_ldt are inlined into > > fix_processor_context due to LTO, they cause > > fix_processor_context (or in this case __restore_processor_state, > > as fix_processor_context was inlined into __restore_processor_state) > > to access the stack canary through %gs, but before > > __restore_processor_state has restored the previous value > > of %gs properly. LLVM appears to be inlining functions with stack > > protectors into functions compiled with -fno-stack-protector, > > which is likely a bug in LLVM's inliner that needs to be fixed. > > > > The LLVM bug is here: https://bugs.llvm.org/show_bug.cgi?id=47479 > > > > Signed-off-by: Haitao Shan <hshan@google.com> > > Signed-off-by: Roman Kiryanov <rkir@google.com> > > Ok, google guys, pls make sure you Cc LKML too as this is where *all* > patches and discussions are archived. Adding it now to Cc. Roman, please use ./scripts/get_maintainer.pl (in the kernel tree) for that. > > > --- > > arch/x86/power/cpu.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c > > index db1378c6ff26..e5677adb2d28 100644 > > --- a/arch/x86/power/cpu.c > > +++ b/arch/x86/power/cpu.c > > @@ -274,6 +274,16 @@ static void notrace __restore_processor_state(struct saved_context *ctxt) > > /* Needed by apm.c */ > > void notrace restore_processor_state(void) > > { > > +#ifdef __clang__ Should be CONFIG_CC_IS_CLANG; is more canonical throughout the tree. Or if this is only a bug when doing builds with LTO, and LTO is not yet upstream, then maybe Sami should carry this in his series, at least until I can fix the bug in Clang. Or guard this with the CONFIG_LTO_CLANG config (not upstream yet; see Sami's series). > > + // The following code snippet is copied from __restore_processor_state. > > + // Its purpose is to prepare GS segment before the function is called. > > +#ifdef CONFIG_X86_64 > > + wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base); > > +#else > > + loadsegment(fs, __KERNEL_PERCPU); > > + loadsegment(gs, __KERNEL_STACK_CANARY); > > +#endif > > +#endif > > Ok, so why is the kernel supposed to take yet another ugly workaround > because there's a bug in the compiler? This is exactly the same code from __restore_processor_state. If it's ugly, talk to the author of 7ee18d677989e. ;) All this patch is doing is moving this up a call frame (though now this is effectively being run twice). > If it is too late to fix it there, then maybe disable LTO builds for the > buggy version only. We could do that, too. (We can disable LTO on a per translation unit basis in KBuild). Note the author of the bug report linked above. :^P "Revenge of the stack protector" > > We had a similar discussion this week and we already have one buggy > compiler to deal with and this second one is not making it any easier...
On Tue, Sep 15, 2020 at 11:00:30AM -0700, Nick Desaulniers wrote:
> This is exactly the same code from __restore_processor_state.
No, this patch is adding
#ifdef __clang__
and I don't like the sprinkling around of those compiler-specific
workarounds which we have to carry forward forever or at least until
that compiler version is deprecated. We already carry fixes for broken
hardware, broken BIOSes, broken peripherals,... can you follow the
progression? :)
So your argument about testing unreleased compilers in the other thread
makes a lot of sense so that stuff like that can be fixed in time, and
in the compiler, where it belongs (that is, *if* it belongs there).
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Sep 15, 2020 at 10:57:16AM -0700, Roman Kiryanov wrote: > I believe the kernel makes a questionable assumption on how clang > uses registers (gs will not be used if stack protection is disabled). > Both kernel and clang behaves unfortunate here. If the kernel is at fault here and this same thing happens with GCC, sure, but this is a clang-specific fix.
On Tue, Sep 15, 2020 at 11:27 AM Borislav Petkov <bp@alien8.de> wrote: > > I believe the kernel makes a questionable assumption on how clang > > uses registers (gs will not be used if stack protection is disabled). > > Both kernel and clang behaves unfortunate here. > > If the kernel is at fault here and this same thing happens with GCC, > sure, but this is a clang-specific fix. This is fair. Unfortunately I am not an x86 asm expert. I expect the proper fix should land into arch/x86/kernel/acpi/wakeup_64.S to init %gs (maybe some more registers) before "jmp restore_processor_state". Regards, Roman.
On Tue, Sep 15, 2020 at 11:36:13AM -0700, Roman Kiryanov wrote: > This is fair. Unfortunately I am not an x86 asm expert. I expect the proper > fix should land into arch/x86/kernel/acpi/wakeup_64.S to init %gs > (maybe some more registers) before "jmp restore_processor_state". ... because "LLVM appears to be inlining functions with stack protectors into functions compiled with -fno-stack-protector" and now the *kernel* needs to init %gs? How about LLVM stops doing those wrong inlining decisions? -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Sep 15, 2020 at 11:52 AM Borislav Petkov <bp@alien8.de> wrote: > > On Tue, Sep 15, 2020 at 11:36:13AM -0700, Roman Kiryanov wrote: > > This is fair. Unfortunately I am not an x86 asm expert. I expect the proper > > fix should land into arch/x86/kernel/acpi/wakeup_64.S to init %gs > > (maybe some more registers) before "jmp restore_processor_state". > > ... because "LLVM appears to be inlining functions with stack protectors > into functions compiled with -fno-stack-protector" and now the *kernel* > needs to init %gs? > > How about LLVM stops doing those wrong inlining decisions? This will remove the issue for a while until clang/gcc/other decides to use %gs for other purposes before the kernel initializes it. Regards, Roman.
On Tue, Sep 15, 2020 at 11:25 AM Borislav Petkov <bp@alien8.de> wrote: > > On Tue, Sep 15, 2020 at 11:00:30AM -0700, Nick Desaulniers wrote: > > This is exactly the same code from __restore_processor_state. > > No, this patch is adding > > #ifdef __clang__ > > and I don't like the sprinkling around of those compiler-specific > workarounds which we have to carry forward forever or at least until > that compiler version is deprecated. We already carry fixes for broken > hardware, broken BIOSes, broken peripherals,... can you follow the > progression? :) I agree; I also would not have sent the patch though. Until LTO has landed upstream, this is definitely somewhat self inflicted. This was only debugged last week; even with a compiler fix in hand today, it still takes time to ship that compiler and qualify it; for other folks on tighter timelines, I can understand why the patch was sent, and do genuinely appreciate the effort to participate more upstream which I'm trying to encourage more of throughout the company (we're in a lot of technical debt kernel-wise; and I'm not referring to Android...a story over beers perhaps, or ask Greg). It's just that this isn't really appropriate since it works around a bug in a non-upstream feature, and will go away once we fix the toolchain. > > So your argument about testing unreleased compilers in the other thread > makes a lot of sense so that stuff like that can be fixed in time, and > in the compiler, where it belongs (that is, *if* it belongs there). It would be much nicer if we had the flexibility to disable stack protectors per function, rather than per translation unit. I'm going to encourage you to encourage your favorite compile vendor ("write to your senator") to support the function attribute __attribute__((no_stack_protector)) so that one day, we can use that to stop shipping crap like a9a3ed1eff360 ("x86: Fix early boot crash on gcc-10, third try"). Having had that, we could have used a nicer workaround until the toolchain was fixed (and one day revert a9a3ed1eff360, and d0a8d9378d16, and probably more hacks in the kernel). And the case that's causing the compiler bug in question is something all compiler vendors will need to consider in their implementations. -- Thanks, ~Nick Desaulniers
On Tue, Sep 15, 2020 at 12:51:47PM -0700, Nick Desaulniers wrote: > I agree; I also would not have sent the patch though. Maybe google folks should run stuff by you before sending it up... :-) > Until LTO has landed upstream, this is definitely somewhat self > inflicted. This was only debugged last week; even with a compiler fix > in hand today, it still takes time to ship that compiler and qualify > it; for other folks on tighter timelines, I can understand why the > patch was sent, ... because they have the requirement that a patch which gets backported to a kernel used at google needs to be upstream? Because I'm willing to bet a lot of cash that no one runs bleeding egde 5.9-rcX in production over there right now :-) > and do genuinely appreciate the effort to participate more upstream > which I'm trying to encourage more of throughout the company (we're > in a lot of technical debt kernel-wise; and I'm not referring to > Android...a story over beers perhaps, or ask Greg). Beers? Always. But I can imagine the reasons: people working on projects and then those projects getting done and no one cares about upstreaming stuff after the fact or no one has time ... or policy ... but let's keep that for beers. :-) > It's just that this isn't really appropriate since it works around > a bug in a non-upstream feature, and will go away once we fix the > toolchain. Hohumm. > It would be much nicer if we had the flexibility to disable stack > protectors per function, rather than per translation unit. I'm going > to encourage you to encourage your favorite compile vendor ("write to > your senator") to support the function attribute > __attribute__((no_stack_protector)) so that one day, I already forgot why gcc doesn't do that... Martin, do you know? > we can use that to stop shipping crap like a9a3ed1eff360 ("x86: Fix > early boot crash on gcc-10, third try"). Having had that, we could > have used a nicer workaround until the toolchain was fixed (and one > day revert a9a3ed1eff360, and d0a8d9378d16, and probably more hacks in > the kernel). Yap, agreed. I guess with those new compiler features it is always a couple of releases - both kernel, i.e., the user of that feature, and compiler, i.e., the provider of the feature, to both figure out what the proper use cases are, to experiment a bit and then to adjust stuff, change here and there and then cast in stone. Oh well. > And the case that's causing the compiler bug in question is something > all compiler vendors will need to consider in their implementations. Are you talking to gcc folks about it already so that they DTRT too? Btw, if it is any consolation, talking to compiler folks is like a charm in comparison to talking to hardware vendors and trying to get them to agree on something because they seem to think that the kernel is software and sure, can be changed to do whatever. But that's another story for the beers... :-)
On Tue, Sep 15, 2020 at 11:00:30AM -0700, Nick Desaulniers wrote: > On Tue, Sep 15, 2020 at 10:46 AM Borislav Petkov <bp@alien8.de> wrote: > > > > On Tue, Sep 15, 2020 at 10:26:58AM -0700, rkir@google.com wrote: > > > From: Haitao Shan <hshan@google.com> > > > > > > This is a workaround which fixes triple fault > > > in __restore_processor_state on clang when > > > built with LTO. > > > > > > When load_TR_desc and load_mm_ldt are inlined into > > > fix_processor_context due to LTO, they cause Does this apply to load_TR_desc()? That is an inline function even without LTO, no? > > > fix_processor_context (or in this case __restore_processor_state, > > > as fix_processor_context was inlined into __restore_processor_state) > > > to access the stack canary through %gs, but before > > > __restore_processor_state has restored the previous value > > > of %gs properly. LLVM appears to be inlining functions with stack > > > protectors into functions compiled with -fno-stack-protector, > > > which is likely a bug in LLVM's inliner that needs to be fixed. > > > > > > The LLVM bug is here: https://bugs.llvm.org/show_bug.cgi?id=47479 > > > > > > Signed-off-by: Haitao Shan <hshan@google.com> > > > Signed-off-by: Roman Kiryanov <rkir@google.com> > > > > Ok, google guys, pls make sure you Cc LKML too as this is where *all* > > patches and discussions are archived. Adding it now to Cc. > > Roman, please use ./scripts/get_maintainer.pl (in the kernel tree) for that. > > > > > > --- > > > arch/x86/power/cpu.c | 10 ++++++++++ > > > 1 file changed, 10 insertions(+) > > > > > > diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c > > > index db1378c6ff26..e5677adb2d28 100644 > > > --- a/arch/x86/power/cpu.c > > > +++ b/arch/x86/power/cpu.c > > > @@ -274,6 +274,16 @@ static void notrace __restore_processor_state(struct saved_context *ctxt) > > > /* Needed by apm.c */ > > > void notrace restore_processor_state(void) > > > { > > > +#ifdef __clang__ > > Should be CONFIG_CC_IS_CLANG; is more canonical throughout the tree. > Or if this is only a bug when doing builds with LTO, and LTO is not > yet upstream, then maybe Sami should carry this in his series, at > least until I can fix the bug in Clang. Or guard this with the > CONFIG_LTO_CLANG config (not upstream yet; see Sami's series). > > > > + // The following code snippet is copied from __restore_processor_state. > > > + // Its purpose is to prepare GS segment before the function is called. > > > +#ifdef CONFIG_X86_64 > > > + wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base); > > > +#else > > > + loadsegment(fs, __KERNEL_PERCPU); > > > + loadsegment(gs, __KERNEL_STACK_CANARY); > > > +#endif > > > +#endif > > > > Ok, so why is the kernel supposed to take yet another ugly workaround > > because there's a bug in the compiler? > > This is exactly the same code from __restore_processor_state. If it's > ugly, talk to the author of 7ee18d677989e. ;) All this patch is doing > is moving this up a call frame (though now this is effectively being > run twice). > Possibly dumb question: why does this fix anything? Won't __restore_processor_state(), which is a static function with only one caller, in turn get inlined into restore_processor_state(), so that restore_processor_state() will also have stack protection enabled, and the canary will be accessed before the MSR or segment register is loaded? Thanks.
On Tue, Sep 15, 2020 at 1:20 PM Borislav Petkov <bp@alien8.de> wrote: > > On Tue, Sep 15, 2020 at 12:51:47PM -0700, Nick Desaulniers wrote: > > I agree; I also would not have sent the patch though. > > Maybe google folks should run stuff by you before sending it up... :-) Ha! 1. they don't pay me enough for that. 2. even if they did, I wouldn't want that responsibility 3. I'm probably least qualified for that. Google has many strong upstream contributors with much longer contribution history than myself. Maybe toolchain specific stuff though... 4. you generally don't want people like that in any organization. More gatekeepers winds up being a synchronization/contention point. Remember, the goal is to train others to be self sufficient, so you can drink margaritas on the roof. That suggestion goes against the ultimate goal. 5. You'd think a multi-billion-dollar per quarter company could hire a few more people to help; instead stock buybacks are more attractive I guess? Maybe better ROI? I suspect one too many managers internalized the Mythical Man Month's point about "adding more people to a late software project just makes it later" to mean "starve your projects for resources" and run a ghost-ship (ie. big boat, with little to no deck hands to ensure the boat doesn't "Costa Concordia" (noun-as-a-verb...oh well)). To be fair, hiring has been impacted by COVID; my point is more so being stretched incredibly thin. There's been what, 3 Clang related kernel bugs you and I have been CC'ed on today. Hard to fix compiler bugs AND triage from the fire hose. I should probably just put down LKML for today and start fixing the [haunted][damned] compiler. > > > Until LTO has landed upstream, this is definitely somewhat self > > inflicted. This was only debugged last week; even with a compiler fix > > in hand today, it still takes time to ship that compiler and qualify > > it; for other folks on tighter timelines, I can understand why the > > patch was sent, > > ... because they have the requirement that a patch which gets backported > to a kernel used at google needs to be upstream? That's a rule for stable, yes. But also because we have folks that don't seem to understand (moreso maybe haven't considered) that "forking is not free" when upstream moves faster than you and you'd also like to rebase someday; as such acquiring technical debt at a rate that's impossible to pay off. > Because I'm willing to > bet a lot of cash that no one runs bleeding egde 5.9-rcX in production > over there right now :-) I guess you're paying for beers then. "Android Common Kernels" run mainline. (They're a bit esoteric in terms of "production" but cuttlefish virtual devices are running Android on mainline). > > It would be much nicer if we had the flexibility to disable stack > > protectors per function, rather than per translation unit. I'm going > > to encourage you to encourage your favorite compile vendor ("write to > > your senator") to support the function attribute > > __attribute__((no_stack_protector)) so that one day, > > I already forgot why gcc doesn't do that... Martin, do you know? Martin has patches for that, he has CC'ed me when sending them upstream for review. Review was stalled, so I provided some feedback. I'll review a GCC patch (once it's updated with my previous feedback) if I have to; I'm not against it. w/e so long as we have a timeline for a kernel fix. > > And the case that's causing the compiler bug in question is something > > all compiler vendors will need to consider in their implementations. > > Are you talking to gcc folks about it already so that they DTRT too? I CC'ed Martin on the LLVM bug, since this is a case I'm looking for his input on, or at least for him to be aware of the test case. > Btw, if it is any consolation, talking to compiler folks is like a charm > in comparison to talking to hardware vendors and trying to get them > to agree on something because they seem to think that the kernel is > software and sure, can be changed to do whatever. But that's another > story for the beers... :-) I look forward to it.
Hi! > From: Haitao Shan <hshan@google.com> > > This is a workaround which fixes triple fault > in __restore_processor_state on clang when > built with LTO. > > When load_TR_desc and load_mm_ldt are inlined into > fix_processor_context due to LTO, they cause > fix_processor_context (or in this case __restore_processor_state, > as fix_processor_context was inlined into __restore_processor_state) > to access the stack canary through %gs, but before > __restore_processor_state has restored the previous value > of %gs properly. LLVM appears to be inlining functions with stack > protectors into functions compiled with -fno-stack-protector, > which is likely a bug in LLVM's inliner that needs to be fixed. That's rather ugly. Would it be easier to simply mark those functions as noinline or something like that? Best regards, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Tue, Sep 15, 2020 at 3:17 PM Pavel Machek <pavel@denx.de> wrote: > > Hi! > > > From: Haitao Shan <hshan@google.com> > > > > This is a workaround which fixes triple fault > > in __restore_processor_state on clang when > > built with LTO. > > > > When load_TR_desc and load_mm_ldt are inlined into > > fix_processor_context due to LTO, they cause > > fix_processor_context (or in this case __restore_processor_state, > > as fix_processor_context was inlined into __restore_processor_state) > > to access the stack canary through %gs, but before > > __restore_processor_state has restored the previous value > > of %gs properly. LLVM appears to be inlining functions with stack > > protectors into functions compiled with -fno-stack-protector, > > which is likely a bug in LLVM's inliner that needs to be fixed. > > That's rather ugly. > > Would it be easier to simply mark those functions as noinline or > something like that? It is possible, and a possible solution here. We discussed that internally, and found it to not be great. You only want to prevent inlining across TUs for LTO; currently there's no great way to express that in source code. As in "go ahead and inline this from calls within the TU that this is defined, but don't inline across TUs other than from where the callee is defined." I think we should create a function level attribute for expressing that intention to the compiler. You could emulate that somewhat by wrapping the noinline attribute in CONFIG_LTO_CLANG, but that doesn't solve allowing inlining within the same TU. If you've been following the thread, there's multiple ways to skin this cat; your suggestion is one we did consider, just not on the public mailing list.
On Tue, Sep 15, 2020 at 3:17 PM Pavel Machek <pavel@denx.de> wrote: > > When load_TR_desc and load_mm_ldt are inlined into > > fix_processor_context due to LTO, they cause > > fix_processor_context (or in this case __restore_processor_state, > > as fix_processor_context was inlined into __restore_processor_state) > > to access the stack canary through %gs, but before > > __restore_processor_state has restored the previous value > > of %gs properly. LLVM appears to be inlining functions with stack > > protectors into functions compiled with -fno-stack-protector, > > which is likely a bug in LLVM's inliner that needs to be fixed. > > That's rather ugly. > > Would it be easier to simply mark those functions as noinline or > something like that? Hi Pavel, thank you for looking. Let's put it another way. Is it correct to assume %gs will not be ever used by the compiler for its internal business if stack protection is disabled? If this assumption is correct we should just fix the compiler. Regards, Roman.
On Tue, Sep 15, 2020 at 12:51:47PM -0700, Nick Desaulniers wrote: > It would be much nicer if we had the flexibility to disable stack > protectors per function, rather than per translation unit. I'm going > to encourage you to encourage your favorite compile vendor ("write to > your senator") to support the function attribute > __attribute__((no_stack_protector)) so that one day, we can use that > to stop shipping crap like a9a3ed1eff360 ("x86: Fix early boot crash > on gcc-10, third try"). I think we were all in favour of having that, not sure why it hasn't happened yet. More important matters I suppose :/
On Tue, Sep 15, 2020 at 02:49:40PM -0700, Nick Desaulniers wrote: > 1. they don't pay me enough for that. They probably should - you're doing it anyway and it's not like they have a shortage of cash. :-P > 2. even if they did, I wouldn't want that responsibility Too late, my friend. :-) > 3. I'm probably least qualified for that. Google has many strong > upstream contributors with much longer contribution history than > myself. Maybe toolchain specific stuff though... Sure, toolchain only, if you prefer. And others can take care of other areas. And yes, those people should have some time allocated only to upstream development. I think that's only fair... > 4. you generally don't want people like that in any organization. > More gatekeepers winds up being a synchronization/contention point. > Remember, the goal is to train others to be self sufficient, so you > can drink margaritas on the roof. That suggestion goes against the > ultimate goal. Sure, but there's the community and we want to support that. Business interest pays the bills but without the community to thrive around it, it is just another crap software. > 5. You'd think a multi-billion-dollar per quarter company could hire a > few more people to help; instead stock buybacks are more attractive I > guess? Maybe better ROI? I suspect one too many managers > internalized the Mythical Man Month's point about "adding more people > to a late software project just makes it later" to mean "starve your > projects for resources" and run a ghost-ship (ie. big boat, with > little to no deck hands to ensure the boat doesn't "Costa Concordia" > (noun-as-a-verb...oh well)). LOL. And true. My experience with managers is that they have no clue about how the open source community works. It is always the egotistic and omnipresent take take take and little or no give. > To be fair, hiring has been impacted by COVID; my point is more so > being stretched incredibly thin. There's been what, 3 Clang related > kernel bugs you and I have been CC'ed on today. Hard to fix compiler > bugs AND triage from the fire hose. I should probably just put down > LKML for today and start fixing the [haunted][damned] compiler. Oh, welcome to drinking from the firehose. This never stops. Ever! So yeah, even if you can hire more maintainers, there's always bottlenecks. > That's a rule for stable, yes. But also because we have folks that > don't seem to understand (moreso maybe haven't considered) that > "forking is not free" when upstream moves faster than you and you'd > also like to rebase someday; as such acquiring technical debt at a > rate that's impossible to pay off. I guess you need an upstreaming team which takes over technology produced by other projects - as those projects cannot slow down to adapt to the upstream pace - so they give the code to the upstreaming team after the project is done and they add it to the mainline kernel eventually. I think that would make a lot of sense for *everybody* involved. > I guess you're paying for beers then. "Android Common Kernels" run > mainline. (They're a bit esoteric in terms of "production" but > cuttlefish virtual devices are running Android on mainline). Only half of the beers - this "production" is a stretch - I mean infrastructure machines, not some funky toys. And *virtual* at that. :-) > Martin has patches for that, he has CC'ed me when sending them > upstream for review. Review was stalled, so I provided some feedback. > I'll review a GCC patch (once it's updated with my previous feedback) > if I have to; I'm not against it. w/e so long as we have a timeline > for a kernel fix. That's good. I guess it'll get there eventually. We'll still hold on to that fix for years, for those gccs which don't have the function attribute. Which is yet another reason for my aversion against compiler workarounds. > I CC'ed Martin on the LLVM bug, since this is a case I'm looking for > his input on, or at least for him to be aware of the test case. Cool. > I look forward to it. Ditto. :-) -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Hi! > > > When load_TR_desc and load_mm_ldt are inlined into > > > fix_processor_context due to LTO, they cause > > > fix_processor_context (or in this case __restore_processor_state, > > > as fix_processor_context was inlined into __restore_processor_state) > > > to access the stack canary through %gs, but before > > > __restore_processor_state has restored the previous value > > > of %gs properly. LLVM appears to be inlining functions with stack > > > protectors into functions compiled with -fno-stack-protector, > > > which is likely a bug in LLVM's inliner that needs to be fixed. > > > > That's rather ugly. > > > > Would it be easier to simply mark those functions as noinline or > > something like that? > > It is possible, and a possible solution here. We discussed that > internally, and found it to not be great. You only want to prevent > inlining across TUs for LTO; currently there's no great way to > express You should really be doing such discussion publicly. I believe this is not performance critical so just disabling inlining (maybe conditional on clang) is best solution. > compiler. You could emulate that somewhat by wrapping the noinline > attribute in CONFIG_LTO_CLANG, but that doesn't solve allowing > inlining within the same TU. If you've been following the thread, Yep, just do that. BR, Pavel
On Tue 2020-09-15 11:36:13, Roman Kiryanov wrote: > On Tue, Sep 15, 2020 at 11:27 AM Borislav Petkov <bp@alien8.de> wrote: > > > I believe the kernel makes a questionable assumption on how clang > > > uses registers (gs will not be used if stack protection is disabled). > > > Both kernel and clang behaves unfortunate here. > > > > If the kernel is at fault here and this same thing happens with GCC, > > sure, but this is a clang-specific fix. > > This is fair. Unfortunately I am not an x86 asm expert. I expect the proper > fix should land into arch/x86/kernel/acpi/wakeup_64.S to init %gs > (maybe some more registers) before "jmp restore_processor_state". That would certainly be nicer / more acceptable solution than patch being proposed here. Code was written with assumption compiler random C code would not use %gs. If that's no longer true, fixing it in wakeup_64.S _with comments explaining what goes on_ might be solution. Best regards, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Hi! > > Because I'm willing to > > bet a lot of cash that no one runs bleeding egde 5.9-rcX in production > > over there right now :-) > > I guess you're paying for beers then. "Android Common Kernels" run > mainline. (They're a bit esoteric in terms of "production" but > cuttlefish virtual devices are running Android on mainline). > You run mainline kernels but modules to support hw are out_of-tree, right? I'm looking for usable cellphone running mainline. Pavel
On Fri, Sep 18, 2020 at 3:25 PM Pavel Machek <pavel@denx.de> wrote: > > On Tue 2020-09-15 11:36:13, Roman Kiryanov wrote: > > On Tue, Sep 15, 2020 at 11:27 AM Borislav Petkov <bp@alien8.de> wrote: > > > > I believe the kernel makes a questionable assumption on how clang > > > > uses registers (gs will not be used if stack protection is disabled). > > > > Both kernel and clang behaves unfortunate here. > > > > > > If the kernel is at fault here and this same thing happens with GCC, > > > sure, but this is a clang-specific fix. > > > > This is fair. Unfortunately I am not an x86 asm expert. I expect the proper > > fix should land into arch/x86/kernel/acpi/wakeup_64.S to init %gs > > (maybe some more registers) before "jmp restore_processor_state". > > That would certainly be nicer / more acceptable solution than patch > being proposed here. > > Code was written with assumption compiler random C code would not use > %gs. If that's no longer true, fixing it in wakeup_64.S _with comments > explaining what goes on_ might be solution. I looked and restore_processor_state is referenced in several places, so changing wakeup_64.S is not enough. Is moving the beginning of restore_processor_state to an .S file ok? I see restore_processor_state initializes CR registers first, do you know if there is a reason to do so (can I init segment registers before CR ones)? Regards, Roman.
Hi! > > > > > I believe the kernel makes a questionable assumption on how clang > > > > > uses registers (gs will not be used if stack protection is disabled). > > > > > Both kernel and clang behaves unfortunate here. > > > > > > > > If the kernel is at fault here and this same thing happens with GCC, > > > > sure, but this is a clang-specific fix. > > > > > > This is fair. Unfortunately I am not an x86 asm expert. I expect the proper > > > fix should land into arch/x86/kernel/acpi/wakeup_64.S to init %gs > > > (maybe some more registers) before "jmp restore_processor_state". > > > > That would certainly be nicer / more acceptable solution than patch > > being proposed here. > > > > Code was written with assumption compiler random C code would not use > > %gs. If that's no longer true, fixing it in wakeup_64.S _with comments > > explaining what goes on_ might be solution. > > I looked and restore_processor_state is referenced in several places, > so changing > wakeup_64.S is not enough. Is moving the beginning of restore_processor_state > to an .S file ok? I see restore_processor_state initializes CR registers first, > do you know if there is a reason to do so (can I init segment > registers before CR ones)? Yes, moving to .S file should be okay. CR first... makes sense to me, they really select how segment registers will be interpretted, etc. If it will work the other way... not sure. I'd keep existing code if I were you. This is tricky to debug. Pavel
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c index db1378c6ff26..e5677adb2d28 100644 --- a/arch/x86/power/cpu.c +++ b/arch/x86/power/cpu.c @@ -274,6 +274,16 @@ static void notrace __restore_processor_state(struct saved_context *ctxt) /* Needed by apm.c */ void notrace restore_processor_state(void) { +#ifdef __clang__ + // The following code snippet is copied from __restore_processor_state. + // Its purpose is to prepare GS segment before the function is called. +#ifdef CONFIG_X86_64 + wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base); +#else + loadsegment(fs, __KERNEL_PERCPU); + loadsegment(gs, __KERNEL_STACK_CANARY); +#endif +#endif __restore_processor_state(&saved_context); } #ifdef CONFIG_X86_32