mbox series

[0/6] objtool: Detect and warn about indirect calls in __nocfi functions

Message ID 20250414111140.586315004@infradead.org
Headers show
Series objtool: Detect and warn about indirect calls in __nocfi functions | expand

Message

Peter Zijlstra April 14, 2025, 11:11 a.m. UTC
Hi!

On kCFI (CONFIG_CFI_CLANG=y) builds all indirect calls should have the CFI
check on (with very few exceptions). Not having the CFI checks undermines the
protection provided by CFI and will make these sites candidates for people
wanting to steal your cookies.

Specifically the ABI changes are so that doing indirect calls without the CFI
magic, to a CFI adorned function is not compatible (although it happens to work
for some setups, it very much does not for FineIBT).

Rust people tripped over this the other day, since their 'core' happened to
have some no_sanitize(kcfi) bits in, which promptly exploded when ran with
FineIBT on.

Since this is very much not a supported model -- on purpose, have objtool
detect and warn about such constructs.

This effort [1] found all existins [2] non-cfi indirect calls in the kernel.

Notably the KVM fastop emulation stuff -- which reminded me I still had pending
patches there. Included here since they reduce the amount of fastop call sites,
and the final patch includes an annotation for that. Although ideally we should
look at means of doing fastops differently.

KVM has another; the interrupt injection stuff calls the IDT handler directly.
Is there an alternative? Can we keep a table of Linux functions slighly higher
up the call stack (asm_\cfunc ?) and add CFI to those?

HyperV hypercall page stuff, which I've previously suggested use direct calls,
and which I've now converted (after getting properly annoyed with that code).


[1] https://lkml.kernel.org/r/20250410154556.GB9003@noisy.programming.kicks-ass.net
[2] https://lkml.kernel.org/r/20250410194334.GA3248459@google.com

Comments

Peter Zijlstra April 14, 2025, 2:08 p.m. UTC | #1
On Mon, Apr 14, 2025 at 04:06:41PM +0200, Uros Bizjak wrote:
> 
> 
> On 14. 04. 25 13:11, Peter Zijlstra wrote:
> > What used to be a simple few instructions has turned into a giant mess
> > (for x86_64). Not only does it use static_branch wrong, it mixes it
> > with dynamic branches for no apparent reason.
> > 
> > Notably it uses static_branch through an out-of-line function call,
> > which completely defeats the purpose, since instead of a simple
> > JMP/NOP site, you get a CALL+RET+TEST+Jcc sequence in return, which is
> > absolutely idiotic.
> > 
> > Add to that a dynamic test of hyperv_paravisor_present, something
> > which is set once and never changed.
> > 
> > Replace all this idiocy with a single direct function call to the
> > right hypercall variant.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >   arch/x86/hyperv/hv_init.c       |   21 ++++++
> >   arch/x86/hyperv/ivm.c           |   14 ++++
> >   arch/x86/include/asm/mshyperv.h |  137 +++++++++++-----------------------------
> >   arch/x86/kernel/cpu/mshyperv.c  |   18 +++--
> >   4 files changed, 88 insertions(+), 102 deletions(-)
> > 
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -35,7 +35,28 @@
> >   #include <linux/highmem.h>
> >   void *hv_hypercall_pg;
> > +
> > +#ifdef CONFIG_X86_64
> > +u64 hv_pg_hypercall(u64 control, u64 param1, u64 param2)
> > +{
> > +	u64 hv_status;
> > +
> > +	if (!hv_hypercall_pg)
> > +		return U64_MAX;
> > +
> > +	register u64 __r8 asm("r8") = param2;
> > +	asm volatile (CALL_NOSPEC
> > +		      : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> > +		        "+c" (control), "+d" (param1)
> > +		      : "r" (__r8),
> 
> r8 is call-clobbered register, so you should use "+r" (__r8) to properly
> clobber it:

Ah, okay.
Josh Poimboeuf April 14, 2025, 10:36 p.m. UTC | #2
On Mon, Apr 14, 2025 at 01:11:43PM +0200, Peter Zijlstra wrote:
> Since there is only a single fastop() function, convert the FASTOP
> stuff from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return
> thunks and all that jazz.
> 
> Specifically FASTOPs rely on the return thunk to preserve EFLAGS,
> which not all of them can trivially do (call depth tracing suffers
> here).
> 
> Objtool strenuously complains about things, therefore fix up the
> various problems:
> 
>  - indirect call without a .rodata, fails to determine JUMP_TABLE,
>    add an annotation for this.
>  - fastop functions fall through, create an exception for this case
>  - unreachable instruction after fastop_return, save/restore

I think this breaks unwinding.  Each of the individual fastops inherits
fastop()'s stack but the ORC doesn't reflect that.

Should they just be moved to a proper .S file?
Peter Zijlstra April 15, 2025, 7:44 a.m. UTC | #3
On Mon, Apr 14, 2025 at 03:36:50PM -0700, Josh Poimboeuf wrote:
> On Mon, Apr 14, 2025 at 01:11:43PM +0200, Peter Zijlstra wrote:
> > Since there is only a single fastop() function, convert the FASTOP
> > stuff from CALL_NOSPEC+RET to JMP_NOSPEC+JMP, avoiding the return
> > thunks and all that jazz.
> > 
> > Specifically FASTOPs rely on the return thunk to preserve EFLAGS,
> > which not all of them can trivially do (call depth tracing suffers
> > here).
> > 
> > Objtool strenuously complains about things, therefore fix up the
> > various problems:
> > 
> >  - indirect call without a .rodata, fails to determine JUMP_TABLE,
> >    add an annotation for this.
> >  - fastop functions fall through, create an exception for this case
> >  - unreachable instruction after fastop_return, save/restore
> 
> I think this breaks unwinding.  Each of the individual fastops inherits
> fastop()'s stack but the ORC doesn't reflect that.

I'm not sure I understand. There is only the one location, and we
simply save/restore the state around the one 'call'.