
[v6] arm64: fpsimd: improve stacking logic in non-interruptible context

Message ID 1481632529-24218-1-git-send-email-ard.biesheuvel@linaro.org
State New

Commit Message

Ard Biesheuvel Dec. 13, 2016, 12:35 p.m. UTC
Currently, we allow kernel mode NEON in softirq or hardirq context by
stacking and unstacking a slice of the NEON register file for each call
to kernel_neon_begin() and kernel_neon_end(), respectively.

Given that
a) a CPU typically spends most of its time in userland, during which time
   no kernel mode NEON in process context is in progress,
b) a CPU spends most of its time in the kernel doing other things than
   kernel mode NEON when it gets interrupted to perform kernel mode NEON
   in softirq context

the stacking and subsequent unstacking is only necessary if we are
interrupting a thread while it is performing kernel mode NEON in process
context, which means that in all other cases, we can simply preserve the
userland FPSIMD state once, and only restore it upon return to userland,
even if we are being invoked from softirq or hardirq context.

So instead of checking whether we are running in interrupt context, keep
track of the level of nested kernel mode NEON calls in progress, and only
perform the eager stack/unstack if the level exceeds 1.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

---
v6:
- use a spinlock instead of disabling interrupts

v5:
- perform the test-and-set and the fpsimd_save_state with interrupts disabled,
  to prevent nested kernel_neon_begin()/_end() pairs from clobbering the state
  while it is being preserved

v4:
- use this_cpu_inc/dec, which give sufficient guarantees regarding
  concurrency, but do not imply SMP barriers, which are not needed here

v3:
- avoid corruption by concurrent invocations of kernel_neon_begin()/_end()

v2:
- BUG() on unexpected values of the nesting level
- relax the BUG() on num_regs>32 to a WARN, given that nothing actually
  breaks in that case

 arch/arm64/include/asm/fpsimd.h |  3 +
 arch/arm64/kernel/fpsimd.c      | 77 ++++++++++++++------
 2 files changed, 58 insertions(+), 22 deletions(-)

-- 
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Comments

Dave Martin Dec. 13, 2016, 1:49 p.m. UTC | #1
On Tue, Dec 13, 2016 at 12:35:29PM +0000, Ard Biesheuvel wrote:
> Currently, we allow kernel mode NEON in softirq or hardirq context by
> stacking and unstacking a slice of the NEON register file for each call
> to kernel_neon_begin() and kernel_neon_end(), respectively.
> 
> Given that
> a) a CPU typically spends most of its time in userland, during which time
>    no kernel mode NEON in process context is in progress,
> b) a CPU spends most of its time in the kernel doing other things than
>    kernel mode NEON when it gets interrupted to perform kernel mode NEON
>    in softirq context
> 
> the stacking and subsequent unstacking is only necessary if we are
> interrupting a thread while it is performing kernel mode NEON in process
> context, which means that in all other cases, we can simply preserve the
> userland FPSIMD state once, and only restore it upon return to userland,
> even if we are being invoked from softirq or hardirq context.
> 
> So instead of checking whether we are running in interrupt context, keep
> track of the level of nested kernel mode NEON calls in progress, and only
> perform the eager stack/unstack if the level exceeds 1.


Hang on ... we could be in the middle of fpsimd_save_state() for other
reasons, say on the context switch path.  Wouldn't we need to take the
lock for those too?

Also, a spinlock can't protect a critical section from code running on
the same CPU... wouldn't it just deadlock in that case?

I still tend to prefer v4 -- there we do a redundant double-save only if
one kernel_neon_begin() interrupts the actual task context save.

Cheers
---Dave

Ard Biesheuvel Dec. 13, 2016, 2:17 p.m. UTC | #2
On 13 December 2016 at 13:49, Dave Martin <Dave.Martin@arm.com> wrote:
> On Tue, Dec 13, 2016 at 12:35:29PM +0000, Ard Biesheuvel wrote:
>> [...]
>
> Hang on ... we could be in the middle of fpsimd_save_state() for other
> reasons, say on the context switch path.  Wouldn't we need to take the
> lock for those too?
>


Yes. Saving the state should never be interrupted, and now that I
think of it, this is an issue for your original patch as well: even if
you always preserve the state first in kernel_neon_begin(), any
occurrence of fpsimd_save_state() could be interrupted by a
kernel_neon_begin()/_end() pair, after which the SVE state is nuked
anyway (regardless of whether a patch like this one is applied). So I
think we do need the lock in all cases where the FP/SIMD state is
copied to memory and the flag set.

> Also, a spinlock can't protect a critical section from code running on
> the same CPU... wouldn't it just deadlock in that case?
>


That is why I use spin_trylock(). But I realise now that the patch is
incorrect: the nesting level needs to be incremented first, or the
interrupter will still clobber the context while it is being
preserved.

> I still tend to prefer v4 -- there we do a redundant double-save only if
> one kernel_neon_begin() interrupts the actual task context save.
>


I think we're not at the bottom of this rabbit hole yet :-)

Dave Martin Dec. 13, 2016, 3:44 p.m. UTC | #3
On Tue, Dec 13, 2016 at 02:17:14PM +0000, Ard Biesheuvel wrote:
> On 13 December 2016 at 13:49, Dave Martin <Dave.Martin@arm.com> wrote:
> > On Tue, Dec 13, 2016 at 12:35:29PM +0000, Ard Biesheuvel wrote:
> >> [...]
> >
> > Hang on ... we could be in the middle of fpsimd_save_state() for other
> > reasons, say on the context switch path.  Wouldn't we need to take the
> > lock for those too?
> >
> 
> Yes. Saving the state should never be interrupted, and now that I
> think of it, this is an issue for your original patch as well: even if
> you always preserve the state first in kernel_neon_begin(), any
> occurrence of fpsimd_save_state() could be interrupted by a
> kernel_neon_begin()/_end() pair, after which the SVE state is nuked
> anyway (regardless of whether a patch like this one is applied) So I
> think we do need the lock in all cases where the FP/SIMD is copied to
> memory and the flag set.


Yes, that's true.  I didn't really trust my hack that much yet (for
development I was typically just putting a BUG_ON() on
kernel_neon_begin() to make sure it's not being called...)

> > Also, a spinlock can't protect a critical section from code running on
> > the same CPU... wouldn't it just deadlock in that case?
> >
> 
> That is why I use spin_trylock(). But I realise now that the patch is
> incorrect: the nesting level needs to be incremented first, or the


Ah, I missed the significance of that.

> interrupter will still clobber the context while it is being
> preserved.


Hmmm, yes.
 
> > I still tend to prefer v4 -- there we do a redundant double-save only if
> > one kernel_neon_begin() interrupts the actual task context save.
> >
> 
> I think we're not at the bottom of this rabbit hole yet:-)


Seemingly not...

I think it's better to ignore SVE to begin with and focus on what
changes are desirable for the NEON case by itself.  That's likely to
give cleaner code which can then be extended for SVE as appropriate.


For SVE, the basic change is that if task state saving isn't finished,
then we need to save (partial) SVE state in a nested
kernel_neon_begin(), not just NEON state.  I'm hoping we can detect
this case as TIF_SVE && !TIF_FOREIGN_FPSTATE, if we defer setting
TIF_FOREIGN_FPSTATE until after the task state save is complete ...
unless there's some other reason this won't work.

Cheers
---Dave

Catalin Marinas Dec. 13, 2016, 5:34 p.m. UTC | #4
On Tue, Dec 13, 2016 at 12:35:29PM +0000, Ard Biesheuvel wrote:
> [...]
> v6:
> - use a spinlock instead of disabling interrupts

My concern with the spinlock or disabling interrupts is the latency if
we ever get some insanely long SVE vectors.

-- 
Catalin

Ard Biesheuvel Dec. 13, 2016, 7:17 p.m. UTC | #5
On 13 December 2016 at 17:34, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Tue, Dec 13, 2016 at 12:35:29PM +0000, Ard Biesheuvel wrote:
>> [...]
>> v6:
>> - use a spinlock instead of disabling interrupts
>
> My concern with the spinlock or disabling interrupts is the latency if
> we ever get some insanely long SVE vectors.
>


Thinking about this a bit more, and trying a couple of things in code,
I think this looks like it could be a nasty problem.

The primary issue is that *any* code that handles the FP/SIMD state,
i.e., preserve it, restore it, etc, could be interrupted, and if
kernel_neon_begin()/_end() was used during that time, the top SVE end
of the registers is just blown away if we don't make the nested
save/restore SVE aware.

For instance, looking at

void fpsimd_restore_current_state(void)
{
  preempt_disable();
  if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
    struct fpsimd_state *st = &current->thread.fpsimd_state;

    fpsimd_load_state(st);
    this_cpu_write(fpsimd_last_state, st);
    st->cpu = smp_processor_id();
  }
  preempt_enable();
}

if fpsimd_load_state() is interrupted and the NEON is used, the
registers that were restored before the interruption will have
incorrect values if SVE is being used by userland.

On the preserve side, we can deal with this by preserving into a temp
buffer, and only store the value that was recorded when the flag was
set, i.e.,

static void safe_preserve_fpsimd_state(void)
{
  union __fpsimd_percpu_state *s = this_cpu_ptr(nested_fpsimdstate);

  if (in_irq()) {
  /* No interruptions possible, so just proceed */
    if (!test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
      fpsimd_save_state(&current->thread.fpsimd_state);
    } else {
      struct fpsimd_state *fp;

      fp = in_interrupt() ? &s[0].full : &s[1].full;

      fpsimd_save_state(fp);
      if (!test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
        current->thread.fpsimd_state = *fp;
  }
}

which is already cumbersome if the full FP/SIMD state grows to 64 KB

However, on the restore side, I fail to see how we can tolerate
interruptions in a similar way.

-- 
Ard.

Ard Biesheuvel Dec. 13, 2016, 7:19 p.m. UTC | #6
On 13 December 2016 at 19:17, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 13 December 2016 at 17:34, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> On Tue, Dec 13, 2016 at 12:35:29PM +0000, Ard Biesheuvel wrote:
>>> [...]
>>
>> My concern with the spinlock or disabling interrupts is the latency if
>> we ever get some insanely long SVE vectors.
>>
>
> [...]
>
> static void safe_preserve_fpsimd_state(void)
> {
>   union __fpsimd_percpu_state *s = this_cpu_ptr(nested_fpsimdstate);
>
>   if (in_irq()) {
>   /* No interruptions possible, so just proceed */
>     if (!test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
>       fpsimd_save_state(&current->thread.fpsimd_state);

The indentation is slightly off here: the 'else' treats the !in_irq() case

>     } else {
>       struct fpsimd_state *fp;
>
>       fp = in_interrupt() ? &s[0].full : &s[1].full;
>
>       fpsimd_save_state(fp);
>       if (!test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
>         current->thread.fpsimd_state = *fp;
>   }
> }
>
> which is already cumbersome if the full FP/SIMD state grows to 64 KB
>
> However, on the restore side, I fail to see how we can tolerate
> interruptions in a similar way.
>
> --
> Ard.



Patch

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 50f559f574fe..ee09fe1c37b6 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -17,6 +17,7 @@ 
 #define __ASM_FP_H
 
 #include <asm/ptrace.h>
+#include <linux/spinlock.h>
 
 #ifndef __ASSEMBLY__
 
@@ -37,6 +38,8 @@  struct fpsimd_state {
 			u32 fpcr;
 		};
 	};
+	/* lock protecting the preserved state while it is being saved */
+	spinlock_t lock;
 	/* the id of the last cpu to have restored this state */
 	unsigned int cpu;
 };
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 394c61db5566..886eea2d4084 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -160,6 +160,7 @@  void fpsimd_flush_thread(void)
 	memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
 	fpsimd_flush_task_state(current);
 	set_thread_flag(TIF_FOREIGN_FPSTATE);
+	spin_lock_init(&current->thread.fpsimd_state.lock);
 }
 
 /*
@@ -220,45 +221,77 @@  void fpsimd_flush_task_state(struct task_struct *t)
 
 #ifdef CONFIG_KERNEL_MODE_NEON
 
-static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
-static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+/*
+ * Although unlikely, it is possible for three kernel mode NEON contexts to
+ * be live at the same time: process context, softirq context and hardirq
+ * context. So while the userland context is stashed in the thread's fpsimd
+ * state structure, we need two additional levels of storage.
+ */
+static DEFINE_PER_CPU(struct fpsimd_partial_state, nested_fpsimdstate[2]);
+static DEFINE_PER_CPU(int, kernel_neon_nesting_level);
 
 /*
  * Kernel-side NEON support functions
  */
 void kernel_neon_begin_partial(u32 num_regs)
 {
-	if (in_interrupt()) {
-		struct fpsimd_partial_state *s = this_cpu_ptr(
-			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
+	struct fpsimd_partial_state *s;
+	int level;
 
-		BUG_ON(num_regs > 32);
-		fpsimd_save_partial_state(s, roundup(num_regs, 2));
-	} else {
+	preempt_disable();
+
+	/*
+	 * Save the userland FPSIMD state if we have one and if we
+	 * haven't done so already. Clear fpsimd_last_state to indicate
+	 * that there is no longer userland FPSIMD state in the
+	 * registers.
+	 */
+	if (current->mm && !test_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		/*
-		 * Save the userland FPSIMD state if we have one and if we
-		 * haven't done so already. Clear fpsimd_last_state to indicate
-		 * that there is no longer userland FPSIMD state in the
+		 * Whoever test-and-sets the TIF_FOREIGN_FPSTATE flag should
+		 * preserve the userland FP/SIMD state without interruption by
+		 * nested kernel_neon_begin()/_end() calls.
+		 * The reason is that the FP/SIMD register file is shared with
+		 * SVE on hardware that supports it, and nested kernel mode NEON
+		 * calls do not restore the upper part of the shared SVE/SIMD
 		 * registers.
 		 */
-		preempt_disable();
-		if (current->mm &&
-		    !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
-			fpsimd_save_state(&current->thread.fpsimd_state);
-		this_cpu_write(fpsimd_last_state, NULL);
+		if (spin_trylock(&current->thread.fpsimd_state.lock)) {
+			if (!test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
+				fpsimd_save_state(&current->thread.fpsimd_state);
+			spin_unlock(&current->thread.fpsimd_state.lock);
+		}
+	}
+	this_cpu_write(fpsimd_last_state, NULL);
+
+	level = this_cpu_inc_return(kernel_neon_nesting_level);
+	BUG_ON(level > 3);
+
+	if (level > 1) {
+		s = this_cpu_ptr(nested_fpsimdstate);
+
+		WARN_ON_ONCE(num_regs > 32);
+		num_regs = min(roundup(num_regs, 2), 32U);
+
+		fpsimd_save_partial_state(&s[level - 2], num_regs);
 	}
 }
 EXPORT_SYMBOL(kernel_neon_begin_partial);
 
 void kernel_neon_end(void)
 {
-	if (in_interrupt()) {
-		struct fpsimd_partial_state *s = this_cpu_ptr(
-			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
-		fpsimd_load_partial_state(s);
-	} else {
-		preempt_enable();
+	struct fpsimd_partial_state *s;
+	int level;
+
+	level = this_cpu_read(kernel_neon_nesting_level);
+	BUG_ON(level < 1);
+
+	if (level > 1) {
+		s = this_cpu_ptr(nested_fpsimdstate);
+		fpsimd_load_partial_state(&s[level - 2]);
 	}
+	this_cpu_dec(kernel_neon_nesting_level);
+	preempt_enable();
 }
 EXPORT_SYMBOL(kernel_neon_end);