Message ID | 1442316997-32282-1-git-send-email-will.deacon@arm.com |
---|---|
State | New |
Headers | show |
Hi Will, On 15 September 2015 at 13:36, Will Deacon <will.deacon@arm.com> wrote: > From: Dave P Martin <Dave.Martin@arm.com> > > The arm64 context switch implementation uses a flag > TIF_FOREIGN_FPSTATE to track whether the hardware FPSIMD regs are > out of sync with the logical state of current's registers. > > During sigreturn, the registers and task_struct are temporarily out > of sync, between writing the task_struct and loading its contents > back into the FPSIMD registers -- however, TIF_FOREIGN_FPSTATE is > not set. This can cause the context switch code to discard some or > all of the restored FPSIMD state if preemption occurs during the > critical region of rt_sigreturn. > > This patch sets TIF_FOREIGN_FPSTATE before transferring the > sigframe's saved registers back to the task_struct, so that the > task_struct data will take precedence over the hardware registers > if a context switch occurs before everything is back in sync. > > Signed-off-by: Dave Martin <Dave.Martin@arm.com> > [will: removed preempt_{enable,disable} calls, added compat version] > Signed-off-by: Will Deacon <will.deacon@arm.com> > --- > arch/arm64/kernel/signal.c | 3 +++ > arch/arm64/kernel/signal32.c | 2 ++ > 2 files changed, 5 insertions(+) > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c > index e18c48cb6db1..6d50d839b6e9 100644 > --- a/arch/arm64/kernel/signal.c > +++ b/arch/arm64/kernel/signal.c > @@ -79,6 +79,9 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx) > if (magic != FPSIMD_MAGIC || size != sizeof(struct fpsimd_context)) > return -EINVAL; > > + /* Ensure we don't reload stale data from the hardware registers */ > + set_ti_thread_flag(current_thread_info(), TIF_FOREIGN_FPSTATE); > + > /* copy the FP and status/control registers */ > err = __copy_from_user(fpsimd.vregs, ctx->vregs, > sizeof(fpsimd.vregs)); I am not following something here. If the task gets preempted before the call to fpsimd_update_current_state(), it will proceed to overwrite the registers with whatever is on the stack in fpsimd after being scheduled back in, and everything should work fine, right?. So the problem must lie /after/ (or during) the call to fpsimd_update_current_state(), which does the following void fpsimd_update_current_state(struct fpsimd_state *state) { preempt_disable(); fpsimd_load_state(state); if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) { struct fpsimd_state *st = ¤t->thread.fpsimd_state; this_cpu_write(fpsimd_last_state, st); st->cpu = smp_processor_id(); } preempt_enable(); } i.e., it copies the FP/SIMD state into the registers, and then marks the h/w state of this cpu as the most recent for current. It is not clear to me at which point the preemption is hitting when it causes this problem. Other than that, I think it may be more appropriate to call fpsimd_flush_task_state() to invalidate any in-register copies of the task's FP/SIMD state rather than setting the TIF flag directly (which is normally only used as a shorthand to indicate that this cpu's fpsimd_last_state and this tasks fpsimd_state::cpu are out of sync, and set automatically by the context switch code)
On Tue, Sep 15, 2015 at 01:11:10PM +0100, Ard Biesheuvel wrote: > Hi Will, Hi Ard, Thanks for having a look. > On 15 September 2015 at 13:36, Will Deacon <will.deacon@arm.com> wrote: > > From: Dave P Martin <Dave.Martin@arm.com> > > > > The arm64 context switch implementation uses a flag > > TIF_FOREIGN_FPSTATE to track whether the hardware FPSIMD regs are > > out of sync with the logical state of current's registers. > > > > During sigreturn, the registers and task_struct are temporarily out > > of sync, between writing the task_struct and loading its contents > > back into the FPSIMD registers -- however, TIF_FOREIGN_FPSTATE is > > not set. This can cause the context switch code to discard some or > > all of the restored FPSIMD state if preemption occurs during the > > critical region of rt_sigreturn. > > > > This patch sets TIF_FOREIGN_FPSTATE before transferring the > > sigframe's saved registers back to the task_struct, so that the > > task_struct data will take precedence over the hardware registers > > if a context switch occurs before everything is back in sync. > > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com> > > [will: removed preempt_{enable,disable} calls, added compat version] > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > --- > > arch/arm64/kernel/signal.c | 3 +++ > > arch/arm64/kernel/signal32.c | 2 ++ > > 2 files changed, 5 insertions(+) > > > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c > > index e18c48cb6db1..6d50d839b6e9 100644 > > --- a/arch/arm64/kernel/signal.c > > +++ b/arch/arm64/kernel/signal.c > > @@ -79,6 +79,9 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx) > > if (magic != FPSIMD_MAGIC || size != sizeof(struct fpsimd_context)) > > return -EINVAL; > > > > + /* Ensure we don't reload stale data from the hardware registers */ > > + set_ti_thread_flag(current_thread_info(), TIF_FOREIGN_FPSTATE); > > + > > /* copy the FP and status/control registers */ > > err = __copy_from_user(fpsimd.vregs, ctx->vregs, > > sizeof(fpsimd.vregs)); > > I am not following something here. If the task gets preempted before > the call to fpsimd_update_current_state(), it will proceed to > overwrite the registers with whatever is on the stack in fpsimd after > being scheduled back in, and everything should work fine, right?. You're completely right, sorry for wasting your time. Dave and I have been working on the fpsimd code recently and have another patch that restores the state directly into the current thread to avoid the stack entirely. *That* is what caused this problem, and I incorrectly identified this as a fix for mainline (it's extremely similar to 2af276dfb172 ("ARM: 7306/1: vfp: flush thread hwstate before restoring context from sigframe"), which continues to haunt me). Anyway, that also explains why we hadn't seen any problems in our previous testing! The second patch (compat big-endian fix) still stands, however. Will
On Tue, Sep 15, 2015 at 03:39:53PM +0100, Will Deacon wrote: > On Tue, Sep 15, 2015 at 01:11:10PM +0100, Ard Biesheuvel wrote: > > Hi Will, > > Hi Ard, > > Thanks for having a look. > > > On 15 September 2015 at 13:36, Will Deacon <will.deacon@arm.com> wrote: > > > From: Dave P Martin <Dave.Martin@arm.com> > > > > > > The arm64 context switch implementation uses a flag > > > TIF_FOREIGN_FPSTATE to track whether the hardware FPSIMD regs are > > > out of sync with the logical state of current's registers. > > > > > > During sigreturn, the registers and task_struct are temporarily out > > > of sync, between writing the task_struct and loading its contents > > > back into the FPSIMD registers -- however, TIF_FOREIGN_FPSTATE is > > > not set. This can cause the context switch code to discard some or > > > all of the restored FPSIMD state if preemption occurs during the > > > critical region of rt_sigreturn. > > > > > > This patch sets TIF_FOREIGN_FPSTATE before transferring the > > > sigframe's saved registers back to the task_struct, so that the > > > task_struct data will take precedence over the hardware registers > > > if a context switch occurs before everything is back in sync. > > > > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com> > > > [will: removed preempt_{enable,disable} calls, added compat version] > > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > > --- > > > arch/arm64/kernel/signal.c | 3 +++ > > > arch/arm64/kernel/signal32.c | 2 ++ > > > 2 files changed, 5 insertions(+) > > > > > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c > > > index e18c48cb6db1..6d50d839b6e9 100644 > > > --- a/arch/arm64/kernel/signal.c > > > +++ b/arch/arm64/kernel/signal.c > > > @@ -79,6 +79,9 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx) > > > if (magic != FPSIMD_MAGIC || size != sizeof(struct fpsimd_context)) > > > return -EINVAL; > > > > > > + /* Ensure we don't reload stale data from the hardware registers */ > > > + set_ti_thread_flag(current_thread_info(), TIF_FOREIGN_FPSTATE); > > > + > > > /* copy the FP and status/control registers */ > > > err = __copy_from_user(fpsimd.vregs, ctx->vregs, > > > sizeof(fpsimd.vregs)); > > > > I am not following something here. If the task gets preempted before > > the call to fpsimd_update_current_state(), it will proceed to > > overwrite the registers with whatever is on the stack in fpsimd after > > being scheduled back in, and everything should work fine, right?. > > You're completely right, sorry for wasting your time. Dave and I have > been working on the fpsimd code recently and have another patch that > restores the state directly into the current thread to avoid the stack > entirely. *That* is what caused this problem, and I incorrectly > identified this as a fix for mainline (it's extremely similar to > 2af276dfb172 ("ARM: 7306/1: vfp: flush thread hwstate before restoring > context from sigframe"), which continues to haunt me). Urg, looks like I confused myself here. Ard's assessment looks correct. Not staging the data via the stack seemed a harmless change which I thought was an improvement, since at half a K the amount of data is non- trivial. I'll go back and make some measurements to see whether getting rid of the extra copy is really worth it. I think reading straight into the task_struct with __get_user() can work, but it looks like I'll need to fiddle with the preemption region, or otherwise handle the ensuing race... > Anyway, that also explains why we hadn't seen any problems in our > previous testing! The second patch (compat big-endian fix) still stands, > however. I misidentified the problem as being independent of my hacks -- apologies for the confusion there. Cheers ---Dave
On 21 September 2015 at 08:37, Dave P Martin <Dave.Martin@arm.com> wrote: > On Tue, Sep 15, 2015 at 03:39:53PM +0100, Will Deacon wrote: >> On Tue, Sep 15, 2015 at 01:11:10PM +0100, Ard Biesheuvel wrote: >> > Hi Will, >> >> Hi Ard, >> >> Thanks for having a look. >> >> > On 15 September 2015 at 13:36, Will Deacon <will.deacon@arm.com> wrote: >> > > From: Dave P Martin <Dave.Martin@arm.com> >> > > >> > > The arm64 context switch implementation uses a flag >> > > TIF_FOREIGN_FPSTATE to track whether the hardware FPSIMD regs are >> > > out of sync with the logical state of current's registers. >> > > >> > > During sigreturn, the registers and task_struct are temporarily out >> > > of sync, between writing the task_struct and loading its contents >> > > back into the FPSIMD registers -- however, TIF_FOREIGN_FPSTATE is >> > > not set. This can cause the context switch code to discard some or >> > > all of the restored FPSIMD state if preemption occurs during the >> > > critical region of rt_sigreturn. >> > > >> > > This patch sets TIF_FOREIGN_FPSTATE before transferring the >> > > sigframe's saved registers back to the task_struct, so that the >> > > task_struct data will take precedence over the hardware registers >> > > if a context switch occurs before everything is back in sync. >> > > >> > > Signed-off-by: Dave Martin <Dave.Martin@arm.com> >> > > [will: removed preempt_{enable,disable} calls, added compat version] >> > > Signed-off-by: Will Deacon <will.deacon@arm.com> >> > > --- >> > > arch/arm64/kernel/signal.c | 3 +++ >> > > arch/arm64/kernel/signal32.c | 2 ++ >> > > 2 files changed, 5 insertions(+) >> > > >> > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c >> > > index e18c48cb6db1..6d50d839b6e9 100644 >> > > --- a/arch/arm64/kernel/signal.c >> > > +++ b/arch/arm64/kernel/signal.c >> > > @@ -79,6 +79,9 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx) >> > > if (magic != FPSIMD_MAGIC || size != sizeof(struct fpsimd_context)) >> > > return -EINVAL; >> > > >> > > + /* Ensure we don't reload stale data from the hardware registers */ >> > > + set_ti_thread_flag(current_thread_info(), TIF_FOREIGN_FPSTATE); >> > > + >> > > /* copy the FP and status/control registers */ >> > > err = __copy_from_user(fpsimd.vregs, ctx->vregs, >> > > sizeof(fpsimd.vregs)); >> > >> > I am not following something here. If the task gets preempted before >> > the call to fpsimd_update_current_state(), it will proceed to >> > overwrite the registers with whatever is on the stack in fpsimd after >> > being scheduled back in, and everything should work fine, right?. >> >> You're completely right, sorry for wasting your time. Dave and I have >> been working on the fpsimd code recently and have another patch that >> restores the state directly into the current thread to avoid the stack >> entirely. *That* is what caused this problem, and I incorrectly >> identified this as a fix for mainline (it's extremely similar to >> 2af276dfb172 ("ARM: 7306/1: vfp: flush thread hwstate before restoring >> context from sigframe"), which continues to haunt me). > > Urg, looks like I confused myself here. Ard's assessment looks correct. > > Not staging the data via the stack seemed a harmless change which I > thought was an improvement, since at half a K the amount of data is non- > trivial. > > I'll go back and make some measurements to see whether getting rid > of the extra copy is really worth it. I think reading straight into > the task_struct with __get_user() can work, but it looks like I'll > need to fiddle with the preemption region, or otherwise handle the > ensuing race... > Indeed. I think you will need something like fpsimd_load_state_user which runs with preemption disabled and restarts if it is interrupted. >> Anyway, that also explains why we hadn't seen any problems in our >> previous testing! The second patch (compat big-endian fix) still stands, >> however. > > I misidentified the problem as being independent of my hacks -- > apologies for the confusion there. > No worries. Ard.
On Tue, Sep 22, 2015 at 06:20:29PM -0700, Ard Biesheuvel wrote: > On 21 September 2015 at 08:37, Dave P Martin <Dave.Martin@arm.com> wrote: > > On Tue, Sep 15, 2015 at 03:39:53PM +0100, Will Deacon wrote: > >> On Tue, Sep 15, 2015 at 01:11:10PM +0100, Ard Biesheuvel wrote: [...] > >> > On 15 September 2015 at 13:36, Will Deacon <will.deacon@arm.com> wrote: > >> > > From: Dave P Martin <Dave.Martin@arm.com> > >> > > > >> > > The arm64 context switch implementation uses a flag > >> > > TIF_FOREIGN_FPSTATE to track whether the hardware FPSIMD regs are > >> > > out of sync with the logical state of current's registers. > >> > > > >> > > During sigreturn, the registers and task_struct are temporarily out > >> > > of sync, between writing the task_struct and loading its contents > >> > > back into the FPSIMD registers -- however, TIF_FOREIGN_FPSTATE is > >> > > not set. This can cause the context switch code to discard some or > >> > > all of the restored FPSIMD state if preemption occurs during the > >> > > critical region of rt_sigreturn. [...] > >> > I am not following something here. If the task gets preempted before > >> > the call to fpsimd_update_current_state(), it will proceed to > >> > overwrite the registers with whatever is on the stack in fpsimd after > >> > being scheduled back in, and everything should work fine, right?. [...] > > Urg, looks like I confused myself here. Ard's assessment looks correct. > > > > Not staging the data via the stack seemed a harmless change which I > > thought was an improvement, since at half a K the amount of data is non- > > trivial. > > > > I'll go back and make some measurements to see whether getting rid > > of the extra copy is really worth it. I think reading straight into > > the task_struct with __get_user() can work, but it looks like I'll > > need to fiddle with the preemption region, or otherwise handle the > > ensuing race... > > > > Indeed. I think you will need something like fpsimd_load_state_user > which runs with preemption disabled and restarts if it is interrupted. Yep -- I'll take another look at it, and scratch my head some more. Cheers ---Dave
On 23 September 2015 at 04:05, Dave Martin <Dave.Martin@arm.com> wrote: > On Tue, Sep 22, 2015 at 06:20:29PM -0700, Ard Biesheuvel wrote: >> On 21 September 2015 at 08:37, Dave P Martin <Dave.Martin@arm.com> wrote: >> > On Tue, Sep 15, 2015 at 03:39:53PM +0100, Will Deacon wrote: >> >> On Tue, Sep 15, 2015 at 01:11:10PM +0100, Ard Biesheuvel wrote: > > [...] > >> >> > On 15 September 2015 at 13:36, Will Deacon <will.deacon@arm.com> wrote: >> >> > > From: Dave P Martin <Dave.Martin@arm.com> >> >> > > >> >> > > The arm64 context switch implementation uses a flag >> >> > > TIF_FOREIGN_FPSTATE to track whether the hardware FPSIMD regs are >> >> > > out of sync with the logical state of current's registers. >> >> > > >> >> > > During sigreturn, the registers and task_struct are temporarily out >> >> > > of sync, between writing the task_struct and loading its contents >> >> > > back into the FPSIMD registers -- however, TIF_FOREIGN_FPSTATE is >> >> > > not set. This can cause the context switch code to discard some or >> >> > > all of the restored FPSIMD state if preemption occurs during the >> >> > > critical region of rt_sigreturn. > > [...] > >> >> > I am not following something here. If the task gets preempted before >> >> > the call to fpsimd_update_current_state(), it will proceed to >> >> > overwrite the registers with whatever is on the stack in fpsimd after >> >> > being scheduled back in, and everything should work fine, right?. > > [...] > >> > Urg, looks like I confused myself here. Ard's assessment looks correct. >> > >> > Not staging the data via the stack seemed a harmless change which I >> > thought was an improvement, since at half a K the amount of data is non- >> > trivial. >> > >> > I'll go back and make some measurements to see whether getting rid >> > of the extra copy is really worth it. I think reading straight into >> > the task_struct with __get_user() can work, but it looks like I'll >> > need to fiddle with the preemption region, or otherwise handle the >> > ensuing race... >> > >> >> Indeed. I think you will need something like fpsimd_load_state_user >> which runs with preemption disabled and restarts if it is interrupted. > > Yep -- I'll take another look at it, and scratch my head some more. > Actually, I meant 'preemption enabled' (given the __user bit)
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c index e18c48cb6db1..6d50d839b6e9 100644 --- a/arch/arm64/kernel/signal.c +++ b/arch/arm64/kernel/signal.c @@ -79,6 +79,9 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx) if (magic != FPSIMD_MAGIC || size != sizeof(struct fpsimd_context)) return -EINVAL; + /* Ensure we don't reload stale data from the hardware registers */ + set_ti_thread_flag(current_thread_info(), TIF_FOREIGN_FPSTATE); + /* copy the FP and status/control registers */ err = __copy_from_user(fpsimd.vregs, ctx->vregs, sizeof(fpsimd.vregs)); diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c index 948f0ad2de23..ae46ffad5aea 100644 --- a/arch/arm64/kernel/signal32.c +++ b/arch/arm64/kernel/signal32.c @@ -273,6 +273,8 @@ static int compat_restore_vfp_context(struct compat_vfp_sigframe __user *frame) if (magic != VFP_MAGIC || size != VFP_STORAGE_SIZE) return -EINVAL; + set_ti_thread_flag(current_thread_info(), TIF_FOREIGN_FPSTATE); + /* * Copy the FP registers into the start of the fpsimd_state. * FIXME: Won't work if big endian.