Message ID | 20180522174254.27551-14-julien.grall@arm.com |
---|---|
State | Superseded |
Headers | show |
Series | xen/arm: SSBD (aka Spectre-v4) mitigation (XSA-263) | expand |
On Tue, 22 May 2018, Julien Grall wrote: > Using current is fairly expensive, so save up into a variable. > > Signed-off-by: Julien Grall <julien.grall@arm.com> Good idea. I am curious to know actually how much this patch would save but I am not going to ask you run the tests. Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > --- > xen/arch/arm/traps.c | 14 ++++++++------ > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > index 020b0b8eef..b1546f6907 100644 > --- a/xen/arch/arm/traps.c > +++ b/xen/arch/arm/traps.c > @@ -2024,8 +2024,10 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) > { > if ( guest_mode(regs) ) > { > + struct vcpu *v = current; > + > /* If the guest has disabled the workaround, bring it back on. */ > - if ( needs_ssbd_flip(current) ) > + if ( needs_ssbd_flip(v) ) > arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL); > > /* > @@ -2034,8 +2036,8 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) > * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE > * (alias of HCR.VA) is cleared to 0." > */ > - if ( current->arch.hcr_el2 & HCR_VA ) > - current->arch.hcr_el2 = READ_SYSREG(HCR_EL2); > + if ( v->arch.hcr_el2 & HCR_VA ) > + v->arch.hcr_el2 = READ_SYSREG(HCR_EL2); > > #ifdef CONFIG_NEW_VGIC > /* > @@ -2045,11 +2047,11 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) > * TODO: Investigate whether this is necessary to do on every > * trap and how it can be optimised. > */ > - vtimer_update_irqs(current); > - vcpu_update_evtchn_irq(current); > + vtimer_update_irqs(v); > + vcpu_update_evtchn_irq(v); > #endif > > - vgic_sync_from_lrs(current); > + vgic_sync_from_lrs(v); > } > } > > -- > 2.11.0 >
Hi Stefano, On 24/05/18 00:47, Stefano Stabellini wrote: > On Tue, 22 May 2018, Julien Grall wrote: >> Using current is fairly expensive, so save up into a variable. >> >> Signed-off-by: Julien Grall <julien.grall@arm.com> > > Good idea. I am curious to know actually how much this patch would save > but I am not going to ask you run the tests. I haven't benchmark it but looked at the resulting assembly code. This reduces by about ~20% the number of instructions in the function. AFAIU, this is because of the way per-cpu access have been implemented. The per-cpu offset is stored in a system register (TPIDR_EL2), all the read to it cannot be optimized (access using volatile). So every direct use of "current" will require at least a system register access and then a load from memory. Cheers, > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > >> --- >> xen/arch/arm/traps.c | 14 ++++++++------ >> 1 file changed, 8 insertions(+), 6 deletions(-) >> >> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c >> index 020b0b8eef..b1546f6907 100644 >> --- a/xen/arch/arm/traps.c >> +++ b/xen/arch/arm/traps.c >> @@ -2024,8 +2024,10 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) >> { >> if ( guest_mode(regs) ) >> { >> + struct vcpu *v = current; >> + >> /* If the guest has disabled the workaround, bring it back on. */ >> - if ( needs_ssbd_flip(current) ) >> + if ( needs_ssbd_flip(v) ) >> arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL); >> >> /* >> @@ -2034,8 +2036,8 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) >> * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE >> * (alias of HCR.VA) is cleared to 0." >> */ >> - if ( current->arch.hcr_el2 & HCR_VA ) >> - current->arch.hcr_el2 = READ_SYSREG(HCR_EL2); >> + if ( v->arch.hcr_el2 & HCR_VA ) >> + v->arch.hcr_el2 = READ_SYSREG(HCR_EL2); >> >> #ifdef CONFIG_NEW_VGIC >> /* >> @@ -2045,11 +2047,11 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) >> * TODO: Investigate whether this is necessary to do on every >> * trap and how it can be optimised. >> */ >> - vtimer_update_irqs(current); >> - vcpu_update_evtchn_irq(current); >> + vtimer_update_irqs(v); >> + vcpu_update_evtchn_irq(v); >> #endif >> >> - vgic_sync_from_lrs(current); >> + vgic_sync_from_lrs(v); >> } >> } >> >> -- >> 2.11.0 >>
On Thu, 24 May 2018, Julien Grall wrote: > Hi Stefano, > > On 24/05/18 00:47, Stefano Stabellini wrote: > > On Tue, 22 May 2018, Julien Grall wrote: > > > Using current is fairly expensive, so save up into a variable. > > > > > > Signed-off-by: Julien Grall <julien.grall@arm.com> > > > > Good idea. I am curious to know actually how much this patch would save > > but I am not going to ask you run the tests. > > I haven't benchmark it but looked at the resulting assembly code. This reduces > by about ~20% the number of instructions in the function. > > AFAIU, this is because of the way per-cpu access have been implemented. The > per-cpu offset is stored in a system register (TPIDR_EL2), all the read to it > cannot be optimized (access using volatile). > > So every direct use of "current" will require at least a system register > access and then a load from memory. Very nice, thank you! > > > > > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> > > > > > --- > > > xen/arch/arm/traps.c | 14 ++++++++------ > > > 1 file changed, 8 insertions(+), 6 deletions(-) > > > > > > diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c > > > index 020b0b8eef..b1546f6907 100644 > > > --- a/xen/arch/arm/traps.c > > > +++ b/xen/arch/arm/traps.c > > > @@ -2024,8 +2024,10 @@ static void enter_hypervisor_head(struct > > > cpu_user_regs *regs) > > > { > > > if ( guest_mode(regs) ) > > > { > > > + struct vcpu *v = current; > > > + > > > /* If the guest has disabled the workaround, bring it back on. > > > */ > > > - if ( needs_ssbd_flip(current) ) > > > + if ( needs_ssbd_flip(v) ) > > > arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL); > > > /* > > > @@ -2034,8 +2036,8 @@ static void enter_hypervisor_head(struct > > > cpu_user_regs *regs) > > > * but the crucial bit is "On taking a vSError interrupt, > > > HCR_EL2.VSE > > > * (alias of HCR.VA) is cleared to 0." > > > */ > > > - if ( current->arch.hcr_el2 & HCR_VA ) > > > - current->arch.hcr_el2 = READ_SYSREG(HCR_EL2); > > > + if ( v->arch.hcr_el2 & HCR_VA ) > > > + v->arch.hcr_el2 = READ_SYSREG(HCR_EL2); > > > #ifdef CONFIG_NEW_VGIC > > > /* > > > @@ -2045,11 +2047,11 @@ static void enter_hypervisor_head(struct > > > cpu_user_regs *regs) > > > * TODO: Investigate whether this is necessary to do on every > > > * trap and how it can be optimised. > > > */ > > > - vtimer_update_irqs(current); > > > - vcpu_update_evtchn_irq(current); > > > + vtimer_update_irqs(v); > > > + vcpu_update_evtchn_irq(v); > > > #endif > > > - vgic_sync_from_lrs(current); > > > + vgic_sync_from_lrs(v); > > > } > > > } > > > > > > -- > > > 2.11.0 > > > > > -- > Julien Grall >
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c index 020b0b8eef..b1546f6907 100644 --- a/xen/arch/arm/traps.c +++ b/xen/arch/arm/traps.c @@ -2024,8 +2024,10 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) { if ( guest_mode(regs) ) { + struct vcpu *v = current; + /* If the guest has disabled the workaround, bring it back on. */ - if ( needs_ssbd_flip(current) ) + if ( needs_ssbd_flip(v) ) arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL); /* @@ -2034,8 +2036,8 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE * (alias of HCR.VA) is cleared to 0." */ - if ( current->arch.hcr_el2 & HCR_VA ) - current->arch.hcr_el2 = READ_SYSREG(HCR_EL2); + if ( v->arch.hcr_el2 & HCR_VA ) + v->arch.hcr_el2 = READ_SYSREG(HCR_EL2); #ifdef CONFIG_NEW_VGIC /* @@ -2045,11 +2047,11 @@ static void enter_hypervisor_head(struct cpu_user_regs *regs) * TODO: Investigate whether this is necessary to do on every * trap and how it can be optimised. */ - vtimer_update_irqs(current); - vcpu_update_evtchn_irq(current); + vtimer_update_irqs(v); + vcpu_update_evtchn_irq(v); #endif - vgic_sync_from_lrs(current); + vgic_sync_from_lrs(v); } }
Using current is fairly expensive, so save up into a variable. Signed-off-by: Julien Grall <julien.grall@arm.com> --- xen/arch/arm/traps.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)