Message ID | 20220325022219.829-4-chang.seok.bae@intel.com |
---|---|
State | Superseded |
Series | x86/fpu: Make AMX state ready for CPU idle |
On Fri, Mar 25, 2022 at 3:30 AM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> The non-initialized AMX state can be the cause of C-state demotion from C6
> to C1E. This low-power idle state may improve power savings and thus result
> in a higher available turbo frequency budget.
>
> This behavior is implementation-specific. Initialize the state for the C6
> entrance of Sapphire Rapids as needed.
>
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Tested-by: Zhang Rui <rui.zhang@intel.com>
> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-pm@vger.kernel.org

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

and I'm expecting this to be routed along with the rest of the series.

> ---
> Changes from v2:
> * Remove an unnecessary backslash (Rafael Wysocki).
>
> Changes from v1:
> * Simplify the code with a new flag (Rui).
> * Rebase on Artem's patches for SPR intel_idle.
> * Massage the changelog.
> ---
>  drivers/idle/intel_idle.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index b7640cfe0020..d35790890a3f 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -54,6 +54,7 @@
>  #include <asm/intel-family.h>
>  #include <asm/mwait.h>
>  #include <asm/msr.h>
> +#include <asm/fpu/api.h>
>
>  #define INTEL_IDLE_VERSION "0.5.1"
>
> @@ -100,6 +101,11 @@ static unsigned int mwait_substates __initdata;
>   */
>  #define CPUIDLE_FLAG_ALWAYS_ENABLE	BIT(15)
>
> +/*
> + * Initialize large xstate for the C6-state entrance.
> + */
> +#define CPUIDLE_FLAG_INIT_XSTATE	BIT(16)
> +
>  /*
>   * MWAIT takes an 8-bit "hint" in EAX "suggesting"
>   * the C-state (top nibble) and sub-state (bottom nibble)
> @@ -134,6 +140,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
>  	if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
>  		local_irq_enable();
>
> +	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
> +		fpu_idle_fpregs();
> +
>  	mwait_idle_with_hints(eax, ecx);
>
>  	return index;
> @@ -154,8 +163,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
>  static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
>  				       struct cpuidle_driver *drv, int index)
>  {
> -	unsigned long eax = flg2MWAIT(drv->states[index].flags);
>  	unsigned long ecx = 1; /* break on interrupt flag */
> +	struct cpuidle_state *state = &drv->states[index];
> +	unsigned long eax = flg2MWAIT(state->flags);
> +
> +	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
> +		fpu_idle_fpregs();
>
>  	mwait_idle_with_hints(eax, ecx);
>
> @@ -790,7 +803,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
>  	{
>  		.name = "C6",
>  		.desc = "MWAIT 0x20",
> -		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> +		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
> +					   CPUIDLE_FLAG_INIT_XSTATE,
>  		.exit_latency = 290,
>  		.target_residency = 800,
>  		.enter = &intel_idle,
> --
> 2.17.1
>
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b7640cfe0020..d35790890a3f 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -54,6 +54,7 @@
 #include <asm/intel-family.h>
 #include <asm/mwait.h>
 #include <asm/msr.h>
+#include <asm/fpu/api.h>
 
 #define INTEL_IDLE_VERSION "0.5.1"
 
@@ -100,6 +101,11 @@ static unsigned int mwait_substates __initdata;
  */
 #define CPUIDLE_FLAG_ALWAYS_ENABLE	BIT(15)
 
+/*
+ * Initialize large xstate for the C6-state entrance.
+ */
+#define CPUIDLE_FLAG_INIT_XSTATE	BIT(16)
+
 /*
  * MWAIT takes an 8-bit "hint" in EAX "suggesting"
  * the C-state (top nibble) and sub-state (bottom nibble)
@@ -134,6 +140,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
 	if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
 		local_irq_enable();
 
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();
+
 	mwait_idle_with_hints(eax, ecx);
 
 	return index;
@@ -154,8 +163,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
 static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
-	unsigned long eax = flg2MWAIT(drv->states[index].flags);
 	unsigned long ecx = 1; /* break on interrupt flag */
+	struct cpuidle_state *state = &drv->states[index];
+	unsigned long eax = flg2MWAIT(state->flags);
+
+	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
+		fpu_idle_fpregs();
 
 	mwait_idle_with_hints(eax, ecx);
 
@@ -790,7 +803,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C6",
 		.desc = "MWAIT 0x20",
-		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED |
+					   CPUIDLE_FLAG_INIT_XSTATE,
 		.exit_latency = 290,
 		.target_residency = 800,
 		.enter = &intel_idle,
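
The diff packs two things into one per-state flags word: the 8-bit MWAIT hint and driver-private flag bits such as the new CPUIDLE_FLAG_INIT_XSTATE. The following is a minimal user-space sketch of that layout, not the driver code. The MWAIT2flg()/flg2MWAIT() definitions are not part of the diff and are assumed here to keep the hint in bits 31:24 of the flags word; hint_cstate(), hint_substate() and spr_c6_flags() are hypothetical helpers added only for illustration.

```c
#include <assert.h>

#define BIT(n) (1UL << (n))

/*
 * Assumption: the MWAIT hint occupies bits 31:24 of the flags word.
 * These two macros are not shown in the diff; they are written here
 * to match that assumed layout.
 */
#define MWAIT2flg(eax)   (((eax) & 0xFFUL) << 24)
#define flg2MWAIT(flags) (((flags) >> 24) & 0xFFUL)

/* Value taken from the diff. */
#define CPUIDLE_FLAG_INIT_XSTATE BIT(16)

/* Hypothetical helper: top nibble of the hint (the C-state field). */
static inline unsigned long hint_cstate(unsigned long hint)
{
	return (hint >> 4) & 0xF;
}

/* Hypothetical helper: bottom nibble of the hint (the sub-state field). */
static inline unsigned long hint_substate(unsigned long hint)
{
	return hint & 0xF;
}

/*
 * Hypothetical helper: the SPR C6 flags from the diff, leaving out
 * CPUIDLE_FLAG_TLB_FLUSHED, whose value the diff does not show.
 */
static inline unsigned long spr_c6_flags(void)
{
	return MWAIT2flg(0x20) | CPUIDLE_FLAG_INIT_XSTATE;
}
```

Under these assumptions, flg2MWAIT(spr_c6_flags()) gives back the hint 0x20 unchanged, because bit 16 lies below the hint field and OR-ing the new flag in cannot alter the value passed to MWAIT.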
The non-initialized AMX state can be the cause of C-state demotion from C6
to C1E. This low-power idle state may improve power savings and thus result
in a higher available turbo frequency budget.

This behavior is implementation-specific. Initialize the state for the C6
entrance of Sapphire Rapids as needed.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
---
Changes from v2:
* Remove an unnecessary backslash (Rafael Wysocki).

Changes from v1:
* Simplify the code with a new flag (Rui).
* Rebase on Artem's patches for SPR intel_idle.
* Massage the changelog.
---
 drivers/idle/intel_idle.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)
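
The idea in the changelog can be modeled in user space as a flag-gated hook that runs just before idle entry. This is only a sketch under stated assumptions: model_cpu, model_state, model_fpu_idle_fpregs() and model_enter_idle() are hypothetical names, the stand-in for fpu_idle_fpregs() is assumed to reset the state only when it is in use, and the "demotion" rule is a simplification of the implementation-specific behavior described above.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n) (1UL << (n))

/* Value taken from the patch. */
#define CPUIDLE_FLAG_INIT_XSTATE BIT(16)

/* Hypothetical model of one CPU's AMX state. */
struct model_cpu {
	bool amx_state_in_use;	/* true: AMX state is not in its initial state */
	int init_calls;		/* how often the state was actually reset */
};

/* Hypothetical, cut-down stand-in for struct cpuidle_state. */
struct model_state {
	const char *name;
	unsigned long flags;
};

/*
 * Stand-in for fpu_idle_fpregs(). Assumption: it resets the AMX state
 * only when that state is in use, so repeated calls are cheap.
 */
static inline void model_fpu_idle_fpregs(struct model_cpu *cpu)
{
	if (cpu->amx_state_in_use) {
		cpu->amx_state_in_use = false;
		cpu->init_calls++;
	}
}

/*
 * Model of the idle entry path: run the hook only for states that
 * carry the flag, then "enter" the state. Returns true if the model
 * keeps the requested deep state, false if it would be demoted
 * because AMX state is still in use.
 */
static inline bool model_enter_idle(struct model_cpu *cpu,
				    const struct model_state *state)
{
	if (state->flags & CPUIDLE_FLAG_INIT_XSTATE)
		model_fpu_idle_fpregs(cpu);

	/* The real driver would issue MWAIT here. */
	return !cpu->amx_state_in_use;
}
```

Gating the hook on a per-state flag, rather than calling it on every idle entry, keeps the extra work confined to the states where the hardware is known to care, which matches the changelog's note that the behavior is implementation-specific.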