Message ID | 20220309223431.26560-3-chang.seok.bae@intel.com |
---|---|
State | Superseded |
Headers | show |
Series | x86/fpu: Make AMX state ready for CPU idle | expand |
On Wed, Mar 9, 2022 at 11:42 PM Chang S. Bae <chang.seok.bae@intel.com> wrote: > > The non-initialized AMX state can be the cause of C-state demotion from C6 > to C1E. This low-power idle state may improve power savings and thus result > in a higher available turbo frequency budget. > > This behavior is implementation-specific. Initialize the state for the C6 > entrance of Sapphire Rapids as needed. > > Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> > Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> > Tested-by : Zhang Rui <rui.zhang@intel.com> > Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> > Cc: linux-pm@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > --- > Changes from v1: > * Simplify the code with a new flag (Rui). > * Rebase on Artem's patches for SPR intel_idle. > * Massage the changelog. > > Dependency on the new C-state table for SPR: > https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=9edf3c0ffef0ec1bed8300315852b5c6a0997130 > --- > drivers/idle/intel_idle.c | 18 ++++++++++++++++-- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c > index 1c7c25909e54..6ecbeffdf785 100644 > --- a/drivers/idle/intel_idle.c > +++ b/drivers/idle/intel_idle.c > @@ -54,6 +54,7 @@ > #include <asm/intel-family.h> > #include <asm/mwait.h> > #include <asm/msr.h> > +#include <asm/fpu/api.h> > > #define INTEL_IDLE_VERSION "0.5.1" > > @@ -99,6 +100,11 @@ static unsigned int mwait_substates __initdata; > */ > #define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15) > > +/* > + * Initialize large xstate for the C6-state entrance. > + */ > +#define CPUIDLE_FLAG_INIT_XSTATE BIT(16) > + > /* > * MWAIT takes an 8-bit "hint" in EAX "suggesting" > * the C-state (top nibble) and sub-state (bottom nibble) > @@ -136,6 +142,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev, > if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) > local_irq_enable(); > > + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) > + fpu_idle_fpregs(); > + > mwait_idle_with_hints(eax, ecx); > > return index; > @@ -156,8 +165,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev, > static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, > struct cpuidle_driver *drv, int index) > { > - unsigned long eax = flg2MWAIT(drv->states[index].flags); > unsigned long ecx = 1; /* break on interrupt flag */ > + struct cpuidle_state *state = &drv->states[index]; > + unsigned long eax = flg2MWAIT(state->flags); > + > + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) > + fpu_idle_fpregs(); > > mwait_idle_with_hints(eax, ecx); > > @@ -792,7 +805,8 @@ static struct cpuidle_state spr_cstates[] __initdata = { > { > .name = "C6", > .desc = "MWAIT 0x20", > - .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED, > + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | \ Why is the backslash at the end of the line needed? > + CPUIDLE_FLAG_INIT_XSTATE, > .exit_latency = 290, > .target_residency = 800, > .enter = &intel_idle, > --
On 3/10/2022 10:34 AM, Rafael J. Wysocki wrote: > On Wed, Mar 9, 2022 at 11:42 PM Chang S. Bae <chang.seok.bae@intel.com> wrote: >> [...] >> @@ -792,7 +805,8 @@ static struct cpuidle_state spr_cstates[] __initdata = { >> { >> .name = "C6", >> .desc = "MWAIT 0x20", >> - .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED, >> + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | \ > > Why is the backslash at the end of the line needed? No, it is not needed. Sorry, I think I was mindlessly following the style in this new c-state table: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/drivers/idle/intel_idle.c?h=linux-next#n787 Thanks, Chang
On Thu, 2022-03-10 at 10:50 -0800, Chang S. Bae wrote: > On 3/10/2022 10:34 AM, Rafael J. Wysocki wrote: > > On Wed, Mar 9, 2022 at 11:42 PM Chang S. Bae <chang.seok.bae@intel.com> > > wrote: > > > > [...] > > > @@ -792,7 +805,8 @@ static struct cpuidle_state spr_cstates[] __initdata = > > > { > > > { > > > .name = "C6", > > > .desc = "MWAIT 0x20", > > > - .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED, > > > + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | \ > > > > Why is the backslash at the end of the line needed? > > No, it is not needed. > > Sorry, I think I was mindlessly following the style in this new c-state > table: > https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux- > pm.git/tree/drivers/idle/intel_idle.c?h=linux-next#n787 > > Thanks, > Chang > Sorry, too much python programming lately, so I automatically added the back- slash. Let me know if you would remove that backslash at the same time, or I can submit a patch. Artem.
On 3/10/2022 11:33 PM, Artem Bityutskiy wrote: > > Let me know if you would remove that backslash at the same time, or I can > submit a patch. Looks like the table was merged without the backslash. Rafael thankfully removed it via: https://lore.kernel.org/linux-pm/4731792.31r3eYUQgx@kreacher/ Thanks, Chang
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 1c7c25909e54..6ecbeffdf785 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -54,6 +54,7 @@ #include <asm/intel-family.h> #include <asm/mwait.h> #include <asm/msr.h> +#include <asm/fpu/api.h> #define INTEL_IDLE_VERSION "0.5.1" @@ -99,6 +100,11 @@ static unsigned int mwait_substates __initdata; */ #define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15) +/* + * Initialize large xstate for the C6-state entrance. + */ +#define CPUIDLE_FLAG_INIT_XSTATE BIT(16) + /* * MWAIT takes an 8-bit "hint" in EAX "suggesting" * the C-state (top nibble) and sub-state (bottom nibble) @@ -136,6 +142,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev, if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) local_irq_enable(); + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) + fpu_idle_fpregs(); + mwait_idle_with_hints(eax, ecx); return index; @@ -156,8 +165,12 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev, static __cpuidle int intel_idle_s2idle(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index) { - unsigned long eax = flg2MWAIT(drv->states[index].flags); unsigned long ecx = 1; /* break on interrupt flag */ + struct cpuidle_state *state = &drv->states[index]; + unsigned long eax = flg2MWAIT(state->flags); + + if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) + fpu_idle_fpregs(); mwait_idle_with_hints(eax, ecx); @@ -792,7 +805,8 @@ static struct cpuidle_state spr_cstates[] __initdata = { { .name = "C6", .desc = "MWAIT 0x20", - .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED, + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED | \ + CPUIDLE_FLAG_INIT_XSTATE, .exit_latency = 290, .target_residency = 800, .enter = &intel_idle,
The non-initialized AMX state can be the cause of C-state demotion from C6 to C1E. This low-power idle state may improve power savings and thus result in a higher available turbo frequency budget. This behavior is implementation-specific. Initialize the state for the C6 entrance of Sapphire Rapids as needed. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Tested-by : Zhang Rui <rui.zhang@intel.com> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- Changes from v1: * Simplify the code with a new flag (Rui). * Rebase on Artem's patches for SPR intel_idle. * Massage the changelog. Dependency on the new C-state table for SPR: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=linux-next&id=9edf3c0ffef0ec1bed8300315852b5c6a0997130 --- drivers/idle/intel_idle.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)