Message ID | 20220630194309.40465-1-jon@nutanix.com |
---|---|
State | New |
Series | intel_idle: add CPUIDLE_FLAG_IRQ_ENABLE to SPR C1 and C1E |
Hi Jon,

On Thu, 2022-06-30 at 15:43 -0400, Jon Kohler wrote:
> Add CPUIDLE_FLAG_IRQ_ENABLE to spr_cstates C1 and C1E, which will
> allow local IRQs to be enabled during fast idle transitions on SPR.

Did you have a chance to measure this? When I was doing this for ICX and CLX, I
was using cyclictest and wult for measuring IRQ latency.

I was planning to do this for SPR as well.

> Note: Enabling this for both C1 and C1E is slightly different than
> the approach for SKX/ICX, where CPUIDLE_FLAG_IRQ_ENABLE is only
> enabled on C1; however, given that SPR target/exit latency is 1/1
> for c1 and 2/4 for C1E, respectively, which is slower than C1
> for SKX, it seems prudent to now enable it on both states.

I was also going to measure this for C1E.

Could we please hold on this a bit - I'd like to measure this before we merge
it.

Artem.
> On Jul 1, 2022, at 9:30 AM, Artem Bityutskiy <artem.bityutskiy@linux.intel.com> wrote:
>
> Hi Jon,
>
> On Thu, 2022-06-30 at 15:43 -0400, Jon Kohler wrote:
>> Add CPUIDLE_FLAG_IRQ_ENABLE to spr_cstates C1 and C1E, which will
>> allow local IRQs to be enabled during fast idle transitions on SPR.
>
> Did you have a chance to measure this? When I was doing this for ICX and CLX, I
> was using cyclictest and wult for measuring IRQ latency.
>
> I was planning to do this for SPR as well.

We have the ‘before’ baseline from wult, and realized after doing it that
IRQ_ENABLE config wasn’t set. I’ve provided this patch to our internal
team working on SPR enablement to get another wult run in next week.

That said, if you’ve got access to an SPR system setup as well, we’d
certainly appreciate a second set of eyes. This is the first generation
of enablement for a new platform that we’ve done where wult has been
on the ‘checklist’ so to speak, so we don’t have as much ‘stick time’
on it as someone like yourself would :)

>
>> Note: Enabling this for both C1 and C1E is slightly different than
>> the approach for SKX/ICX, where CPUIDLE_FLAG_IRQ_ENABLE is only
>> enabled on C1; however, given that SPR target/exit latency is 1/1
>> for c1 and 2/4 for C1E, respectively, which is slower than C1
>> for SKX, it seems prudent to now enable it on both states.
>
> I was also going to measure this for C1E.
>
> Could we please hold on this a bit - I'd like to measure this before we merge
> it.

Yea no problem, happy to get help and a second set of eyes on this.

Thanks - Jon

>
> Artem.
>
> On Jul 1, 2022, at 10:06 AM, Jon Kohler <jon@nutanix.com> wrote:
>
>
>
>> On Jul 1, 2022, at 9:30 AM, Artem Bityutskiy <artem.bityutskiy@linux.intel.com> wrote:
>>
>> Hi Jon,
>>
>> On Thu, 2022-06-30 at 15:43 -0400, Jon Kohler wrote:
>>> Add CPUIDLE_FLAG_IRQ_ENABLE to spr_cstates C1 and C1E, which will
>>> allow local IRQs to be enabled during fast idle transitions on SPR.
>>
>> Did you have a chance to measure this? When I was doing this for ICX and CLX, I
>> was using cyclictest and wult for measuring IRQ latency.
>>
>> I was planning to do this for SPR as well.
>
> We have the ‘before’ baseline from wult, and realized after doing it that
> IRQ_ENABLE config wasn’t set. I’ve provided this patch to our internal
> team working on SPR enablement to get another wult run in next week.
>
> That said, if you’ve got access to an SPR system setup as well, we’d
> certainly appreciate a second set of eyes. This is the first generation
> of enablement for a new platform that we’ve done where wult has been
> on the ‘checklist’ so to speak, so we don’t have as much ‘stick time’
> on it as someone like yourself would :)
>
>>
>>> Note: Enabling this for both C1 and C1E is slightly different than
>>> the approach for SKX/ICX, where CPUIDLE_FLAG_IRQ_ENABLE is only
>>> enabled on C1; however, given that SPR target/exit latency is 1/1
>>> for c1 and 2/4 for C1E, respectively, which is slower than C1
>>> for SKX, it seems prudent to now enable it on both states.
>>
>> I was also going to measure this for C1E.
>>
>> Could we please hold on this a bit - I'd like to measure this before we merge
>> it.
>
> Yea no problem, happy to get help and a second set of eyes on this.
>
> Thanks - Jon

Hey Artem,

Coming back around on this, I realized this fell through the cracks. I was
wondering if you happened to run through this testing on your side already as
part of other efforts? If not, I’ll see if we can get it spun back up on our
side.

Thanks,
Jon

>
>>
>> Artem.
On Fri, 2023-11-10 at 20:00 +0000, Jon Kohler wrote:
> CPUIDLE_FLAG_IRQ_ENABLE
Hi, yes, I did run several experiments, and found that this change would make
some micro benchmarks give worse score. I did a lot of repetitions assuming I
was mixing noise with signal, but every time confirmed that enabling interrupts
in C1 made the score worse (like 1%).
I do not have explanation for this phenomena, but decided to not pursue this
idea.
Artem.
> On Nov 12, 2023, at 6:52 AM, Artem Bityutskiy <artem.bityutskiy@linux.intel.com> wrote:
>
> On Fri, 2023-11-10 at 20:00 +0000, Jon Kohler wrote:
>> CPUIDLE_FLAG_IRQ_ENABLE
>
> Hi, yes, I did run several experiments, and found that this change would make
> some micro benchmarks give worse score. I did a lot of repetitions assuming I
> was mixing noise with signal, but every time confirmed that enabling interrupts
> in C1 made the score worse (like 1%).
>
> I do not have explanation for this phenomena, but decided to not pursue this
> idea.
>
> Artem.

Thanks for the follow up. Indeed that is strange on why it would make it worse,
but it is good to know that the shipping configuration is going to work out
best.

Thanks,
Jon
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 424ef470223d..f51857cddf2b 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -893,7 +893,7 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 	{
 		.name = "C1",
 		.desc = "MWAIT 0x00",
-		.flags = MWAIT2flg(0x00),
+		.flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
 		.exit_latency = 1,
 		.target_residency = 1,
 		.enter = &intel_idle,
@@ -902,7 +902,8 @@ static struct cpuidle_state spr_cstates[] __initdata = {
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
 		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE |
-					   CPUIDLE_FLAG_UNUSABLE,
+					   CPUIDLE_FLAG_UNUSABLE |
+					   CPUIDLE_FLAG_IRQ_ENABLE,
 		.exit_latency = 2,
 		.target_residency = 4,
 		.enter = &intel_idle,
Add CPUIDLE_FLAG_IRQ_ENABLE to spr_cstates C1 and C1E, which will
allow local IRQs to be enabled during fast idle transitions on SPR.

Note: Enabling this for both C1 and C1E is slightly different than
the approach for SKX/ICX, where CPUIDLE_FLAG_IRQ_ENABLE is only
enabled on C1; however, given that SPR target/exit latency is 1/1
for c1 and 2/4 for C1E, respectively, which is slower than C1
for SKX, it seems prudent to now enable it on both states.

This is also important as on SPR it is possible for only C1 or only
C1E to be enabled (i.e. one of them would be disabled), so only
enabling C1 would short change C1E-only configurations.

Fixes: 9edf3c0ffef0 ("intel_idle: add SPR support")
Signed-off-by: Jon Kohler <jon@nutanix.com>
---
 drivers/idle/intel_idle.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)