Message ID | 20220403062322.3168-1-akihiko.odaki@gmail.com |
---|---|
State | Accepted |
Commit | dfbba2518aac4204203b0697a894d3b2f80134d3 |
Headers | show |
Series | Revert "ACPI: processor: idle: Only flush cache on entering C3" | expand |
I can confirm that this reversion fixes a very similar issue I have with kernel 5.17 (only difference is resume always fails on my desktop), here's the bisect log: # bad: [f443e374ae131c168a065ea1748feac6b2e76613] Linux 5.17 # good: [df0cc57e057f18e44dac8e6c18aba47ab53202f9] Linux 5.16 git bisect start 'v5.17' 'v5.16' # bad: [22ef12195e13c5ec58320dbf99ef85059a2c0820] Merge tag 'staging-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging git bisect bad 22ef12195e13c5ec58320dbf99ef85059a2c0820 # good: [9bcbf894b6872216ef61faf17248ec234e3db6bc] Merge tag 'media/v5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media git bisect good 9bcbf894b6872216ef61faf17248ec234e3db6bc # good: [208dd45d8d050360b46ded439a057bcc7cbf3b09] tcp: tcp_send_challenge_ack delete useless param `skb` git bisect good 208dd45d8d050360b46ded439a057bcc7cbf3b09 # bad: [c288ea679840de4dee2ce6da5d0f139e3774ad86] Merge tag 'gpio-updates-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux git bisect bad c288ea679840de4dee2ce6da5d0f139e3774ad86 # bad: [5c947d0dbae8038ec1c8b538891f6475350542ee] Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 git bisect bad 5c947d0dbae8038ec1c8b538891f6475350542ee # bad: [a229327733b86aa585effdb0d27a87b12aa51597] Merge tag 'printk-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux git bisect bad a229327733b86aa585effdb0d27a87b12aa51597 # bad: [bca21755b9fc00dbe371994b53389eb5d70b8e72] Merge tag 'acpi-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm git bisect bad bca21755b9fc00dbe371994b53389eb5d70b8e72 # bad: [af8fefd7444480bb8fd8d74f977dbac4693ac3ed] Merge branches 'acpi-x86', 'acpi-pmic' and 'acpi-dptf' git bisect bad af8fefd7444480bb8fd8d74f977dbac4693ac3ed # good: [b659ea768ae372e2f82c6346120f2e7272a42ac9] Merge branches 'acpi-scan', 'acpi-pm', 'acpi-power' and 'acpi-pci' git bisect good b659ea768ae372e2f82c6346120f2e7272a42ac9 # bad: [5847d2d2efaab724b7ab374b6fca105e24509c92] Merge branches 'acpi-ec' and 'acpi-processor' git bisect bad 5847d2d2efaab724b7ab374b6fca105e24509c92 # good: [c793570d8725e44b64dbe466eb8ecda34c5eb8ac] ACPI: EC: Avoid queuing unnecessary work in acpi_ec_submit_event() git bisect good c793570d8725e44b64dbe466eb8ecda34c5eb8ac # bad: [8120832d8f82aa7316c578fbccf11e385a5b3601] ACPI: processor: thermal: avoid cpufreq_get_policy() git bisect bad 8120832d8f82aa7316c578fbccf11e385a5b3601 # good: [0e6078c3c6737df7d0bd0c890fbadf24a27fffbb] ACPI: processor idle: Use swap() instead of open coding it git bisect good 0e6078c3c6737df7d0bd0c890fbadf24a27fffbb # bad: [87ebbb8c612b1214f227ebb8f25442c6d163e802] ACPI: processor: idle: Only flush cache on entering C3 git bisect bad 87ebbb8c612b1214f227ebb8f25442c6d163e802 # first bad commit: [87ebbb8c612b1214f227ebb8f25442c6d163e802] ACPI: processor: idle: Only flush cache on entering C3
On Sun, Apr 3, 2022 at 8:25 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: > > This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802. > > ACPI processor power states can be transitioned in two distinct > situations: 1. when CPU goes idle and 2. before CPU goes offline > ("playing dead") to suspend or hibernate. Case 1 is handled by > acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by > acpi_idle_play_dead. > > It is necessary to flush CPU caches in case 2 even if it is not > required to transit ACPI processor power states as CPU will go > offline soon. However, the reverted commit incorrectly removed CPU > cache flushing in such a condition. I think what you mean is that the CPU cache must always be flushed in acpi_idle_play_dead(), regardless of the target C-state that is going to be requested, because this is likely to be part of a CPU offline procedure or preparation for entering a system-wide sleep state and the stale cache contents may lead to problems going forward, for example when the CPU is taken back online. If so, I will put the above information into the patch changelog. > In fact, it made resuming from > suspend-to-RAM occasionally fail on Lenovo ThinkPad C13 Yoga. So this probably means that resume from suspend-to-RAM occasionally fails on Lenovo ThinkPad C13 Yoga and reverting the commit in question fixes this problem. Is that correct? > Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com> > --- > drivers/acpi/processor_idle.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c > index f8e9fa82cb9b..05b3985a1984 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -570,8 +570,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index) > { > struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu); > > - if (cx->type == ACPI_STATE_C3) > - ACPI_FLUSH_CPU_CACHE(); > + ACPI_FLUSH_CPU_CACHE(); > > while (1) { > > -- > 2.35.1 >
On 2022/04/05 3:13, Rafael J. Wysocki wrote: > On Sun, Apr 3, 2022 at 8:25 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: >> >> This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802. >> >> ACPI processor power states can be transitioned in two distinct >> situations: 1. when CPU goes idle and 2. before CPU goes offline >> ("playing dead") to suspend or hibernate. Case 1 is handled by >> acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by >> acpi_idle_play_dead. >> >> It is necessary to flush CPU caches in case 2 even if it is not >> required to transit ACPI processor power states as CPU will go >> offline soon. However, the reverted commit incorrectly removed CPU >> cache flushing in such a condition. > > I think what you mean is that the CPU cache must always be flushed in > acpi_idle_play_dead(), regardless of the target C-state that is going > to be requested, because this is likely to be part of a CPU offline > procedure or preparation for entering a system-wide sleep state and > the stale cache contents may lead to problems going forward, for > example when the CPU is taken back online. > > If so, I will put the above information into the patch changelog. I guess it is causing problems because the dirty caches will not get written back and the RAM becomes stale if they are not flushed. From my understanding, the CPU should have an empty cache and read back contents from RAM when it is taken back online. > >> In fact, it made resuming from >> suspend-to-RAM occasionally fail on Lenovo ThinkPad C13 Yoga. > > So this probably means that resume from suspend-to-RAM occasionally > fails on Lenovo ThinkPad C13 Yoga and reverting the commit in question > fixes this problem. Is that correct? Yes, that is what I meant. Regards, Akihiko Odaki > >> Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com> >> --- >> drivers/acpi/processor_idle.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c >> index f8e9fa82cb9b..05b3985a1984 100644 >> --- a/drivers/acpi/processor_idle.c >> +++ b/drivers/acpi/processor_idle.c >> @@ -570,8 +570,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index) >> { >> struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu); >> >> - if (cx->type == ACPI_STATE_C3) >> - ACPI_FLUSH_CPU_CACHE(); >> + ACPI_FLUSH_CPU_CACHE(); >> >> while (1) { >> >> -- >> 2.35.1 >>
On Mon, Apr 4, 2022 at 8:25 PM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: > > On 2022/04/05 3:13, Rafael J. Wysocki wrote: > > On Sun, Apr 3, 2022 at 8:25 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: > >> > >> This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802. > >> > >> ACPI processor power states can be transitioned in two distinct > >> situations: 1. when CPU goes idle and 2. before CPU goes offline > >> ("playing dead") to suspend or hibernate. Case 1 is handled by > >> acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by > >> acpi_idle_play_dead. > >> > >> It is necessary to flush CPU caches in case 2 even if it is not > >> required to transit ACPI processor power states as CPU will go > >> offline soon. However, the reverted commit incorrectly removed CPU > >> cache flushing in such a condition. > > > > I think what you mean is that the CPU cache must always be flushed in > > acpi_idle_play_dead(), regardless of the target C-state that is going > > to be requested, because this is likely to be part of a CPU offline > > procedure or preparation for entering a system-wide sleep state and > > the stale cache contents may lead to problems going forward, for > > example when the CPU is taken back online. > > > > If so, I will put the above information into the patch changelog. > > I guess it is causing problems because the dirty caches will not get > written back and the RAM becomes stale if they are not flushed. From my > understanding, the CPU should have an empty cache and read back contents > from RAM when it is taken back online. OK, please see if the changelog I've added to the patch is looking good: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=bleeding-edge&id=dfbba2518aac4204203b0697a894d3b2f80134d3
On 2022/04/05 3:56, Rafael J. Wysocki wrote: > On Mon, Apr 4, 2022 at 8:25 PM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: >> >> On 2022/04/05 3:13, Rafael J. Wysocki wrote: >>> On Sun, Apr 3, 2022 at 8:25 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: >>>> >>>> This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802. >>>> >>>> ACPI processor power states can be transitioned in two distinct >>>> situations: 1. when CPU goes idle and 2. before CPU goes offline >>>> ("playing dead") to suspend or hibernate. Case 1 is handled by >>>> acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by >>>> acpi_idle_play_dead. >>>> >>>> It is necessary to flush CPU caches in case 2 even if it is not >>>> required to transit ACPI processor power states as CPU will go >>>> offline soon. However, the reverted commit incorrectly removed CPU >>>> cache flushing in such a condition. >>> >>> I think what you mean is that the CPU cache must always be flushed in >>> acpi_idle_play_dead(), regardless of the target C-state that is going >>> to be requested, because this is likely to be part of a CPU offline >>> procedure or preparation for entering a system-wide sleep state and >>> the stale cache contents may lead to problems going forward, for >>> example when the CPU is taken back online. >>> >>> If so, I will put the above information into the patch changelog. >> >> I guess it is causing problems because the dirty caches will not get >> written back and the RAM becomes stale if they are not flushed. From my >> understanding, the CPU should have an empty cache and read back contents >> from RAM when it is taken back online. > > OK, please see if the changelog I've added to the patch is looking good: > > https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=bleeding-edge&id=dfbba2518aac4204203b0697a894d3b2f80134d3 It looks good for me. Thanks, Akihiko Odaki
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index f8e9fa82cb9b..05b3985a1984 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -570,8 +570,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index) { struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu); - if (cx->type == ACPI_STATE_C3) - ACPI_FLUSH_CPU_CACHE(); + ACPI_FLUSH_CPU_CACHE(); while (1) {
This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802. ACPI processor power states can be transitioned in two distinct situations: 1. when CPU goes idle and 2. before CPU goes offline ("playing dead") to suspend or hibernate. Case 1 is handled by acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by acpi_idle_play_dead. It is necessary to flush CPU caches in case 2 even if it is not required to transit ACPI processor power states as CPU will go offline soon. However, the reverted commit incorrectly removed CPU cache flushing in such a condition. In fact, it made resuming from suspend-to-RAM occasionally fail on Lenovo ThinkPad C13 Yoga. Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com> --- drivers/acpi/processor_idle.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)