diff mbox series

Revert "ACPI: processor: idle: Only flush cache on entering C3"

Message ID 20220403062322.3168-1-akihiko.odaki@gmail.com
State Accepted
Commit dfbba2518aac4204203b0697a894d3b2f80134d3
Headers show
Series Revert "ACPI: processor: idle: Only flush cache on entering C3" | expand

Commit Message

Akihiko Odaki April 3, 2022, 6:23 a.m. UTC
This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802.

ACPI processor power states can be transitioned in two distinct
situations: 1. when CPU goes idle and 2. before CPU goes offline
("playing dead") to suspend or hibernate. Case 1 is handled by
acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by
acpi_idle_play_dead.

It is necessary to flush CPU caches in case 2 even if it is not
required to transit ACPI processor power states as CPU will go
offline soon. However, the reverted commit incorrectly removed CPU
cache flushing in such a condition. In fact, it made resuming from
suspend-to-RAM occasionally fail on Lenovo ThinkPad C13 Yoga.

Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
---
 drivers/acpi/processor_idle.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Ketsui April 4, 2022, 5:09 p.m. UTC | #1
I can confirm that this reversion fixes a very similar issue I have with kernel
5.17 (only difference is resume always fails on my desktop), here's the bisect
log:

# bad: [f443e374ae131c168a065ea1748feac6b2e76613] Linux 5.17
# good: [df0cc57e057f18e44dac8e6c18aba47ab53202f9] Linux 5.16
git bisect start 'v5.17' 'v5.16'
# bad: [22ef12195e13c5ec58320dbf99ef85059a2c0820] Merge tag 'staging-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
git bisect bad 22ef12195e13c5ec58320dbf99ef85059a2c0820
# good: [9bcbf894b6872216ef61faf17248ec234e3db6bc] Merge tag 'media/v5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect good 9bcbf894b6872216ef61faf17248ec234e3db6bc
# good: [208dd45d8d050360b46ded439a057bcc7cbf3b09] tcp: tcp_send_challenge_ack delete useless param `skb`
git bisect good 208dd45d8d050360b46ded439a057bcc7cbf3b09
# bad: [c288ea679840de4dee2ce6da5d0f139e3774ad86] Merge tag 'gpio-updates-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
git bisect bad c288ea679840de4dee2ce6da5d0f139e3774ad86
# bad: [5c947d0dbae8038ec1c8b538891f6475350542ee] Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
git bisect bad 5c947d0dbae8038ec1c8b538891f6475350542ee
# bad: [a229327733b86aa585effdb0d27a87b12aa51597] Merge tag 'printk-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
git bisect bad a229327733b86aa585effdb0d27a87b12aa51597
# bad: [bca21755b9fc00dbe371994b53389eb5d70b8e72] Merge tag 'acpi-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad bca21755b9fc00dbe371994b53389eb5d70b8e72
# bad: [af8fefd7444480bb8fd8d74f977dbac4693ac3ed] Merge branches 'acpi-x86', 'acpi-pmic' and 'acpi-dptf'
git bisect bad af8fefd7444480bb8fd8d74f977dbac4693ac3ed
# good: [b659ea768ae372e2f82c6346120f2e7272a42ac9] Merge branches 'acpi-scan', 'acpi-pm', 'acpi-power' and 'acpi-pci'
git bisect good b659ea768ae372e2f82c6346120f2e7272a42ac9
# bad: [5847d2d2efaab724b7ab374b6fca105e24509c92] Merge branches 'acpi-ec' and 'acpi-processor'
git bisect bad 5847d2d2efaab724b7ab374b6fca105e24509c92
# good: [c793570d8725e44b64dbe466eb8ecda34c5eb8ac] ACPI: EC: Avoid queuing unnecessary work in acpi_ec_submit_event()
git bisect good c793570d8725e44b64dbe466eb8ecda34c5eb8ac
# bad: [8120832d8f82aa7316c578fbccf11e385a5b3601] ACPI: processor: thermal: avoid cpufreq_get_policy()
git bisect bad 8120832d8f82aa7316c578fbccf11e385a5b3601
# good: [0e6078c3c6737df7d0bd0c890fbadf24a27fffbb] ACPI: processor idle: Use swap() instead of open coding it
git bisect good 0e6078c3c6737df7d0bd0c890fbadf24a27fffbb
# bad: [87ebbb8c612b1214f227ebb8f25442c6d163e802] ACPI: processor: idle: Only flush cache on entering C3
git bisect bad 87ebbb8c612b1214f227ebb8f25442c6d163e802
# first bad commit: [87ebbb8c612b1214f227ebb8f25442c6d163e802] ACPI: processor: idle: Only flush cache on entering C3
Rafael J. Wysocki April 4, 2022, 6:13 p.m. UTC | #2
On Sun, Apr 3, 2022 at 8:25 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
>
> This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802.
>
> ACPI processor power states can be transitioned in two distinct
> situations: 1. when CPU goes idle and 2. before CPU goes offline
> ("playing dead") to suspend or hibernate. Case 1 is handled by
> acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by
> acpi_idle_play_dead.
>
> It is necessary to flush CPU caches in case 2 even if it is not
> required to transit ACPI processor power states as CPU will go
> offline soon. However, the reverted commit incorrectly removed CPU
> cache flushing in such a condition.

I think what you mean is that the CPU cache must always be flushed in
acpi_idle_play_dead(), regardless of the target C-state that is going
to be requested, because this is likely to be part of a CPU offline
procedure or preparation for entering a system-wide sleep state and
the stale cache contents may lead to problems going forward, for
example when the CPU is taken back online.

If so, I will put the above information into the patch changelog.

> In fact, it made resuming from
> suspend-to-RAM occasionally fail on Lenovo ThinkPad C13 Yoga.

So this probably means that resume from suspend-to-RAM occasionally
fails on Lenovo ThinkPad C13 Yoga and reverting the commit in question
fixes this problem.  Is that correct?

> Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
> ---
>  drivers/acpi/processor_idle.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index f8e9fa82cb9b..05b3985a1984 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -570,8 +570,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index)
>  {
>         struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
>
> -       if (cx->type == ACPI_STATE_C3)
> -               ACPI_FLUSH_CPU_CACHE();
> +       ACPI_FLUSH_CPU_CACHE();
>
>         while (1) {
>
> --
> 2.35.1
>
Akihiko Odaki April 4, 2022, 6:25 p.m. UTC | #3
On 2022/04/05 3:13, Rafael J. Wysocki wrote:
> On Sun, Apr 3, 2022 at 8:25 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
>>
>> This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802.
>>
>> ACPI processor power states can be transitioned in two distinct
>> situations: 1. when CPU goes idle and 2. before CPU goes offline
>> ("playing dead") to suspend or hibernate. Case 1 is handled by
>> acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by
>> acpi_idle_play_dead.
>>
>> It is necessary to flush CPU caches in case 2 even if it is not
>> required to transit ACPI processor power states as CPU will go
>> offline soon. However, the reverted commit incorrectly removed CPU
>> cache flushing in such a condition.
> 
> I think what you mean is that the CPU cache must always be flushed in
> acpi_idle_play_dead(), regardless of the target C-state that is going
> to be requested, because this is likely to be part of a CPU offline
> procedure or preparation for entering a system-wide sleep state and
> the stale cache contents may lead to problems going forward, for
> example when the CPU is taken back online.
> 
> If so, I will put the above information into the patch changelog.

I guess it is causing problems because the dirty caches will not get 
written back and the RAM becomes stale if they are not flushed. From my 
understanding, the CPU should have an empty cache and read back contents 
from RAM when it is taken back online.

> 
>> In fact, it made resuming from
>> suspend-to-RAM occasionally fail on Lenovo ThinkPad C13 Yoga.
> 
> So this probably means that resume from suspend-to-RAM occasionally
> fails on Lenovo ThinkPad C13 Yoga and reverting the commit in question
> fixes this problem.  Is that correct?

Yes, that is what I meant.

Regards,
Akihiko Odaki

> 
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
>> ---
>>   drivers/acpi/processor_idle.c | 3 +--
>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
>> index f8e9fa82cb9b..05b3985a1984 100644
>> --- a/drivers/acpi/processor_idle.c
>> +++ b/drivers/acpi/processor_idle.c
>> @@ -570,8 +570,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index)
>>   {
>>          struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
>>
>> -       if (cx->type == ACPI_STATE_C3)
>> -               ACPI_FLUSH_CPU_CACHE();
>> +       ACPI_FLUSH_CPU_CACHE();
>>
>>          while (1) {
>>
>> --
>> 2.35.1
>>
Rafael J. Wysocki April 4, 2022, 6:56 p.m. UTC | #4
On Mon, Apr 4, 2022 at 8:25 PM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
>
> On 2022/04/05 3:13, Rafael J. Wysocki wrote:
> > On Sun, Apr 3, 2022 at 8:25 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
> >>
> >> This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802.
> >>
> >> ACPI processor power states can be transitioned in two distinct
> >> situations: 1. when CPU goes idle and 2. before CPU goes offline
> >> ("playing dead") to suspend or hibernate. Case 1 is handled by
> >> acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by
> >> acpi_idle_play_dead.
> >>
> >> It is necessary to flush CPU caches in case 2 even if it is not
> >> required to transit ACPI processor power states as CPU will go
> >> offline soon. However, the reverted commit incorrectly removed CPU
> >> cache flushing in such a condition.
> >
> > I think what you mean is that the CPU cache must always be flushed in
> > acpi_idle_play_dead(), regardless of the target C-state that is going
> > to be requested, because this is likely to be part of a CPU offline
> > procedure or preparation for entering a system-wide sleep state and
> > the stale cache contents may lead to problems going forward, for
> > example when the CPU is taken back online.
> >
> > If so, I will put the above information into the patch changelog.
>
> I guess it is causing problems because the dirty caches will not get
> written back and the RAM becomes stale if they are not flushed. From my
> understanding, the CPU should have an empty cache and read back contents
> from RAM when it is taken back online.

OK, please see if the changelog I've added to the patch is looking good:

https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=bleeding-edge&id=dfbba2518aac4204203b0697a894d3b2f80134d3
Akihiko Odaki April 4, 2022, 6:59 p.m. UTC | #5
On 2022/04/05 3:56, Rafael J. Wysocki wrote:
> On Mon, Apr 4, 2022 at 8:25 PM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
>>
>> On 2022/04/05 3:13, Rafael J. Wysocki wrote:
>>> On Sun, Apr 3, 2022 at 8:25 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
>>>>
>>>> This reverts commit 87ebbb8c612b1214f227ebb8f25442c6d163e802.
>>>>
>>>> ACPI processor power states can be transitioned in two distinct
>>>> situations: 1. when CPU goes idle and 2. before CPU goes offline
>>>> ("playing dead") to suspend or hibernate. Case 1 is handled by
>>>> acpi_idle_enter or acpi_idle_enter_s2idle. Case 2 is handled by
>>>> acpi_idle_play_dead.
>>>>
>>>> It is necessary to flush CPU caches in case 2 even if it is not
>>>> required to transit ACPI processor power states as CPU will go
>>>> offline soon. However, the reverted commit incorrectly removed CPU
>>>> cache flushing in such a condition.
>>>
>>> I think what you mean is that the CPU cache must always be flushed in
>>> acpi_idle_play_dead(), regardless of the target C-state that is going
>>> to be requested, because this is likely to be part of a CPU offline
>>> procedure or preparation for entering a system-wide sleep state and
>>> the stale cache contents may lead to problems going forward, for
>>> example when the CPU is taken back online.
>>>
>>> If so, I will put the above information into the patch changelog.
>>
>> I guess it is causing problems because the dirty caches will not get
>> written back and the RAM becomes stale if they are not flushed. From my
>> understanding, the CPU should have an empty cache and read back contents
>> from RAM when it is taken back online.
> 
> OK, please see if the changelog I've added to the patch is looking good:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/commit/?h=bleeding-edge&id=dfbba2518aac4204203b0697a894d3b2f80134d3

It looks good for me.

Thanks,
Akihiko Odaki
diff mbox series

Patch

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index f8e9fa82cb9b..05b3985a1984 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -570,8 +570,7 @@  static int acpi_idle_play_dead(struct cpuidle_device *dev, int index)
 {
 	struct acpi_processor_cx *cx = per_cpu(acpi_cstate[index], dev->cpu);
 
-	if (cx->type == ACPI_STATE_C3)
-		ACPI_FLUSH_CPU_CACHE();
+	ACPI_FLUSH_CPU_CACHE();
 
 	while (1) {