[v3,0/5] PM: sleep: Improvements of async suspend and resume of devices

Message ID 10629535.nUPlyArG6x@rjwysocki.net

Rafael J. Wysocki March 14, 2025, 12:46 p.m. UTC
Hi Everyone,

This is a new iteration of the async suspend/resume improvements work:

https://lore.kernel.org/linux-pm/1915694.tdWV9SEqCh@rjwysocki.net/

which includes some rework and fixes of the patches in the series linked
above.  The most significant differences are splitting the second patch
into two patches and adding a change to treat consumers like children
during resume.

This new iteration is based on linux-pm.git/linux-next and on the recent
fix related to direct-complete:

https://lore.kernel.org/linux-pm/12627587.O9o76ZdvQC@rjwysocki.net/

The overall idea is still to start async processing for devices that have
at least some dependencies met, but not necessarily all of them, to avoid
overhead related to queuing too many async work items that will have to
wait for the processing of other devices before they can make progress.

Patch [1/5] does this in all resume phases, but it just takes children
into account (that is, async processing is started upfront for devices
without parents and then, after resuming each device, it is started for
the device's children).
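
For illustration only, the ordering described for patch [1/5] can be sketched as a toy Python model. This is not the kernel code: the function name `resume_start_order` and the device names in `topology` are hypothetical, and the "resume" step is simulated sequentially rather than asynchronously.

```python
from collections import deque

def resume_start_order(parent_of):
    """Return the order in which resume processing is started in this toy model.

    parent_of maps each device name to its parent's name (or None).
    """
    children = {dev: [] for dev in parent_of}
    for dev, parent in parent_of.items():
        if parent is not None:
            children[parent].append(dev)

    # Started upfront: devices without a parent.
    queue = deque(dev for dev, parent in parent_of.items() if parent is None)
    order = []
    while queue:
        dev = queue.popleft()
        order.append(dev)            # stand-in for resuming the device
        queue.extend(children[dev])  # then start processing of its children
    return order

# Hypothetical topology, for illustration only.
topology = {
    "pci0": None,
    "usb_hc": "pci0",
    "nvme": "pci0",
    "usb_dev": "usb_hc",
    "platform0": None,
}
print(resume_start_order(topology))
```

In this model no device is started before its parent has been handled, so no queued item has to sit idle waiting for a parent that has not resumed yet.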

Patch [2/5] does this in the suspend phase of system suspend and only
takes parents into account (that is, async processing is started upfront
for devices without any children and then, after suspending each device,
it is started for the device's parent).
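
The suspend direction can be sketched in a similar way, this time with threads so that the "some dependencies met, but not necessarily all" aspect is visible: a parent is started as soon as one child finishes, but it still waits for its remaining children before proceeding. Again, this is only a toy model under stated assumptions. `simulate_suspend` and the device names are hypothetical, and the events and thread pool stand in for whatever the kernel actually uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def simulate_suspend(parent_of):
    """Toy model: return the order in which devices finish suspending."""
    children = {dev: [] for dev in parent_of}
    for dev, parent in parent_of.items():
        if parent is not None:
            children[parent].append(dev)

    done = {dev: threading.Event() for dev in parent_of}
    started = set()
    finished = []
    lock = threading.Lock()

    # One worker per device, so a waiting parent can never starve its children.
    with ThreadPoolExecutor(max_workers=max(1, len(parent_of))) as pool:

        def start(dev):
            # Start processing of a device at most once.
            with lock:
                if dev in started:
                    return
                started.add(dev)
                pool.submit(suspend, dev)

        def suspend(dev):
            # A started parent still waits for all of its children.
            for child in children[dev]:
                done[child].wait()
            with lock:
                finished.append(dev)  # stand-in for suspending the device
            done[dev].set()
            parent = parent_of[dev]
            if parent is not None:
                start(parent)         # after suspending, start the parent

        # Started upfront: devices without any children.
        for dev in parent_of:
            if not children[dev]:
                start(dev)

        # Keep the pool open until every device has finished.
        for event in done.values():
            event.wait()

    return finished

# Hypothetical topology, for illustration only.
suspend_topology = {
    "pci0": None,
    "usb_hc": "pci0",
    "nvme": "pci0",
    "usb_dev": "usb_hc",
    "platform0": None,
}
print(simulate_suspend(suspend_topology))
```

The exact order printed may vary between runs, but every device finishes before its parent does.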

Patch [3/5] extends it to the "late" and "noirq" suspend phases.

Patch [4/5] adds changes to treat suppliers like parents during suspend.
That is, async processing is started upfront for devices without any
children or consumers and then, after suspending each device, it is
started for the device's parent and suppliers.

Patch [5/5] adds changes to treat consumers like children during resume.
That is, async processing is started upfront for devices without a parent
or any suppliers and then, after resuming each device, it is started for
the device's children and consumers.
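
As a rough illustration of how patches [4/5] and [5/5] change the set of devices started upfront, the following sketch computes both sets from a parent map and a list of device links. The helper `upfront_sets`, the `(supplier, consumer)` tuple representation and the device names are assumptions made for this example and do not reflect kernel data structures.

```python
def upfront_sets(parent_of, links):
    """Return (suspend_first, resume_first) for a toy device topology.

    parent_of maps each device to its parent (or None);
    links is a list of (supplier, consumer) pairs.
    """
    has_children = {p for p in parent_of.values() if p is not None}
    has_consumers = {supplier for supplier, _ in links}
    has_suppliers = {consumer for _, consumer in links}

    # Suspend: devices without any children or consumers.
    suspend_first = [d for d in parent_of
                     if d not in has_children and d not in has_consumers]
    # Resume: devices without a parent or any suppliers.
    resume_first = [d for d in parent_of
                    if parent_of[d] is None and d not in has_suppliers]
    return suspend_first, resume_first

# Hypothetical topology: platform0 is a supplier of usb_hc.
link_topology = {
    "pci0": None,
    "usb_hc": "pci0",
    "nvme": "pci0",
    "usb_dev": "usb_hc",
    "platform0": None,
}
device_links = [("platform0", "usb_hc")]

suspend_first, resume_first = upfront_sets(link_topology, device_links)
print("suspend upfront:", suspend_first)
print("resume upfront: ", resume_first)
```

Here platform0 has no children, but it is no longer started upfront during suspend because it still has a consumer (usb_hc) that must suspend first.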

Preliminary test results from one sample system are below.

"Baseline" is the linux-pm.git/testing branch, "Parent/child"
is that branch with patches [1-3/5] applied and "Device links"
is that branch with patches [1-5/5] applied.

"s/r" means "regular" suspend/resume, noRPM is "late" suspend
and "early" resume, and noIRQ means the "noirq" phases of
suspend and resume, respectively.  The numbers are suspend
and resume times for each phase, in milliseconds.

         Baseline       Parent/child    Device links

       Suspend Resume  Suspend Resume  Suspend Resume

s/r    427     449     298     450     294     442
noRPM  13      1       13      1       13      1
noIRQ  31      25      28      24      28      26

s/r    408     442     298     443     301     447
noRPM  13      1       13      1       13      1
noIRQ  32      25      30      25      28      25

s/r    408     444     310     450     298     439
noRPM  13      1       13      1       13      1
noIRQ  31      24      31      26      31      24

It clearly shows an improvement in the suspend path after
applying patches [1-3/5], easily attributable to patch [2/5],
and no clear difference after updating the async processing of
suppliers and consumers.
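
To make the three runs easier to compare, the "s/r" rows of the table above can be averaged per configuration. The snippet below only does that arithmetic on the numbers already listed. The variable names are arbitrary, and three runs on one system are too few to draw statistical conclusions from.

```python
from statistics import mean

# "s/r" suspend and resume times (ms) from the three runs in the table.
s_r_ms = {
    "Baseline":     {"suspend": [427, 408, 408], "resume": [449, 442, 444]},
    "Parent/child": {"suspend": [298, 298, 310], "resume": [450, 443, 450]},
    "Device links": {"suspend": [294, 301, 298], "resume": [442, 447, 439]},
}

averages = {
    config: {phase: mean(times) for phase, times in phases.items()}
    for config, phases in s_r_ms.items()
}

for config, avg in averages.items():
    print(f"{config:13s} suspend {avg['suspend']:6.1f} ms  "
          f"resume {avg['resume']:6.1f} ms")
```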

Note that there are systems where resume times are shorter after
patches [1-3/5] too, but more testing is necessary.

I do realize that this code can be optimized further, but it is not
particularly clear to me that any further optimizations would make
a significant difference and the changes in this series are deep
enough to do in one go.

Thanks!