Message ID | 1403705421-17597-2-git-send-email-lorenzo.pieralisi@arm.com |
---|---|
State | New |
Hi Lorenzo, On Wed, Jun 25, 2014 at 03:10:14PM +0100, Lorenzo Pieralisi wrote: > ARM based platforms implement a variety of power management schemes that > allow processors to enter idle states at run-time. > The parameters defining these idle states vary on a per-platform basis forcing > the OS to hardcode the state parameters in platform specific static tables > whose size grows as the number of platforms supported in the kernel increases > and hampers device drivers standardization. > > Therefore, this patch aims at standardizing idle state device tree bindings for > ARM platforms. Bindings define idle state parameters inclusive of entry methods > and state latencies, to allow operating systems to retrieve the configuration > entries from the device tree and initialize the related power management > drivers, paving the way for common code in the kernel to deal with idle > states and removing the need for static data in current and previous kernel > versions. > > Reviewed-by: Sebastian Capella <sebcape@gmail.com> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > --- > Documentation/devicetree/bindings/arm/cpus.txt | 8 + > .../devicetree/bindings/arm/idle-states.txt | 733 +++++++++++++++++++++ > 2 files changed, 741 insertions(+) > create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt [...] > +=========================================== > +3 - idle-states node > +=========================================== > + > +ARM processor idle states are defined within the idle-states node, which is > +a direct child of the cpus node [1] and provides a container where the > +processor idle states, defined as device tree nodes, are listed. > + > +- idle-states node > + > + Usage: Optional - On ARM systems, it is a container of processor idle > + states nodes. If the system does not provide CPU > + power management capabilities or the processor just > + supports idle_standby an idle-states node is not > + required. 
> + > + Description: idle-states node is a container node, where its > + subnodes describe the CPU idle states. > + > + Node name must be "idle-states". > + > + The idle-states node's parent node must be the cpus node. > + > + The idle-states node's child nodes can be: > + > + - one or more state nodes > + > + Any other configuration is considered invalid. > + > + An idle-states node defines the following properties: > + > + - entry-method > + Usage: Required > + Value type: <stringlist> > + Definition: Describes the method by which a CPU enters the > + idle states. This property is required and must be > + one of: > + > + - "arm,psci" > + ARM PSCI firmware interface [2]. > + > + - "[vendor],[method]" > + An implementation dependent string with > + format "vendor,method", where vendor is a string > + denoting the name of the manufacturer and > + method is a string specifying the mechanism > + used to enter the idle state. > + > +The nodes describing the idle states (state) can only be defined within the > +idle-states node, any other configuration is considered invalid and therefore > +must be ignored. > + > +=========================================== > +4 - state node > +=========================================== > + > +A state node represents an idle state description and must be defined as > +follows: > + > +- state node > + > + Description: must be child of the idle-states node > + > + The state node name shall follow standard device tree naming > + rules ([5], 2.2.1 "Node names"), in particular state nodes which > + are siblings within a single common parent must be given a unique name. > + > + The idle state entered by executing the wfi instruction (idle_standby > + SBSA,[3][4]) is considered standard on all ARM platforms and therefore > + must not be listed. 
> +
> +   With the definitions provided above, the following list represents
> +   the valid properties for a state node:
> +
> +   - compatible
> +       Usage: Required
> +       Value type: <stringlist>
> +       Definition: Must be "arm,idle-state".
> +
> +   - logic-state-retained
> +       Usage: See definition
> +       Value type: <none>
> +       Definition: if present logic is retained on state entry,
> +                   otherwise it is lost.

What logic state is retained? All system registers?

> +   - cache-state-retained
> +       Usage: See definition
> +       Value type: <none>
> +       Definition: if present cache memory is retained on state entry,
> +                   otherwise it is lost.

Likewise, how much of the cache hierarchy is affected? Any of it? All of
it?

> +   - timer-state-retained
> +       Usage: See definition
> +       Value type: <none>
> +       Definition: if present the timer control logic is retained on
> +                   state entry, otherwise it is lost.

The architected generic timers? Any CPU-local timers? Or any timers
whatsoever?

> +   - power-rank
> +       Usage: Required
> +       Value type: <u32>
> +       Definition: It represents the idle state power-rank.
> +                   An increasing value implies less power
> +                   consumption. It must be given a sequential
> +                   value = {0, 1, ....}, starting from 0.
> +                   Phandles in the cpu nodes [1] cpu-idle-states
> +                   array property are not allowed to point at idle
> +                   state nodes having the same power-rank value.

Why can't this be implicit in the order of the cpu-idle-states list?
That way it's impossible to violate the ordering requirement.

> +   - entry-method-param
> +       Usage: See definition.
> +       Value type: <u32>
> +       Definition: Depends on the idle-states node entry-method
> +                   property value. Refer to the entry-method bindings
> +                   for this property value definition.

Should this not be left up to the particular mechanism to describe? e.g.
for PSCI we could have a arm,psci-suspend-param property.

Are we sure a single u32 value is going to be sufficient?

Thanks,
Mark.
-- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
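For readers following the power-rank discussion above, the uniqueness rule in the quoted binding text, and the list-order alternative raised in the review, can be sketched in C roughly as follows. This is only an illustration: struct idle_state, power_ranks_unique() and implicit_power_rank() are hypothetical names that do not come from the patch or from any kernel API, and the sketch assumes the state nodes referenced by one cpu-idle-states list have already been parsed into memory.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one parsed state node (not a kernel type). */
struct idle_state {
	const char *name;
	uint32_t power_rank;	/* value of the explicit "power-rank" property */
};

/*
 * Check the rule stated in the binding text: the state nodes referenced by
 * one cpu-idle-states list must not share a power-rank value.
 */
static bool power_ranks_unique(const struct idle_state *const *states, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (states[i]->power_rank == states[j]->power_rank)
				return false;
	return true;
}

/*
 * Alternative raised in the review: take the rank from the position of the
 * phandle in the cpu-idle-states list, so duplicates cannot occur at all.
 */
static uint32_t implicit_power_rank(size_t index_in_list)
{
	return (uint32_t)index_in_list;
}
```

With explicit ranks a consumer has to run a check like power_ranks_unique() on every list; with the implicit variant there is nothing to validate, which is the point of the review comment.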
On Wed, 25 Jun 2014, Lorenzo Pieralisi wrote: > ARM based platforms implement a variety of power management schemes that > allow processors to enter idle states at run-time. > The parameters defining these idle states vary on a per-platform basis forcing > the OS to hardcode the state parameters in platform specific static tables > whose size grows as the number of platforms supported in the kernel increases > and hampers device drivers standardization. > > Therefore, this patch aims at standardizing idle state device tree bindings for > ARM platforms. Bindings define idle state parameters inclusive of entry methods > and state latencies, to allow operating systems to retrieve the configuration > entries from the device tree and initialize the related power management > drivers, paving the way for common code in the kernel to deal with idle > states and removing the need for static data in current and previous kernel > versions. > > Reviewed-by: Sebastian Capella <sebcape@gmail.com> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Excellent. Reviewed-by: Nicolas Pitre <nico@linaro.org> > --- > Documentation/devicetree/bindings/arm/cpus.txt | 8 + > .../devicetree/bindings/arm/idle-states.txt | 733 +++++++++++++++++++++ > 2 files changed, 741 insertions(+) > create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt > > diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt > index 1fe72a0..a44d4fd 100644 > --- a/Documentation/devicetree/bindings/arm/cpus.txt > +++ b/Documentation/devicetree/bindings/arm/cpus.txt > @@ -215,6 +215,12 @@ nodes to be present and contain the properties described below. > Value type: <phandle> > Definition: Specifies the ACC[2] node associated with this CPU. > > + - cpu-idle-states > + Usage: Optional > + Value type: <prop-encoded-array> > + Definition: > + # List of phandles to idle state nodes supported > + by this cpu [3]. 
> > Example 1 (dual-cluster big.LITTLE system 32-bit): > > @@ -411,3 +417,5 @@ cpus { > -- > [1] arm/msm/qcom,saw2.txt > [2] arm/msm/qcom,kpss-acc.txt > +[3] ARM Linux kernel documentation - idle states bindings > + Documentation/devicetree/bindings/arm/idle-states.txt > diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt > new file mode 100644 > index 0000000..5efd198 > --- /dev/null > +++ b/Documentation/devicetree/bindings/arm/idle-states.txt > @@ -0,0 +1,733 @@ > +========================================== > +ARM idle states binding description > +========================================== > + > +========================================== > +1 - Introduction > +========================================== > + > +ARM systems contain HW capable of managing power consumption dynamically, > +where cores can be put in different low-power states (ranging from simple > +wfi to power gating) according to OS PM policies. The CPU states representing > +the range of dynamic idle states that a processor can enter at run-time, can be > +specified through device tree bindings representing the parameters required > +to enter/exit specific idle states on a given processor. > + > +According to the Server Base System Architecture document (SBSA, [3]), the > +power states an ARM CPU can be put into are identified by the following list: > + > +- Running > +- Idle_standby > +- Idle_retention > +- Sleep > +- Off > + > +The power states described in the SBSA document define the basic CPU states on > +top of which ARM platforms implement power management schemes that allow an OS > +PM implementation to put the processor in different idle states (which include > +states listed above; "off" state is not an idle state since it does not have > +wake-up capabilities, hence it is not considered in this document). 
> + > +Idle state parameters (eg entry latency) are platform specific and need to be > +characterized with bindings that provide the required information to OS PM > +code so that it can build the required tables and use them at runtime. > + > +The device tree binding definition for ARM idle states is the subject of this > +document. > + > +=========================================== > +2 - idle-states definitions > +=========================================== > + > +Idle states are characterized for a specific system through a set of > +timing and energy related properties, that underline the HW behaviour > +triggered upon idle states entry and exit. > + > +The following diagram depicts the CPU execution phases and related timing > +properties required to enter and exit an idle state: > + > +..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__.. > + | | | | | > + > + |<------ entry ------->| > + | latency | > + |<- exit ->| > + | latency | > + |<-------- min-residency -------->| > + |<------- wakeup-latency ------->| > + > + Diagram 1: CPU idle state execution phases > + > +EXEC: Normal CPU execution. > + > +PREP: Preparation phase before committing the hardware to idle mode > + like cache flushing. This is abortable on pending wake-up > + event conditions. The abort latency is assumed to be negligible > + (i.e. less than the ENTRY + EXIT duration). If aborted, CPU > + goes back to EXEC. This phase is optional. If not abortable, > + this should be included in the ENTRY phase instead. > + > +ENTRY: The hardware is committed to idle mode. This period must run > + to completion up to IDLE before anything else can happen. > + > +IDLE: This is the actual energy-saving idle period. This may last > + between 0 and infinite time, until a wake-up event occurs. > + > +EXIT: Period during which the CPU is brought back to operational > + mode (EXEC). > + > +entry-latency: Worst case latency required to enter the idle state. 
The > +exit-latency may be guaranteed only after entry-latency has passed. > + > +min-residency: Minimum period, including preparation and entry, for a given > +idle state to be worthwhile energywise. > + > +wakeup-latency: Maximum delay between the signaling of a wake-up event and the > +CPU being able to execute normal code again. If not specified, this is assumed > +to be entry-latency + exit-latency. > + > +These timing parameters can be used by an OS in different circumstances. > + > +An idle CPU requires the expected min-residency time to select the most > +appropriate idle state based on the expected expiry time of the next IRQ > +(ie wake-up) that causes the CPU to return to the EXEC phase. > + > +An operating system scheduler may need to compute the shortest wake-up delay > +for CPUs in the system by detecting how long will it take to get a CPU out > +of an idle state, eg: > + > +wakeup-delay = exit-latency + max(entry-latency - (now - entry-timestamp), 0) > + > +In other words, the scheduler can make its scheduling decision by selecting > +(eg waking-up) the CPU with the shortest wake-up latency. > +The wake-up latency must take into account the entry latency if that period > +has not expired. The abortable nature of the PREP period can be ignored > +if it cannot be relied upon (e.g. the PREP deadline may occur much sooner than > +the worst case since it depends on the CPU operating conditions, ie caches > +state). > + > +An OS has to reliably probe the wakeup-latency since some devices can enforce > +latency constraints guarantees to work properly, so the OS has to detect the > +worst case wake-up latency it can incur if a CPU is allowed to enter an > +idle state, and possibly to prevent that to guarantee reliable device > +functioning. > + > +The min-residency time parameter deserves further explanation since it is > +expressed in time units but must factor in energy consumption coefficients. 
> + > +The energy consumption of a cpu when it enters a power state can be roughly > +characterised by the following graph: > + > + | > + | > + | > + e | > + n | /--- > + e | /------ > + r | /------ > + g | /----- > + y | /------ > + | ---- > + | /| > + | / | > + | / | > + | / | > + | / | > + | / | > + |/ | > + -----|-------+---------------------------------- > + 0| 1 time(ms) > + > + Graph 1: Energy vs time example > + > +The graph is split in two parts delimited by time 1ms on the X-axis. > +The graph curve with X-axis values = { x | 0 < x < 1ms } has a steep slope > +and denotes the energy costs incurred whilst entering and leaving the idle > +state. > +The graph curve in the area delimited by X-axis values = {x | x > 1ms } has > +shallower slope and essentially represents the energy consumption of the idle > +state. > + > +min-residency is defined for a given idle state as the minimum expected > +residency time for a state (inclusive of preparation and entry) after > +which choosing that state become the most energy efficient option. A good > +way to visualise this, is by taking the same graph above and comparing some > +states energy consumptions plots. 
> + > +For sake of simplicity, let's consider a system with two idle states IDLE1, > +and IDLE2: > + > + | > + | > + | > + | /-- IDLE1 > + e | /--- > + n | /---- > + e | /--- > + r | /-----/--------- IDLE2 > + g | /-------/--------- > + y | ------------ /---| > + | / /---- | > + | / /--- | > + | / /---- | > + | / /--- | > + | --- | > + | / | > + | / | > + |/ | time > + ---/----------------------------+------------------------ > + |IDLE1-energy < IDLE2-energy | IDLE2-energy < IDLE1-energy > + | > + IDLE2-min-residency > + > + Graph 2: idle states min-residency example > + > +In graph 2 above, that takes into account idle states entry/exit energy > +costs, it is clear that if the idle state residency time (ie time till next > +wake-up IRQ) is less than IDLE2-min-residency, IDLE1 is the better idle state > +choice energywise. > + > +This is mainly down to the fact that IDLE1 entry/exit energy costs are lower > +than IDLE2. > + > +However, the lower power consumption (ie shallower energy curve slope) of idle > +state IDLE2 implies that after a suitable time, IDLE2 becomes more energy > +efficient. > + > +The time at which IDLE2 becomes more energy efficient than IDLE1 (and other > +shallower states in a system with multiple idle states) is defined > +IDLE2-min-residency and corresponds to the time when energy consumption of > +IDLE1 and IDLE2 states breaks even. > + > +The definitions provided in this section underpin the idle states > +properties specification that is the subject of the following sections. > + > +=========================================== > +3 - idle-states node > +=========================================== > + > +ARM processor idle states are defined within the idle-states node, which is > +a direct child of the cpus node [1] and provides a container where the > +processor idle states, defined as device tree nodes, are listed. > + > +- idle-states node > + > + Usage: Optional - On ARM systems, it is a container of processor idle > + states nodes. 
If the system does not provide CPU > + power management capabilities or the processor just > + supports idle_standby an idle-states node is not > + required. > + > + Description: idle-states node is a container node, where its > + subnodes describe the CPU idle states. > + > + Node name must be "idle-states". > + > + The idle-states node's parent node must be the cpus node. > + > + The idle-states node's child nodes can be: > + > + - one or more state nodes > + > + Any other configuration is considered invalid. > + > + An idle-states node defines the following properties: > + > + - entry-method > + Usage: Required > + Value type: <stringlist> > + Definition: Describes the method by which a CPU enters the > + idle states. This property is required and must be > + one of: > + > + - "arm,psci" > + ARM PSCI firmware interface [2]. > + > + - "[vendor],[method]" > + An implementation dependent string with > + format "vendor,method", where vendor is a string > + denoting the name of the manufacturer and > + method is a string specifying the mechanism > + used to enter the idle state. > + > +The nodes describing the idle states (state) can only be defined within the > +idle-states node, any other configuration is considered invalid and therefore > +must be ignored. > + > +=========================================== > +4 - state node > +=========================================== > + > +A state node represents an idle state description and must be defined as > +follows: > + > +- state node > + > + Description: must be child of the idle-states node > + > + The state node name shall follow standard device tree naming > + rules ([5], 2.2.1 "Node names"), in particular state nodes which > + are siblings within a single common parent must be given a unique name. > + > + The idle state entered by executing the wfi instruction (idle_standby > + SBSA,[3][4]) is considered standard on all ARM platforms and therefore > + must not be listed. 
> + > + With the definitions provided above, the following list represents > + the valid properties for a state node: > + > + - compatible > + Usage: Required > + Value type: <stringlist> > + Definition: Must be "arm,idle-state". > + > + - logic-state-retained > + Usage: See definition > + Value type: <none> > + Definition: if present logic is retained on state entry, > + otherwise it is lost. > + > + - cache-state-retained > + Usage: See definition > + Value type: <none> > + Definition: if present cache memory is retained on state entry, > + otherwise it is lost. > + > + - timer-state-retained > + Usage: See definition > + Value type: <none> > + Definition: if present the timer control logic is retained on > + state entry, otherwise it is lost. > + > + - power-rank > + Usage: Required > + Value type: <u32> > + Definition: It represents the idle state power-rank. > + An increasing value implies less power > + consumption. It must be given a sequential > + value = {0, 1, ....}, starting from 0. > + Phandles in the cpu nodes [1] cpu-idle-states > + array property are not allowed to point at idle > + state nodes having the same power-rank value. > + > + - entry-method-param > + Usage: See definition. > + Value type: <u32> > + Definition: Depends on the idle-states node entry-method > + property value. Refer to the entry-method bindings > + for this property value definition. > + > + - entry-latency-us > + Usage: Required > + Value type: <prop-encoded-array> > + Definition: u32 value representing worst case latency in > + microseconds required to enter the idle state. > + The exit-latency-us duration may be guaranteed > + only after entry-latency-us has passed. > + > + - exit-latency-us > + Usage: Required > + Value type: <prop-encoded-array> > + Definition: u32 value representing worst case latency > + in microseconds required to exit the idle state. 
> + > + - min-residency-us > + Usage: Required > + Value type: <prop-encoded-array> > + Definition: u32 value representing minimum residency duration > + in microseconds, inclusive of preparation and > + entry, for this idle state to be considered > + worthwhile energy wise (refer to section 2 of > + this document for a complete description). > + > + - wakeup-latency-us: > + Usage: Optional > + Value type: <prop-encoded-array> > + Definition: u32 value representing maximum delay between the > + signaling of a wake-up event and the CPU being > + able to execute normal code again. If omitted, > + this is assumed to be equal to: > + > + entry-latency-us + exit-latency-us > + > + It is important to supply this value on systems > + where the duration of PREP phase (see diagram 1, > + section 2) is non-negligible. > + In such systems entry-latency-us + exit-latency-us > + will exceed wakeup-latency-us by this duration. > + > +=========================================== > +4 - Examples > +=========================================== > + > +Example 1 (ARM 64-bit, 16-cpu system): > + > +cpus { > + #size-cells = <0>; > + #address-cells = <2>; > + > + idle-states { > + entry-method = "arm,psci"; > + > + CPU_RETENTION_0_0: cpu-retention-0-0 { > + compatible = "arm,idle-state"; > + power-rank = <0>; > + logic-state-retained; > + cache-state-retained; > + entry-method-param = <0x0010000>; > + entry-latency-us = <20>; > + exit-latency-us = <40>; > + min-residency-us = <80>; > + }; > + > + CLUSTER_RETENTION_0: cluster-retention-0 { > + compatible = "arm,idle-state"; > + power-rank = <2>; > + cache-state-retained; > + entry-method-param = <0x1010000>; > + entry-latency-us = <50>; > + exit-latency-us = <100>; > + min-residency-us = <250>; > + wakeup-latency-us = <130>; > + }; > + > + CPU_SLEEP_0_0: cpu-sleep-0-0 { > + compatible = "arm,idle-state"; > + power-rank = <1>; > + entry-method-param = <0x0010000>; > + entry-latency-us = <250>; > + exit-latency-us = <500>; > + 
min-residency-us = <950>; > + }; > + > + CLUSTER_SLEEP_0: cluster-sleep-0 { > + compatible = "arm,idle-state"; > + power-rank = <3>; > + entry-method-param = <0x1010000>; > + entry-latency-us = <600>; > + exit-latency-us = <1100>; > + min-residency-us = <2700>; > + wakeup-latency-us = <1500>; > + }; > + > + CPU_RETENTION_1_0: cpu-retention-1-0 { > + compatible = "arm,idle-state"; > + power-rank = <0>; > + logic-state-retained; > + cache-state-retained; > + entry-method-param = <0x0010000>; > + entry-latency-us = <20>; > + exit-latency-us = <40>; > + min-residency-us = <90>; > + }; > + > + CLUSTER_RETENTION_1: cluster-retention-1 { > + compatible = "arm,idle-state"; > + power-rank = <2>; > + cache-state-retained; > + entry-method-param = <0x1010000>; > + entry-latency-us = <50>; > + exit-latency-us = <100>; > + min-residency-us = <270>; > + wakeup-latency-us = <100>; > + }; > + > + CPU_SLEEP_1_0: cpu-sleep-1-0 { > + compatible = "arm,idle-state"; > + power-rank = <1>; > + entry-method-param = <0x0010000>; > + entry-latency-us = <70>; > + exit-latency-us = <100>; > + min-residency-us = <300>; > + wakeup-latency-us = <150>; > + }; > + > + CLUSTER_SLEEP_1: cluster-sleep-1 { > + compatible = "arm,idle-state"; > + power-rank = <3>; > + entry-method-param = <0x1010000>; > + entry-latency-us = <500>; > + exit-latency-us = <1200>; > + min-residency-us = <3500>; > + wakeup-latency-us = <1300>; > + }; > + }; > + > + CPU0: cpu@0 { > + device_type = "cpu"; > + compatible = "arm,cortex-a57"; > + reg = <0x0 0x0>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 > + &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU1: cpu@1 { > + device_type = "cpu"; > + compatible = "arm,cortex-a57"; > + reg = <0x0 0x1>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 > + &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU2: cpu@100 { > + device_type = "cpu"; > + compatible = "arm,cortex-a57"; > + reg = <0x0 
0x100>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 > + &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU3: cpu@101 { > + device_type = "cpu"; > + compatible = "arm,cortex-a57"; > + reg = <0x0 0x101>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 > + &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU4: cpu@10000 { > + device_type = "cpu"; > + compatible = "arm,cortex-a57"; > + reg = <0x0 0x10000>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 > + &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU5: cpu@10001 { > + device_type = "cpu"; > + compatible = "arm,cortex-a57"; > + reg = <0x0 0x10001>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 > + &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU6: cpu@10100 { > + device_type = "cpu"; > + compatible = "arm,cortex-a57"; > + reg = <0x0 0x10100>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 > + &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU7: cpu@10101 { > + device_type = "cpu"; > + compatible = "arm,cortex-a57"; > + reg = <0x0 0x10101>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0 > + &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU8: cpu@100000000 { > + device_type = "cpu"; > + compatible = "arm,cortex-a53"; > + reg = <0x1 0x0>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 > + &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; > + }; > + > + CPU9: cpu@100000001 { > + device_type = "cpu"; > + compatible = "arm,cortex-a53"; > + reg = <0x1 0x1>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 > + &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; > + }; > + > + CPU10: cpu@100000100 { > + device_type = "cpu"; > + compatible = "arm,cortex-a53"; > + reg = <0x1 0x100>; > + 
enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 > + &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; > + }; > + > + CPU11: cpu@100000101 { > + device_type = "cpu"; > + compatible = "arm,cortex-a53"; > + reg = <0x1 0x101>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 > + &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; > + }; > + > + CPU12: cpu@100010000 { > + device_type = "cpu"; > + compatible = "arm,cortex-a53"; > + reg = <0x1 0x10000>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 > + &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; > + }; > + > + CPU13: cpu@100010001 { > + device_type = "cpu"; > + compatible = "arm,cortex-a53"; > + reg = <0x1 0x10001>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 > + &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; > + }; > + > + CPU14: cpu@100010100 { > + device_type = "cpu"; > + compatible = "arm,cortex-a53"; > + reg = <0x1 0x10100>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 > + &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; > + }; > + > + CPU15: cpu@100010101 { > + device_type = "cpu"; > + compatible = "arm,cortex-a53"; > + reg = <0x1 0x10101>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0 > + &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>; > + }; > +}; > + > +Example 2 (ARM 32-bit, 8-cpu system, two clusters): > + > +cpus { > + #size-cells = <0>; > + #address-cells = <1>; > + > + idle-states { > + entry-method = "arm,psci"; > + > + CPU_SLEEP_0_0: cpu-sleep-0-0 { > + compatible = "arm,idle-state"; > + power-rank = <0>; > + entry-method-param = <0x0010000>; > + entry-latency-us = <200>; > + exit-latency-us = <100>; > + min-residency-us = <400>; > + wakeup-latency-us = <250>; > + }; > + > + CLUSTER_SLEEP_0: cluster-sleep-0 { > + compatible = "arm,idle-state"; > + power-rank = <2>; > + entry-method-param = <0x1010000>; > + entry-latency-us = <500>; 
> + exit-latency-us = <1500>; > + min-residency-us = <2500>; > + wakeup-latency-us = <1700>; > + }; > + > + CPU_SLEEP_1_0: cpu-sleep-1-0 { > + compatible = "arm,idle-state"; > + power-rank = <1>; > + entry-method-param = <0x0010000>; > + entry-latency-us = <300>; > + exit-latency-us = <500>; > + min-residency-us = <900>; > + wakeup-latency-us = <600>; > + }; > + > + CLUSTER_SLEEP_1: cluster-sleep-1 { > + compatible = "arm,idle-state"; > + power-rank = <3>; > + entry-method-param = <0x1010000>; > + entry-latency-us = <800>; > + exit-latency-us = <2000>; > + min-residency-us = <6500>; > + wakeup-latency-us = <2300>; > + }; > + }; > + > + CPU0: cpu@0 { > + device_type = "cpu"; > + compatible = "arm,cortex-a15"; > + reg = <0x0>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU1: cpu@1 { > + device_type = "cpu"; > + compatible = "arm,cortex-a15"; > + reg = <0x1>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU2: cpu@2 { > + device_type = "cpu"; > + compatible = "arm,cortex-a15"; > + reg = <0x2>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU3: cpu@3 { > + device_type = "cpu"; > + compatible = "arm,cortex-a15"; > + reg = <0x3>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>; > + }; > + > + CPU4: cpu@100 { > + device_type = "cpu"; > + compatible = "arm,cortex-a7"; > + reg = <0x100>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>; > + }; > + > + CPU5: cpu@101 { > + device_type = "cpu"; > + compatible = "arm,cortex-a7"; > + reg = <0x101>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>; > + }; > + > + CPU6: cpu@102 { > + device_type = "cpu"; > + compatible = "arm,cortex-a7"; > + reg = <0x102>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>; > + }; > + > + CPU7: 
cpu@103 { > + device_type = "cpu"; > + compatible = "arm,cortex-a7"; > + reg = <0x103>; > + enable-method = "psci"; > + cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>; > + }; > +}; > + > +=========================================== > +5 - References > +=========================================== > + > +[1] ARM Linux Kernel documentation - CPUs bindings > + Documentation/devicetree/bindings/arm/cpus.txt > + > +[2] ARM Linux Kernel documentation - PSCI bindings > + Documentation/devicetree/bindings/arm/psci.txt > + > +[3] ARM Server Base System Architecture (SBSA) > + http://infocenter.arm.com/help/index.jsp > + > +[4] ARM Architecture Reference Manuals > + http://infocenter.arm.com/help/index.jsp > + > +[5] ePAPR standard > + https://www.power.org/documentation/epapr-version-1-1/ > -- > 1.9.1
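As a reading aid for section 2 of the quoted patch, the wakeup-delay expression and the default for an omitted wakeup-latency-us can be sketched in C as below. This is a minimal sketch under stated assumptions, not code from the patch: struct idle_timing and both function names are hypothetical, a value of 0 is used here to stand for "wakeup-latency-us omitted", and timestamps are assumed to be microsecond counters with now_us never earlier than entry_ts_us.

```c
#include <stdint.h>

/* Hypothetical container for the timing properties of one state node. */
struct idle_timing {
	uint32_t entry_latency_us;
	uint32_t exit_latency_us;
	uint32_t wakeup_latency_us;	/* assumption: 0 means the property was omitted */
};

/* Per the binding text, an omitted wakeup latency defaults to entry + exit. */
static uint32_t effective_wakeup_latency_us(const struct idle_timing *t)
{
	if (t->wakeup_latency_us)
		return t->wakeup_latency_us;
	return t->entry_latency_us + t->exit_latency_us;
}

/*
 * wakeup-delay = exit-latency + max(entry-latency - (now - entry-timestamp), 0)
 *
 * The subtraction is done only when the entry period has not yet expired,
 * so the unsigned arithmetic cannot wrap below zero.
 */
static uint64_t wakeup_delay_us(const struct idle_timing *t,
				uint64_t now_us, uint64_t entry_ts_us)
{
	uint64_t elapsed = now_us - entry_ts_us;
	uint64_t remaining_entry = 0;

	if (elapsed < t->entry_latency_us)
		remaining_entry = t->entry_latency_us - elapsed;
	return t->exit_latency_us + remaining_entry;
}
```

Taking CPU_SLEEP_0_0 from Example 1 (entry 250us, exit 500us, no wakeup-latency-us), the effective wake-up latency is 750us, and a CPU that started entering the state 100us ago has a wake-up delay of 500 + 150 = 650us.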
On Wed, Jun 25, 2014 at 03:58:49PM +0100, Mark Rutland wrote: > Hi Lorenzo, > > On Wed, Jun 25, 2014 at 03:10:14PM +0100, Lorenzo Pieralisi wrote: > > ARM based platforms implement a variety of power management schemes that > > allow processors to enter idle states at run-time. > > The parameters defining these idle states vary on a per-platform basis forcing > > the OS to hardcode the state parameters in platform specific static tables > > whose size grows as the number of platforms supported in the kernel increases > > and hampers device drivers standardization. > > > > Therefore, this patch aims at standardizing idle state device tree bindings for > > ARM platforms. Bindings define idle state parameters inclusive of entry methods > > and state latencies, to allow operating systems to retrieve the configuration > > entries from the device tree and initialize the related power management > > drivers, paving the way for common code in the kernel to deal with idle > > states and removing the need for static data in current and previous kernel > > versions. > > > > Reviewed-by: Sebastian Capella <sebcape@gmail.com> > > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > > --- > > Documentation/devicetree/bindings/arm/cpus.txt | 8 + > > .../devicetree/bindings/arm/idle-states.txt | 733 +++++++++++++++++++++ > > 2 files changed, 741 insertions(+) > > create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt > > [...] > > > +=========================================== > > +3 - idle-states node > > +=========================================== > > + > > +ARM processor idle states are defined within the idle-states node, which is > > +a direct child of the cpus node [1] and provides a container where the > > +processor idle states, defined as device tree nodes, are listed. > > + > > +- idle-states node > > + > > + Usage: Optional - On ARM systems, it is a container of processor idle > > + states nodes. 
If the system does not provide CPU > > + power management capabilities or the processor just > > + supports idle_standby an idle-states node is not > > + required. > > + > > + Description: idle-states node is a container node, where its > > + subnodes describe the CPU idle states. > > + > > + Node name must be "idle-states". > > + > > + The idle-states node's parent node must be the cpus node. > > + > > + The idle-states node's child nodes can be: > > + > > + - one or more state nodes > > + > > + Any other configuration is considered invalid. > > + > > + An idle-states node defines the following properties: > > + > > + - entry-method > > + Usage: Required > > + Value type: <stringlist> > > + Definition: Describes the method by which a CPU enters the > > + idle states. This property is required and must be > > + one of: > > + > > + - "arm,psci" > > + ARM PSCI firmware interface [2]. > > + > > + - "[vendor],[method]" > > + An implementation dependent string with > > + format "vendor,method", where vendor is a string > > + denoting the name of the manufacturer and > > + method is a string specifying the mechanism > > + used to enter the idle state. > > + > > +The nodes describing the idle states (state) can only be defined within the > > +idle-states node, any other configuration is considered invalid and therefore > > +must be ignored. > > + > > +=========================================== > > +4 - state node > > +=========================================== > > + > > +A state node represents an idle state description and must be defined as > > +follows: > > + > > +- state node > > + > > + Description: must be child of the idle-states node > > + > > + The state node name shall follow standard device tree naming > > + rules ([5], 2.2.1 "Node names"), in particular state nodes which > > + are siblings within a single common parent must be given a unique name. 
> > + > > + The idle state entered by executing the wfi instruction (idle_standby > > + SBSA,[3][4]) is considered standard on all ARM platforms and therefore > > + must not be listed. > > + > > + With the definitions provided above, the following list represents > > + the valid properties for a state node: > > + > > + - compatible > > + Usage: Required > > + Value type: <stringlist> > > + Definition: Must be "arm,idle-state". > > + > > + - logic-state-retained > > + Usage: See definition > > + Value type: <none> > > + Definition: if present logic is retained on state entry, > > + otherwise it is lost. > > What logic state is retained? All system registers? > > > + - cache-state-retained > > + Usage: See definition > > + Value type: <none> > > + Definition: if present cache memory is retained on state entry, > > + otherwise it is lost. > > Likewise, how much of the cache hierarchy is affected? Any of it? All of > it? Well, to be honest these properties are shortcuts. If we wanted to do things properly, I should have added power domains into the picture (actually I did in the earlier versions of the bindings and later streamlined them) so that every device inclusive of CPUs and caches can be linked to a power domain, and from that linkage we could detect what's lost when an idle state is entered. PSCI does not need the two properties above (but that's no valid reason to remove them, or to avoid adding power domains). In case power domains are added, we need to know if the caches are lost or retained and this flag specifies that. I can add these properties when they are needed ie not in the current bindings. > > + - timer-state-retained > > + Usage: See definition > > + Value type: <none> > > + Definition: if present the timer control logic is retained on > > + state entry, otherwise it is lost. > > The architected generic timers? Any CPU-local timers? Or any timers > whatsoever? See above. 
Without power domains (and even with power domains attaching a tick device
to a power domain is far from being a simple job) it is impossible to know
if the tick device (what timer lies behind it is unknown to the idle driver)
is lost on idle state entry.

PowerPC guys got around that by adding a flag to DT which is a Linux
specific thing, this property is also a Linux specific property, but I
think it is a problem present in other OS too (and that ACPI solves the
same way I did). On x86, the idle state index defines what states lose the
local timer so this stuff is not needed at all.

My question is, and that's a very important one: is it worth going the
whole nine yards, implementing bindings with power domains and parse all
this stuff in the kernel to set a flag for some CPUidle states ?

Complexity behind this is significant, but using the power domains is the
proper way to do it. The only alternative lies in always setting the
CPUIDLE_FLAG_TIMER_STOP on all idle states (which turns out as a nop if the
tick device does not have C3STOP in its features).

Or maybe we can get away with adding the compatible string of the timer
that is lost on idle state entry if any ? That's horrible but that's
another possibility.

I know I am talking DT with kernel code in mind, but in this specific case
it is pretty hard to do otherwise.

Comments very welcome and encouraged because that's a blocking point.

> > + - power-rank
> > +     Usage: Required
> > +     Value type: <u32>
> > +     Definition: It represents the idle state power-rank.
> > +                 An increasing value implies less power
> > +                 consumption. It must be given a sequential
> > +                 value = {0, 1, ....}, starting from 0.
> > +                 Phandles in the cpu nodes [1] cpu-idle-states
> > +                 array property are not allowed to point at idle
> > +                 state nodes having the same power-rank value.
>
> Why can't this be implicit in the order of the cpu-idle-states list?
> That way it's impossible to violate the ordering requirement.
You mean the phandles list in the cpu nodes ? Maybe, but this would
require the list to be the same order for all cpu nodes on which the
idle states are valid, or just take one and use that.

It can be viable, as long as everyone agrees, every time I post this
code someone comes up with a new idea on how to sort the states and
honestly I would like to be done with that.

>
> > + - entry-method-param
> > +     Usage: See definition.
> > +     Value type: <u32>
> > +     Definition: Depends on the idle-states node entry-method
> > +                 property value. Refer to the entry-method bindings
> > +                 for this property value definition.
>
> Should this not be left up to the particular mechanism to describe?
> e.g. for PSCI we could have a arm,psci-suspend-param property.

It was like that in early postings, and probably was better than the
current definition. I need to think about that but I am almost convinced
you are right.

> Are we sure a single u32 value is going to be sufficient?

Well, it is for PSCI, so see above, adding generality when it is not
present is a risky business, hoping that a u32 parameter will work for
other entry methods is an unsafe bet, you are right.

Thanks,
Lorenzo
On Wed, Jun 25, 2014 at 04:56:02PM +0100, Nicolas Pitre wrote:
> On Wed, 25 Jun 2014, Lorenzo Pieralisi wrote:
>
> > ARM based platforms implement a variety of power management schemes that
> > allow processors to enter idle states at run-time.
> > The parameters defining these idle states vary on a per-platform basis forcing
> > the OS to hardcode the state parameters in platform specific static tables
> > whose size grows as the number of platforms supported in the kernel increases
> > and hampers device drivers standardization.
> >
> > Therefore, this patch aims at standardizing idle state device tree bindings for
> > ARM platforms. Bindings define idle state parameters inclusive of entry methods
> > and state latencies, to allow operating systems to retrieve the configuration
> > entries from the device tree and initialize the related power management
> > drivers, paving the way for common code in the kernel to deal with idle
> > states and removing the need for static data in current and previous kernel
> > versions.
> >
> > Reviewed-by: Sebastian Capella <sebcape@gmail.com>
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>
> Excellent.
>
> Reviewed-by: Nicolas Pitre <nico@linaro.org>

Thanks Nico, there are still a couple of niggles to sort out (ie local
timer state), but the bulk of the document should be complete I hope.

I will postpone adding your (and Seb's) Reviewed-by until we have a
final agreement if it is ok with you.

Thanks !
Lorenzo
On Wed, Jun 25, 2014 at 12:37 PM, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote: > On Wed, Jun 25, 2014 at 03:58:49PM +0100, Mark Rutland wrote: >> Hi Lorenzo, >> >> On Wed, Jun 25, 2014 at 03:10:14PM +0100, Lorenzo Pieralisi wrote: >> > ARM based platforms implement a variety of power management schemes that >> > allow processors to enter idle states at run-time. >> > The parameters defining these idle states vary on a per-platform basis forcing >> > the OS to hardcode the state parameters in platform specific static tables >> > whose size grows as the number of platforms supported in the kernel increases >> > and hampers device drivers standardization. >> > >> > Therefore, this patch aims at standardizing idle state device tree bindings for >> > ARM platforms. Bindings define idle state parameters inclusive of entry methods >> > and state latencies, to allow operating systems to retrieve the configuration >> > entries from the device tree and initialize the related power management >> > drivers, paving the way for common code in the kernel to deal with idle >> > states and removing the need for static data in current and previous kernel >> > versions. >> > >> > Reviewed-by: Sebastian Capella <sebcape@gmail.com> >> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> >> > --- >> > Documentation/devicetree/bindings/arm/cpus.txt | 8 + >> > .../devicetree/bindings/arm/idle-states.txt | 733 +++++++++++++++++++++ >> > 2 files changed, 741 insertions(+) >> > create mode 100644 Documentation/devicetree/bindings/arm/idle-states.txt >> >> [...] >> > + - power-rank >> > + Usage: Required >> > + Value type: <u32> >> > + Definition: It represents the idle state power-rank. >> > + An increasing value implies less power >> > + consumption. It must be given a sequential >> > + value = {0, 1, ....}, starting from 0. 
>> > +                 Phandles in the cpu nodes [1] cpu-idle-states
>> > +                 array property are not allowed to point at idle
>> > +                 state nodes having the same power-rank value.
>>
>> Why can't this be implicit in the order of the cpu-idle-states list?
>> That way it's impossible to violate the ordering requirement.
>
> You mean the phandles list in the cpu nodes ? Maybe, but this would
> require the list to be the same order for all cpu nodes on which the
> idle states are valid, or just take one and use that.
>
> It can be viable, as long as everyone agrees, every time I post this
> code someone comes up with a new idea on how to sort the states and
> honestly I would like to be done with that.

power-rank feels like an index in disguise. I agree with the phandle
list defining the order.

>> > + - entry-method-param
>> > +     Usage: See definition.
>> > +     Value type: <u32>
>> > +     Definition: Depends on the idle-states node entry-method
>> > +                 property value. Refer to the entry-method bindings
>> > +                 for this property value definition.
>>
>> Should this not be left up to the particular mechanism to describe?
>> e.g. for PSCI we could have a arm,psci-suspend-param property.
>
> It was like that in early postings, and probably was better than the
> current definition. I need to think about that but I am almost convinced
> you are right.

I think arm,psci-suspend-param is the right way to go.

Rob
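For readers following the two points of agreement above, here is a hedged sketch of how the state nodes might look if power-rank were dropped in favour of the cpu-idle-states phandle order and entry-method-param were replaced by a PSCI-specific property. The property name arm,psci-suspend-param is the one suggested in the thread. The assumption that the phandle list runs from the shallowest to the deepest state, and the reuse of the latency and parameter values from Example 1 of the patch, are illustrative only and not part of the posted binding.

```dts
/dts-v1/;

/ {
	cpus {
		#size-cells = <0>;
		#address-cells = <2>;

		idle-states {
			entry-method = "arm,psci";

			/* No power-rank: ordering is taken from cpu-idle-states. */
			CPU_SLEEP_0_0: cpu-sleep-0-0 {
				compatible = "arm,idle-state";
				/* Hypothetical replacement for entry-method-param. */
				arm,psci-suspend-param = <0x0010000>;
				entry-latency-us = <250>;
				exit-latency-us = <500>;
				min-residency-us = <950>;
			};

			CLUSTER_SLEEP_0: cluster-sleep-0 {
				compatible = "arm,idle-state";
				arm,psci-suspend-param = <0x1010000>;
				entry-latency-us = <600>;
				exit-latency-us = <1100>;
				min-residency-us = <2700>;
				wakeup-latency-us = <1500>;
			};
		};

		CPU0: cpu@0 {
			device_type = "cpu";
			compatible = "arm,cortex-a57";
			reg = <0x0 0x0>;
			enable-method = "psci";
			/* Assumed convention: shallowest state first, deepest last. */
			cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
		};
	};
};
```

With this layout two states can no longer share a rank by mistake, but, as noted earlier in the thread, every cpu node that references the same states would have to list them in the same relative order.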
On Thu, 26 Jun 2014, Lorenzo Pieralisi wrote:
> On Wed, Jun 25, 2014 at 04:56:02PM +0100, Nicolas Pitre wrote:
> > On Wed, 25 Jun 2014, Lorenzo Pieralisi wrote:
> >
> > > ARM based platforms implement a variety of power management schemes that
> > > allow processors to enter idle states at run-time.
> > > The parameters defining these idle states vary on a per-platform basis forcing
> > > the OS to hardcode the state parameters in platform specific static tables
> > > whose size grows as the number of platforms supported in the kernel increases
> > > and hampers device drivers standardization.
> > >
> > > Therefore, this patch aims at standardizing idle state device tree bindings for
> > > ARM platforms. Bindings define idle state parameters inclusive of entry methods
> > > and state latencies, to allow operating systems to retrieve the configuration
> > > entries from the device tree and initialize the related power management
> > > drivers, paving the way for common code in the kernel to deal with idle
> > > states and removing the need for static data in current and previous kernel
> > > versions.
> > >
> > > Reviewed-by: Sebastian Capella <sebcape@gmail.com>
> > > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> >
> > Excellent.
> >
> > Reviewed-by: Nicolas Pitre <nico@linaro.org>
>
> Thanks Nico, there are still a couple of niggles to sort out (ie local
> timer state), but the bulk of the document should be complete I hope.
>
> I will postpone adding your (and Seb's) Reviewed-by until we have a
> final agreement if it is ok with you.

As you wish. The parts I care about are now well covered. I don't think
I know enough about timers to comment further.

Nicolas
On Wed, Jun 25, 2014 at 03:58:49PM +0100, Mark Rutland wrote: [...] > > +=========================================== > > +4 - state node > > +=========================================== > > + > > +A state node represents an idle state description and must be defined as > > +follows: > > + > > +- state node > > + > > + Description: must be child of the idle-states node > > + > > + The state node name shall follow standard device tree naming > > + rules ([5], 2.2.1 "Node names"), in particular state nodes which > > + are siblings within a single common parent must be given a unique name. > > + > > + The idle state entered by executing the wfi instruction (idle_standby > > + SBSA,[3][4]) is considered standard on all ARM platforms and therefore > > + must not be listed. > > + > > + With the definitions provided above, the following list represents > > + the valid properties for a state node: > > + > > + - compatible > > + Usage: Required > > + Value type: <stringlist> > > + Definition: Must be "arm,idle-state". > > + > > + - logic-state-retained > > + Usage: See definition > > + Value type: <none> > > + Definition: if present logic is retained on state entry, > > + otherwise it is lost. > > What logic state is retained? All system registers? > > > + - cache-state-retained > > + Usage: See definition > > + Value type: <none> > > + Definition: if present cache memory is retained on state entry, > > + otherwise it is lost. > > Likewise, how much of the cache hierarchy is affected? Any of it? All of > it? > > > + - timer-state-retained > > + Usage: See definition > > + Value type: <none> > > + Definition: if present the timer control logic is retained on > > + state entry, otherwise it is lost. > > The architected generic timers? Any CPU-local timers? Or any timers > whatsoever? Ok, as I mentioned this timer property is a blocking point for the entire set. 
I gave it more thought, and it is a very hard nut to crack, even if we
resort to power domains (tick devices do not even contain struct device
or device node pointers, even if I added a list of phandles to timers
that are lost on idle state entry I would not be able to figure out if the
tick device is lost on idle state entry). I am reasoning in kernel terms,
I know it is bad but I can't help it in this case.

Would a boolean property like the following one be deemed acceptable, eg:

- local-timer-stop

I want to be 100% honest here, this might turn out a Linux specific thing,
or might be not, but I still think it is representative of how HW works.

Comments welcome and would be very appreciated on this specific detail.

Thanks,
Lorenzo
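As an illustration of the proposal above, a minimal sketch of state nodes carrying the boolean flag follows. Only the property name local-timer-stop comes from the message. Which states would set it is an assumption made here for illustration (the cluster-level state loses the local timer, the cpu-level state does not), and the remaining properties and values are borrowed unchanged from Example 2 of the patch.

```dts
/dts-v1/;

/ {
	cpus {
		#size-cells = <0>;
		#address-cells = <1>;

		idle-states {
			entry-method = "arm,psci";

			/* Assumed: local timer keeps running, so no flag. */
			CPU_SLEEP_1_0: cpu-sleep-1-0 {
				compatible = "arm,idle-state";
				power-rank = <1>;
				entry-method-param = <0x0010000>;
				entry-latency-us = <300>;
				exit-latency-us = <500>;
				min-residency-us = <900>;
				wakeup-latency-us = <600>;
			};

			/* Assumed: local timer stops on entry, so the flag is set. */
			CLUSTER_SLEEP_1: cluster-sleep-1 {
				compatible = "arm,idle-state";
				power-rank = <3>;
				local-timer-stop;
				entry-method-param = <0x1010000>;
				entry-latency-us = <800>;
				exit-latency-us = <2000>;
				min-residency-us = <6500>;
				wakeup-latency-us = <2300>;
			};
		};

		cpu@103 {
			device_type = "cpu";
			compatible = "arm,cortex-a7";
			reg = <0x103>;
			enable-method = "psci";
			cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
		};
	};
};
```

A valueless property of this kind would replace timer-state-retained with the opposite polarity: absence means the timer survives, so existing state descriptions stay valid without changes.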
diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt index 1fe72a0..a44d4fd 100644 --- a/Documentation/devicetree/bindings/arm/cpus.txt +++ b/Documentation/devicetree/bindings/arm/cpus.txt @@ -215,6 +215,12 @@ nodes to be present and contain the properties described below. Value type: <phandle> Definition: Specifies the ACC[2] node associated with this CPU. + - cpu-idle-states + Usage: Optional + Value type: <prop-encoded-array> + Definition: + # List of phandles to idle state nodes supported + by this cpu [3]. Example 1 (dual-cluster big.LITTLE system 32-bit): @@ -411,3 +417,5 @@ cpus { -- [1] arm/msm/qcom,saw2.txt [2] arm/msm/qcom,kpss-acc.txt +[3] ARM Linux kernel documentation - idle states bindings + Documentation/devicetree/bindings/arm/idle-states.txt diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt new file mode 100644 index 0000000..5efd198 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/idle-states.txt @@ -0,0 +1,733 @@ +========================================== +ARM idle states binding description +========================================== + +========================================== +1 - Introduction +========================================== + +ARM systems contain HW capable of managing power consumption dynamically, +where cores can be put in different low-power states (ranging from simple +wfi to power gating) according to OS PM policies. The CPU states representing +the range of dynamic idle states that a processor can enter at run-time, can be +specified through device tree bindings representing the parameters required +to enter/exit specific idle states on a given processor. 
+ +According to the Server Base System Architecture document (SBSA, [3]), the +power states an ARM CPU can be put into are identified by the following list: + +- Running +- Idle_standby +- Idle_retention +- Sleep +- Off + +The power states described in the SBSA document define the basic CPU states on +top of which ARM platforms implement power management schemes that allow an OS +PM implementation to put the processor in different idle states (which include +states listed above; "off" state is not an idle state since it does not have +wake-up capabilities, hence it is not considered in this document). + +Idle state parameters (eg entry latency) are platform specific and need to be +characterized with bindings that provide the required information to OS PM +code so that it can build the required tables and use them at runtime. + +The device tree binding definition for ARM idle states is the subject of this +document. + +=========================================== +2 - idle-states definitions +=========================================== + +Idle states are characterized for a specific system through a set of +timing and energy related properties, that underline the HW behaviour +triggered upon idle states entry and exit. + +The following diagram depicts the CPU execution phases and related timing +properties required to enter and exit an idle state: + +..__[EXEC]__|__[PREP]__|__[ENTRY]__|__[IDLE]__|__[EXIT]__|__[EXEC]__.. + | | | | | + + |<------ entry ------->| + | latency | + |<- exit ->| + | latency | + |<-------- min-residency -------->| + |<------- wakeup-latency ------->| + + Diagram 1: CPU idle state execution phases + +EXEC: Normal CPU execution. + +PREP: Preparation phase before committing the hardware to idle mode + like cache flushing. This is abortable on pending wake-up + event conditions. The abort latency is assumed to be negligible + (i.e. less than the ENTRY + EXIT duration). If aborted, CPU + goes back to EXEC. This phase is optional. 
If not abortable, + this should be included in the ENTRY phase instead. + +ENTRY: The hardware is committed to idle mode. This period must run + to completion up to IDLE before anything else can happen. + +IDLE: This is the actual energy-saving idle period. This may last + between 0 and infinite time, until a wake-up event occurs. + +EXIT: Period during which the CPU is brought back to operational + mode (EXEC). + +entry-latency: Worst case latency required to enter the idle state. The +exit-latency may be guaranteed only after entry-latency has passed. + +min-residency: Minimum period, including preparation and entry, for a given +idle state to be worthwhile energywise. + +wakeup-latency: Maximum delay between the signaling of a wake-up event and the +CPU being able to execute normal code again. If not specified, this is assumed +to be entry-latency + exit-latency. + +These timing parameters can be used by an OS in different circumstances. + +An idle CPU requires the expected min-residency time to select the most +appropriate idle state based on the expected expiry time of the next IRQ +(ie wake-up) that causes the CPU to return to the EXEC phase. + +An operating system scheduler may need to compute the shortest wake-up delay +for CPUs in the system by detecting how long will it take to get a CPU out +of an idle state, eg: + +wakeup-delay = exit-latency + max(entry-latency - (now - entry-timestamp), 0) + +In other words, the scheduler can make its scheduling decision by selecting +(eg waking-up) the CPU with the shortest wake-up latency. +The wake-up latency must take into account the entry latency if that period +has not expired. The abortable nature of the PREP period can be ignored +if it cannot be relied upon (e.g. the PREP deadline may occur much sooner than +the worst case since it depends on the CPU operating conditions, ie caches +state). 
+ +An OS has to reliably probe the wakeup-latency since some devices can enforce +latency constraints guarantees to work properly, so the OS has to detect the +worst case wake-up latency it can incur if a CPU is allowed to enter an +idle state, and possibly to prevent that to guarantee reliable device +functioning. + +The min-residency time parameter deserves further explanation since it is +expressed in time units but must factor in energy consumption coefficients. + +The energy consumption of a cpu when it enters a power state can be roughly +characterised by the following graph: + + | + | + | + e | + n | /--- + e | /------ + r | /------ + g | /----- + y | /------ + | ---- + | /| + | / | + | / | + | / | + | / | + | / | + |/ | + -----|-------+---------------------------------- + 0| 1 time(ms) + + Graph 1: Energy vs time example + +The graph is split in two parts delimited by time 1ms on the X-axis. +The graph curve with X-axis values = { x | 0 < x < 1ms } has a steep slope +and denotes the energy costs incurred whilst entering and leaving the idle +state. +The graph curve in the area delimited by X-axis values = {x | x > 1ms } has +shallower slope and essentially represents the energy consumption of the idle +state. + +min-residency is defined for a given idle state as the minimum expected +residency time for a state (inclusive of preparation and entry) after +which choosing that state become the most energy efficient option. A good +way to visualise this, is by taking the same graph above and comparing some +states energy consumptions plots. 
+ +For sake of simplicity, let's consider a system with two idle states IDLE1, +and IDLE2: + + | + | + | + | /-- IDLE1 + e | /--- + n | /---- + e | /--- + r | /-----/--------- IDLE2 + g | /-------/--------- + y | ------------ /---| + | / /---- | + | / /--- | + | / /---- | + | / /--- | + | --- | + | / | + | / | + |/ | time + ---/----------------------------+------------------------ + |IDLE1-energy < IDLE2-energy | IDLE2-energy < IDLE1-energy + | + IDLE2-min-residency + + Graph 2: idle states min-residency example + +In graph 2 above, that takes into account idle states entry/exit energy +costs, it is clear that if the idle state residency time (ie time till next +wake-up IRQ) is less than IDLE2-min-residency, IDLE1 is the better idle state +choice energywise. + +This is mainly down to the fact that IDLE1 entry/exit energy costs are lower +than IDLE2. + +However, the lower power consumption (ie shallower energy curve slope) of idle +state IDLE2 implies that after a suitable time, IDLE2 becomes more energy +efficient. + +The time at which IDLE2 becomes more energy efficient than IDLE1 (and other +shallower states in a system with multiple idle states) is defined +IDLE2-min-residency and corresponds to the time when energy consumption of +IDLE1 and IDLE2 states breaks even. + +The definitions provided in this section underpin the idle states +properties specification that is the subject of the following sections. + +=========================================== +3 - idle-states node +=========================================== + +ARM processor idle states are defined within the idle-states node, which is +a direct child of the cpus node [1] and provides a container where the +processor idle states, defined as device tree nodes, are listed. + +- idle-states node + + Usage: Optional - On ARM systems, it is a container of processor idle + states nodes. 
If the system does not provide CPU + power management capabilities or the processor just + supports idle_standby an idle-states node is not + required. + + Description: idle-states node is a container node, where its + subnodes describe the CPU idle states. + + Node name must be "idle-states". + + The idle-states node's parent node must be the cpus node. + + The idle-states node's child nodes can be: + + - one or more state nodes + + Any other configuration is considered invalid. + + An idle-states node defines the following properties: + + - entry-method + Usage: Required + Value type: <stringlist> + Definition: Describes the method by which a CPU enters the + idle states. This property is required and must be + one of: + + - "arm,psci" + ARM PSCI firmware interface [2]. + + - "[vendor],[method]" + An implementation dependent string with + format "vendor,method", where vendor is a string + denoting the name of the manufacturer and + method is a string specifying the mechanism + used to enter the idle state. + +The nodes describing the idle states (state) can only be defined within the +idle-states node, any other configuration is considered invalid and therefore +must be ignored. + +=========================================== +4 - state node +=========================================== + +A state node represents an idle state description and must be defined as +follows: + +- state node + + Description: must be child of the idle-states node + + The state node name shall follow standard device tree naming + rules ([5], 2.2.1 "Node names"), in particular state nodes which + are siblings within a single common parent must be given a unique name. + + The idle state entered by executing the wfi instruction (idle_standby + SBSA,[3][4]) is considered standard on all ARM platforms and therefore + must not be listed. 
+ + With the definitions provided above, the following list represents + the valid properties for a state node: + + - compatible + Usage: Required + Value type: <stringlist> + Definition: Must be "arm,idle-state". + + - logic-state-retained + Usage: See definition + Value type: <none> + Definition: if present logic is retained on state entry, + otherwise it is lost. + + - cache-state-retained + Usage: See definition + Value type: <none> + Definition: if present cache memory is retained on state entry, + otherwise it is lost. + + - timer-state-retained + Usage: See definition + Value type: <none> + Definition: if present the timer control logic is retained on + state entry, otherwise it is lost. + + - power-rank + Usage: Required + Value type: <u32> + Definition: It represents the idle state power-rank. + An increasing value implies less power + consumption. It must be given a sequential + value = {0, 1, ....}, starting from 0. + Phandles in the cpu nodes [1] cpu-idle-states + array property are not allowed to point at idle + state nodes having the same power-rank value. + + - entry-method-param + Usage: See definition. + Value type: <u32> + Definition: Depends on the idle-states node entry-method + property value. Refer to the entry-method bindings + for this property value definition. + + - entry-latency-us + Usage: Required + Value type: <prop-encoded-array> + Definition: u32 value representing worst case latency in + microseconds required to enter the idle state. + The exit-latency-us duration may be guaranteed + only after entry-latency-us has passed. + + - exit-latency-us + Usage: Required + Value type: <prop-encoded-array> + Definition: u32 value representing worst case latency + in microseconds required to exit the idle state. 
+
+	- min-residency-us
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing minimum residency duration
+			    in microseconds, inclusive of preparation and
+			    entry, for this idle state to be considered
+			    worthwhile energy wise (refer to section 2 of
+			    this document for a complete description).
+
+	- wakeup-latency-us:
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing maximum delay between the
+			    signaling of a wake-up event and the CPU being
+			    able to execute normal code again. If omitted,
+			    this is assumed to be equal to:
+
+				entry-latency-us + exit-latency-us
+
+			    It is important to supply this value on systems
+			    where the duration of PREP phase (see diagram 1,
+			    section 2) is non-negligible.
+			    In such systems entry-latency-us + exit-latency-us
+			    will exceed wakeup-latency-us by this duration.
+
+===========================================
+4 - Examples
+===========================================
+
+Example 1 (ARM 64-bit, 16-cpu system):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <2>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_RETENTION_0_0: cpu-retention-0-0 {
+			compatible = "arm,idle-state";
+			power-rank = <0>;
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <80>;
+		};
+
+		CLUSTER_RETENTION_0: cluster-retention-0 {
+			compatible = "arm,idle-state";
+			power-rank = <2>;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <250>;
+			wakeup-latency-us = <130>;
+		};
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			power-rank = <1>;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <250>;
+			exit-latency-us = <500>;
+			min-residency-us = <950>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			power-rank = <3>;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <600>;
+			exit-latency-us = <1100>;
+			min-residency-us = <2700>;
+			wakeup-latency-us = <1500>;
+		};
+
+		CPU_RETENTION_1_0: cpu-retention-1-0 {
+			compatible = "arm,idle-state";
+			power-rank = <0>;
+			logic-state-retained;
+			cache-state-retained;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <20>;
+			exit-latency-us = <40>;
+			min-residency-us = <90>;
+		};
+
+		CLUSTER_RETENTION_1: cluster-retention-1 {
+			compatible = "arm,idle-state";
+			power-rank = <2>;
+			cache-state-retained;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <50>;
+			exit-latency-us = <100>;
+			min-residency-us = <270>;
+			wakeup-latency-us = <100>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			power-rank = <1>;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <70>;
+			exit-latency-us = <100>;
+			min-residency-us = <300>;
+			wakeup-latency-us = <150>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			power-rank = <3>;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1200>;
+			min-residency-us = <3500>;
+			wakeup-latency-us = <1300>;
+		};
+	};
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@10000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU5: cpu@10001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU6: cpu@10100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU7: cpu@10101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
+				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU8: cpu@100000000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU9: cpu@100000001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU10: cpu@100000100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU11: cpu@100000101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU12: cpu@100010000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10000>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU13: cpu@100010001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10001>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU14: cpu@100010100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU15: cpu@100010101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
+				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
+	};
+};
+
+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <1>;
+
+	idle-states {
+		entry-method = "arm,psci";
+
+		CPU_SLEEP_0_0: cpu-sleep-0-0 {
+			compatible = "arm,idle-state";
+			power-rank = <0>;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <200>;
+			exit-latency-us = <100>;
+			min-residency-us = <400>;
+			wakeup-latency-us = <250>;
+		};
+
+		CLUSTER_SLEEP_0: cluster-sleep-0 {
+			compatible = "arm,idle-state";
+			power-rank = <2>;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <500>;
+			exit-latency-us = <1500>;
+			min-residency-us = <2500>;
+			wakeup-latency-us = <1700>;
+		};
+
+		CPU_SLEEP_1_0: cpu-sleep-1-0 {
+			compatible = "arm,idle-state";
+			power-rank = <1>;
+			entry-method-param = <0x0010000>;
+			entry-latency-us = <300>;
+			exit-latency-us = <500>;
+			min-residency-us = <900>;
+			wakeup-latency-us = <600>;
+		};
+
+		CLUSTER_SLEEP_1: cluster-sleep-1 {
+			compatible = "arm,idle-state";
+			power-rank = <3>;
+			entry-method-param = <0x1010000>;
+			entry-latency-us = <800>;
+			exit-latency-us = <2000>;
+			min-residency-us = <6500>;
+			wakeup-latency-us = <2300>;
+		};
+	};
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x0>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x1>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU2: cpu@2 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x2>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU3: cpu@3 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x3>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_0_0 &CLUSTER_SLEEP_0>;
+	};
+
+	CPU4: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x100>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU5: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x101>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU6: cpu@102 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x102>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+
+	CPU7: cpu@103 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x103>;
+		enable-method = "psci";
+		cpu-idle-states = <&CPU_SLEEP_1_0 &CLUSTER_SLEEP_1>;
+	};
+};
+
+===========================================
+5 - References
+===========================================
+
+[1] ARM Linux Kernel documentation - CPUs bindings
+    Documentation/devicetree/bindings/arm/cpus.txt
+
+[2] ARM Linux Kernel documentation - PSCI bindings
+    Documentation/devicetree/bindings/arm/psci.txt
+
+[3] ARM Server Base System Architecture (SBSA)
+    http://infocenter.arm.com/help/index.jsp
+
+[4] ARM Architecture Reference Manuals
+    http://infocenter.arm.com/help/index.jsp
+
+[5] ePAPR standard
+    https://www.power.org/documentation/epapr-version-1-1/
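
For readers of the binding, the wakeup-latency-us fallback rule from section 3 can be sketched in C as below. This is only an illustration under stated assumptions, not code from this patch or from the kernel: the struct idle_state_timing type, its has_wakeup_latency flag and the effective_wakeup_latency_us() helper are hypothetical names, and the values are assumed to have been read from the device tree already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of the latency properties of one idle state. */
struct idle_state_timing {
	uint32_t entry_latency_us;	/* entry-latency-us */
	uint32_t exit_latency_us;	/* exit-latency-us */
	uint32_t wakeup_latency_us;	/* wakeup-latency-us, valid only if flagged */
	bool has_wakeup_latency;	/* true if the optional property was present */
};

/*
 * Return the worst-case wake-up latency in microseconds: the explicit
 * wakeup-latency-us value when it was supplied, otherwise the documented
 * fallback entry-latency-us + exit-latency-us.
 */
static uint32_t effective_wakeup_latency_us(const struct idle_state_timing *s)
{
	assert(s != NULL);

	if (s->has_wakeup_latency)
		return s->wakeup_latency_us;

	return s->entry_latency_us + s->exit_latency_us;
}
```

With the Example 1 values, CPU_RETENTION_0_0 (entry 20, exit 40, no wakeup-latency-us) would yield 60 us, while CLUSTER_RETENTION_0 (entry 50, exit 100, wakeup-latency-us 130) would yield 130 us, which is 20 us less than the sum because the PREP phase is not part of the wake-up path.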