mbox series

[V4,0/5] arm_scmi: vendors: Qualcomm Generic Vendor Extensions

Message ID 20241007061023.1978380-1-quic_sibis@quicinc.com
Headers show
Series arm_scmi: vendors: Qualcomm Generic Vendor Extensions | expand

Message

Sibi Sankar Oct. 7, 2024, 6:10 a.m. UTC
The QCOM SCMI vendor protocol provides a generic way of exposing a
number of Qualcomm SoC specific features (like memory bus scaling)
through a mixture of pre-determined algorithm strings and param_id
pairs hosted on the SCMI controller. Introduce a client driver that
uses the memlat algorithm string hosted on QCOM SCMI Vendor Protocol
to detect memory latency workloads and control frequency/level of
the various memory buses (DDR/LLCC/DDR_QOS).

QCOM SCMI Generic Vendor protocol background:
It was found that a lot of the vendor protocol used internally was
for debug/internal development purposes that would either be super
SoC specific or had to be disabled because of some features being
fused out during production. This lead to a large number of vendor
protocol numbers being quickly consumed and were never released
either. Using a generic vendor protocol with functionality abstracted
behind algorithm strings gave us the flexibility of allowing such
functionality exist during initial development/debugging while
still being able to expose functionality like memlat once they have
matured enough. The param-ids are certainly expected to act as ABI
for algorithms strings like MEMLAT.

Thanks in advance for taking time to review the series.

V3:
* Restructure the bindings to mimic IMX [Christian]
* Add documentation for the qcom generic vendor protocol [Christian/Sudeep]
* Pick up Rb tag and fixup/re-order elements of the vendor ops [Christian]
* Follow naming convention and folder structure used by IMX
* Add missing enum in the vendor protocol and fix documentation [Konrad]
* Add missing enum in the scmi memlat driver and fix documentation [Konrad]
* Add checks for max memory and monitor [Shivnandan]
* Fix typo from START_TIMER -> STOP_TIMER [Shivnandan]
* Make populate_physical_mask func to void [Shivnandan]
* Remove unecessary zero set [Shivnandan]
* Use __free(device node) in init_cpufreq-memfreqmap [Christian/Konrad]
* Use sdev->dev.of_node directly [Christian]
* use return dev_err_probe in multiple places [Christian]

V2:
* Drop container dvfs memlat container node. [Rob]
* Move scmi-memlat.yaml to protocol level given that a lot of vendors might end up
* using the same protocol number. [Rob]
* Replace qcom,cpulist with the standard "cpus" property. [Rob]
* Fix up compute-type/ipm-ceil required. [Rob]
* Make driver changes to the accommodate bindings changes. [Rob]
* Minor fixups in subjects/coverletter.
* Minor style fixes in dts.

V1:
* Add missing bindings for the protocol. [Konrad/Dmitry]
* Use alternate bindings. [Dmitry/Konrad]
* Rebase on top of Cristian's "SCMI multiple vendor protocol support" series. [Cristian]
* Add more documentation wherever possible. [Sudeep]
* Replace pr_err/info with it's dev equivalents.
* Mixed tabs and initialization cleanups in the memlat driver. [Konrad]
* Commit message update for the memlat driver. [Dmitry]
* Cleanups/Fixes suggested for the client driver. [Dmitry/Konrad/Cristian]
* Use opp-tables instead of memfreq-tbl. [Dmitry/Konrad]
* Detect physical cpu to deal with variants with reduced cpu count.
* Add support for DDR_QOS mem_type.

Sibi Sankar (5):
  dt-bindings: firmware: Document bindings for QCOM SCMI Generic
    Extension
  firmware: arm_scmi: Add QCOM Generic Vendor Protocol documentation
  firmware: arm_scmi: vendors: Add QCOM SCMI Generic Extensions
  soc: qcom: Introduce SCMI based Memlat (Memory Latency) governor
  arm64: dts: qcom: x1e80100: Enable LLCC/DDR/DDR_QOS dvfs

 .../bindings/firmware/arm,scmi.yaml           |   1 +
 .../bindings/firmware/qcom,scmi-memlat.yaml   | 246 ++++++++
 arch/arm64/boot/dts/qcom/x1e80100.dtsi        | 138 +++++
 drivers/firmware/arm_scmi/Kconfig             |   1 +
 drivers/firmware/arm_scmi/Makefile            |   1 +
 .../firmware/arm_scmi/vendors/qcom/Kconfig    |  15 +
 .../firmware/arm_scmi/vendors/qcom/Makefile   |   2 +
 .../arm_scmi/vendors/qcom/qcom-generic-ext.c  | 184 ++++++
 .../arm_scmi/vendors/qcom/qcom_generic.rst    | 210 +++++++
 drivers/soc/qcom/Kconfig                      |  12 +
 drivers/soc/qcom/Makefile                     |   1 +
 drivers/soc/qcom/qcom_scmi_memlat_client.c    | 569 ++++++++++++++++++
 .../dt-bindings/firmware/qcom,scmi-memlat.h   |  22 +
 include/linux/scmi_qcom_protocol.h            |  39 ++
 14 files changed, 1441 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/firmware/qcom,scmi-memlat.yaml
 create mode 100644 drivers/firmware/arm_scmi/vendors/qcom/Kconfig
 create mode 100644 drivers/firmware/arm_scmi/vendors/qcom/Makefile
 create mode 100644 drivers/firmware/arm_scmi/vendors/qcom/qcom-generic-ext.c
 create mode 100644 drivers/firmware/arm_scmi/vendors/qcom/qcom_generic.rst
 create mode 100644 drivers/soc/qcom/qcom_scmi_memlat_client.c
 create mode 100644 include/dt-bindings/firmware/qcom,scmi-memlat.h
 create mode 100644 include/linux/scmi_qcom_protocol.h

Comments

Sibi Sankar Oct. 22, 2024, 8:24 a.m. UTC | #1
On 10/8/24 12:22, Krzysztof Kozlowski wrote:
> On Mon, Oct 07, 2024 at 11:40:18AM +0530, Sibi Sankar wrote:
>> The QCOM SCMI vendor protocol provides a generic way of exposing a
>> number of Qualcomm SoC specific features (like memory bus scaling)
>> through a mixture of pre-determined algorithm strings and param_id
>> pairs hosted on the SCMI controller. Introduce a client driver that
>> uses the memlat algorithm string hosted on QCOM SCMI Vendor Protocol
>> to detect memory latency workloads and control frequency/level of
>> the various memory buses (DDR/LLCC/DDR_QOS).
> 
> None of your patches are wrapped according to Linux coding style which
> makes reviewing more difficult than it should be. And before you answer
> with checkpatch, checkpatch is not a coding style.

I can see that you've been a reviewer of this series from the very
initial version. That would imply you had a chance to shape/guide the
series to whatever shape you prefer. Yet you choose not to do so and
make a blanket statement now that it's close to merge in v4 :/

-Sibi

> 
> Best regards,
> Krzysztof
>
Johan Hovold Nov. 6, 2024, 12:55 p.m. UTC | #2
On Mon, Oct 07, 2024 at 11:40:18AM +0530, Sibi Sankar wrote:
> The QCOM SCMI vendor protocol provides a generic way of exposing a
> number of Qualcomm SoC specific features (like memory bus scaling)
> through a mixture of pre-determined algorithm strings and param_id
> pairs hosted on the SCMI controller. Introduce a client driver that
> uses the memlat algorithm string hosted on QCOM SCMI Vendor Protocol
> to detect memory latency workloads and control frequency/level of
> the various memory buses (DDR/LLCC/DDR_QOS).
> 
> QCOM SCMI Generic Vendor protocol background:
> It was found that a lot of the vendor protocol used internally was
> for debug/internal development purposes that would either be super
> SoC specific or had to be disabled because of some features being
> fused out during production. This lead to a large number of vendor
> protocol numbers being quickly consumed and were never released
> either. Using a generic vendor protocol with functionality abstracted
> behind algorithm strings gave us the flexibility of allowing such
> functionality exist during initial development/debugging while
> still being able to expose functionality like memlat once they have
> matured enough. The param-ids are certainly expected to act as ABI
> for algorithms strings like MEMLAT.

I wanted to give this series a quick spin on the x1e80100 CRD, but ran
into some issues.

First, I expected the drivers to be loaded automatically when built as
modules, but that did not happen so something appears to be missing.

Second, after loading the protocol and client drivers manually (in that
order, shouldn't the client driver pull in the protocol?), I got:

	scmi_module: Loaded SCMI Vendor Protocol 0x80 - Qualcomm  20000
	arm-scmi arm-scmi.0.auto: QCOM Generic Vendor Version 1.0
	scmi-qcom-generic-ext-memlat scmi_dev.5: error -EOPNOTSUPP: failed to configure common events
	scmi-qcom-generic-ext-memlat scmi_dev.5: probe with driver scmi-qcom-generic-ext-memlat failed with error -95

which seems to suggest that the firmware on my CRD does not support this
feature. Is that the way this should be interpreted? And does that mean
that non of the commercial laptops supports this either?

Johan
Cristian Marussi Nov. 6, 2024, 8:03 p.m. UTC | #3
On Wed, Nov 06, 2024 at 01:55:33PM +0100, Johan Hovold wrote:
> On Mon, Oct 07, 2024 at 11:40:18AM +0530, Sibi Sankar wrote:
> > The QCOM SCMI vendor protocol provides a generic way of exposing a
> > number of Qualcomm SoC specific features (like memory bus scaling)
> > through a mixture of pre-determined algorithm strings and param_id
> > pairs hosted on the SCMI controller. Introduce a client driver that
> > uses the memlat algorithm string hosted on QCOM SCMI Vendor Protocol
> > to detect memory latency workloads and control frequency/level of
> > the various memory buses (DDR/LLCC/DDR_QOS).
> > 
> > QCOM SCMI Generic Vendor protocol background:
> > It was found that a lot of the vendor protocol used internally was
> > for debug/internal development purposes that would either be super
> > SoC specific or had to be disabled because of some features being
> > fused out during production. This lead to a large number of vendor
> > protocol numbers being quickly consumed and were never released
> > either. Using a generic vendor protocol with functionality abstracted
> > behind algorithm strings gave us the flexibility of allowing such
> > functionality exist during initial development/debugging while
> > still being able to expose functionality like memlat once they have
> > matured enough. The param-ids are certainly expected to act as ABI
> > for algorithms strings like MEMLAT.
> 
> I wanted to give this series a quick spin on the x1e80100 CRD, but ran
> into some issues.
> 
> First, I expected the drivers to be loaded automatically when built as
> modules, but that did not happen so something appears to be missing.
> 

Hi Johan,

so the SCMI stack is fully modularizable as of this release, i.e.

 - SCMI core (scmi-core + scmi-module)
 - SCMI transports (scmi_transport_{mailbox,virtio,smc,optee}
 - optional SCMI Vendor protos
 - Std and Vendor SCMI Drivers on top of protos

....on the other side the SCMI standard protocols are still embedded
in scmi-module (or builtin) as of now...

Even though, module usage is tracked by the core AND when an SCMI Vendor
driver tries to use protocol_X, it causes protocol_X to be initialized
(calling its protocol_init), there is NO auto-loading for SCMI Vendor
Protocols when bult as LKM...because there were really no ask till now
and this stuff is in general needed so very early dburing boot...so the
usecase of all these LKM modules is just debug/test as in your case
(or in mine frequently)....

...and I am NOT saying with this that is necessarily right or must be
stay like this...just explaining how it is now....

....anyway...it is mostly trivial to add vendor/protocols autoloading
transparently...today I was experimenting with a patch that triggers
autoloading based on a generic common alias pattern in the form
'scmi-protocol-0x<NN>' (with NN the specific protocol ID of course)
that triggers the loading as soon as the SCMI Vendor driver tries to
access the protocol during its probe...

....I will post it for the next cycle once it is clean.
(unless I am missing something else that you want to add...)

> Second, after loading the protocol and client drivers manually (in that
> order, shouldn't the client driver pull in the protocol?), I got:
> 
> 	scmi_module: Loaded SCMI Vendor Protocol 0x80 - Qualcomm  20000
> 	arm-scmi arm-scmi.0.auto: QCOM Generic Vendor Version 1.0
> 	scmi-qcom-generic-ext-memlat scmi_dev.5: error -EOPNOTSUPP: failed to configure common events
> 	scmi-qcom-generic-ext-memlat scmi_dev.5: probe with driver scmi-qcom-generic-ext-memlat failed with error -95
> 
> which seems to suggest that the firmware on my CRD does not support this
> feature. Is that the way this should be interpreted? And does that mean
> that non of the commercial laptops supports this either?

This seems like FW rejecting the command, maybe just only for the specific
Linux OSPM agent since it is not allowed to ask for that specific setup,
and only Sibi can shed a light here :D

...but this Vendor protocol, if I am not mistaken, AFAIU, uses a bunch
of "algo strings" coming from tokens it picks from DT and use thsoe as
params for the set_param() VendorProtocol ops...cannot be that a bad/missing
DT value could cause the FW to reject the command due to the params ?
(even if the command is supported)...

...just a guess ah... I have no real knowledge of this venndor proto...


Thanks,
Cristian
Johan Hovold Nov. 8, 2024, 3:14 p.m. UTC | #4
On Wed, Nov 06, 2024 at 08:03:30PM +0000, Cristian Marussi wrote:
> On Wed, Nov 06, 2024 at 01:55:33PM +0100, Johan Hovold wrote:

> > First, I expected the drivers to be loaded automatically when built as
> > modules, but that did not happen so something appears to be missing.

> Even though, module usage is tracked by the core AND when an SCMI Vendor
> driver tries to use protocol_X, it causes protocol_X to be initialized
> (calling its protocol_init), there is NO auto-loading for SCMI Vendor
> Protocols when bult as LKM...because there were really no ask till now
> and this stuff is in general needed so very early dburing boot...so the
> usecase of all these LKM modules is just debug/test as in your case
> (or in mine frequently)....
> 
> ...and I am NOT saying with this that is necessarily right or must be
> stay like this...just explaining how it is now....

Ok, thanks for clarifying.

> ....anyway...it is mostly trivial to add vendor/protocols autoloading
> transparently...today I was experimenting with a patch that triggers
> autoloading based on a generic common alias pattern in the form
> 'scmi-protocol-0x<NN>' (with NN the specific protocol ID of course)
> that triggers the loading as soon as the SCMI Vendor driver tries to
> access the protocol during its probe...
> 
> ....I will post it for the next cycle once it is clean.
> (unless I am missing something else that you want to add...)

Sounds like that would solve the issue. I was just expecting the memlat
client driver and its protocol dependency to be loaded automatically
when built as modules on machines that can use them (e.g. as we don't
want to have every vendor protocol driver built into distro kernels,
etc).

> > Second, after loading the protocol and client drivers manually (in that
> > order, shouldn't the client driver pull in the protocol?), I got:
> > 
> > 	scmi_module: Loaded SCMI Vendor Protocol 0x80 - Qualcomm  20000
> > 	arm-scmi arm-scmi.0.auto: QCOM Generic Vendor Version 1.0
> > 	scmi-qcom-generic-ext-memlat scmi_dev.5: error -EOPNOTSUPP: failed to configure common events
> > 	scmi-qcom-generic-ext-memlat scmi_dev.5: probe with driver scmi-qcom-generic-ext-memlat failed with error -95
> > 
> > which seems to suggest that the firmware on my CRD does not support this
> > feature. Is that the way this should be interpreted? And does that mean
> > that non of the commercial laptops supports this either?
> 
> This seems like FW rejecting the command, maybe just only for the specific
> Linux OSPM agent since it is not allowed to ask for that specific setup,
> and only Sibi can shed a light here :D
> 
> ...but this Vendor protocol, if I am not mistaken, AFAIU, uses a bunch
> of "algo strings" coming from tokens it picks from DT and use thsoe as
> params for the set_param() VendorProtocol ops...cannot be that a bad/missing
> DT value could cause the FW to reject the command due to the params ?
> (even if the command is supported)...
> 
> ...just a guess ah... I have no real knowledge of this venndor proto...

Yeah, hopefully Sibi can shed some light on this. I'm using the DT
patch (5/5) from this series, which according to the commit message is
supposed to enable bus scaling on the x1e80100 platform. So I guess
something is missing in my firmware.

Johan