mbox series

[v4,00/10] Coresight: Add support for TPDM and TPDA

Message ID 20220324121734.21531-1-quic_jinlmao@quicinc.com
Headers show
Series Coresight: Add support for TPDM and TPDA | expand

Message

Mao Jinlong March 24, 2022, 12:17 p.m. UTC
This series adds support for the trace performance monitoring and
diagnostics hardware (TPDM and TPDA). It is composed of two major
elements.
a) Changes for original coresight framework to support for TPDM and TPDA.
b) Add driver code for TPDM and TPDA.

Introduction of changes for original coresight framework
Support TPDM as new coresight source.
Since only STM and ETM are supported as coresight source originally.
TPDM is a newly added coresight source. We need to change
the original way of saving coresight path to support more types source
for coresight driver.
The following patch is to add support more coresight sources.
     Use IDR to maintain all the enabled sources' paths.

Introduction of TPDM and TPDA
TPDM - The trace performance monitoring and diagnostics monitor or TPDM in
short serves as data collection component for various dataset types
specified in the QPMDA(Qualcomm performance monitoring and diagnostics
architecture) spec. The primary use case of the TPDM is to collect data
from different data sources and send it to a TPDA for packetization,
timestamping and funneling.
     Coresight: Add coresight TPDM source driver
     dt-bindings: arm: Adds CoreSight TPDM hardware definitions
     coresight-tpdm: Add DSB dataset support
     coresight-tpdm: Add integration test support
     docs: sysfs: coresight: Add sysfs ABI documentation for TPDM

TPDA - The trace performance monitoring and diagnostics aggregator or
TPDA in short serves as an arbitration and packetization engine for the
performance monitoring and diagnostics network as specified in the QPMDA
(Qualcomm performance monitoring and diagnostics architecture)
specification. The primary use case of the TPDA is to provide
packetization, funneling and timestamping of Monitor data as specified
in the QPMDA specification.
The following patch is to add driver for TPDA.
     Coresight: Add TPDA link driver
     dt-bindings: arm: Adds CoreSight TPDA hardware definitions

The last patch of this series is a device tree modification, which add
the TPDM and TPDA configuration to device tree for validating.
    ARM: dts: msm: Add coresight components for SM8250
    ARM: dts: msm: Add tpdm mm/prng for sm8250

Once this series patches are applied properly, the tpdm and tpda nodes
should be observed at the coresight path /sys/bus/coresight/devices
e.g.
/sys/bus/coresight/devices # ls -l | grep tpd
tpda0 -> ../../../devices/platform/soc@0/6004000.tpda/tpda0
tpdm0 -> ../../../devices/platform/soc@0/6c08000.mm.tpdm/tpdm0

We can use the commands are similar to the below to validate TPDMs.
Enable coresight sink first.

echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
echo 1 > /sys/bus/coresight/devices/tpdm0/enable_source
echo 1 > /sys/bus/coresight/devices/tpdm0/integration_test
echo 2 > /sys/bus/coresight/devices/tpdm0/integration_test
The test data will be collected in the coresight sink which is enabled.
If rwp register of the sink is keeping updating when do
integration_test (by cat tmc_etf0/mgmt/rwp), it means there is data
generated from TPDM to sink.

There must be a tpda between tpdm and the sink. When there are some
other trace event hw components in the same HW block with tpdm, tpdm
and these hw components will connect to the coresight funnel. When
there is only tpdm trace hw in the HW block, tpdm will connect to
tpda directly.
  
    +---------------+                +-------------+
    |  tpdm@6c08000 |                |tpdm@684C000 |
    +-------|-------+                +------|------+
            |                               |
    +-------|-------+                       |
    | funnel@6c0b000|                       |
    +-------|-------+                       |
            |                               |
    +-------|-------+                       |
    |funnel@6c2d000 |                       |
    +-------|-------+                       |
            |                               |
            |    +---------------+          |
            +----- tpda@6004000  -----------+
                 +-------|-------+
                         |
                 +-------|-------+
                 |funnel@6005000 |
                 +---------------+

This patch series depends on patch series
"coresight: Add new API to allocate trace source ID values".
https://patchwork.kernel.org/project/linux-arm-kernel/cover/20220308205000.27646-1-mike.leach@linaro.org/ 

Changes from V3:
1. Drop trace id for tpdm source as its trace atid is defined by the tpda.
Allocate tpda's atid dynamically.  (Mike Leach)

Mao Jinlong (10):
  Use IDR to maintain all the enabled sources' paths.
  Coresight: Add coresight TPDM source driver
  dt-bindings: arm: Adds CoreSight TPDM hardware definitions
  coresight-tpdm: Add DSB dataset support
  coresight-tpdm: Add integration test support
  docs: sysfs: coresight: Add sysfs ABI documentation for TPDM
  Coresight: Add TPDA link driver
  dt-bindings: arm: Adds CoreSight TPDA hardware definitions
  ARM: dts: msm: Add coresight components for SM8250
  ARM: dts: msm: Add tpdm mm/prng for sm8250

 .../testing/sysfs-bus-coresight-devices-tpdm  |   5 +
 .../bindings/arm/coresight-tpda.yaml          | 119 +++
 .../bindings/arm/coresight-tpdm.yaml          |  99 +++
 .../devicetree/bindings/arm/coresight.txt     |   7 +
 MAINTAINERS                                   |   1 +
 .../arm64/boot/dts/qcom/sm8250-coresight.dtsi | 708 ++++++++++++++++++
 arch/arm64/boot/dts/qcom/sm8250.dtsi          |   2 +
 drivers/hwtracing/coresight/Kconfig           |  33 +
 drivers/hwtracing/coresight/Makefile          |   2 +
 drivers/hwtracing/coresight/coresight-core.c  |  79 +-
 drivers/hwtracing/coresight/coresight-tpda.c  | 192 +++++
 drivers/hwtracing/coresight/coresight-tpda.h  |  32 +
 drivers/hwtracing/coresight/coresight-tpdm.c  | 260 +++++++
 drivers/hwtracing/coresight/coresight-tpdm.h  |  56 ++
 include/linux/coresight.h                     |   1 +
 15 files changed, 1546 insertions(+), 50 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm
 create mode 100644 Documentation/devicetree/bindings/arm/coresight-tpda.yaml
 create mode 100644 Documentation/devicetree/bindings/arm/coresight-tpdm.yaml
 create mode 100644 arch/arm64/boot/dts/qcom/sm8250-coresight.dtsi
 create mode 100644 drivers/hwtracing/coresight/coresight-tpda.c
 create mode 100644 drivers/hwtracing/coresight/coresight-tpda.h
 create mode 100644 drivers/hwtracing/coresight/coresight-tpdm.c
 create mode 100644 drivers/hwtracing/coresight/coresight-tpdm.h

Comments

Mao Jinlong March 29, 2022, 1:47 p.m. UTC | #1
Hi Greg,

On 3/25/2022 1:06 AM, Greg Kroah-Hartman wrote:
> On Thu, Mar 24, 2022 at 10:23:19PM +0800, Jinlong Mao wrote:
>> Hi Greg,
>>
>> Thanks for your review.
>>
>> On 3/24/2022 8:26 PM, Greg Kroah-Hartman wrote:
>>> On Thu, Mar 24, 2022 at 08:17:25PM +0800, Mao Jinlong wrote:
>>>> Use hash length of the source's device name to map to the pointer
>>>> of the enabled path. Using IDR will be more efficient than using
>>>> the list. And there could be other sources except STM and CPU etms
>>>> in the new HWs. It is better to maintain all the paths together.
>>>>
>>>> Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
>>>> ---
>>>>    drivers/hwtracing/coresight/coresight-core.c | 75 +++++++-------------
>>>>    1 file changed, 26 insertions(+), 49 deletions(-)
>>> Your subject line is odd.  Please put back the driver subsystem in the
>>> subject line so that it makes more sense.
>> I will update the subject in next version.
>>> And how have you measured "more efficient"?
>> Using IDR would be better than doing a sequential search as there will be
>> much more device  in future.
> How many "more"?  Where does the trade off of speed for complexity help?
> How much faster is this really?  You can't claim performance
> improvements without any proof :)
There is about 40 trace sources in our internal device. I believe there 
will be more cpu cores, then
there will be more etm sources. IDR here is used to store the path of 
both etm
sources and other sources which aren't associated with CPU.  Use IDR is 
not more complicated
than using list. It will also save the time of searching the path when 
coresight_disable.
I tested in internal device. The test case is that enable all the 
sources, disable the source one
by one to check the search time.

Use list to store paths:

               sh-7687    [005] ....   342.113099: __coresight_disable: 
====search path start==== source_0
               sh-7687    [005] ....   342.113127: __coresight_disable: 
====search path end==== source_0
               sh-7693    [005] ....   342.542216: __coresight_disable: 
====search path start==== source_1
               sh-7693    [005] ....   342.542244: __coresight_disable: 
====search path end==== source_1
               sh-7699    [005] ....   342.929083: __coresight_disable: 
====search path start==== source_2
               sh-7699    [005] ....   342.929106: __coresight_disable: 
====search path end==== source_2
               sh-7711    [005] ....   343.760688: __coresight_disable: 
====search path start==== source_3
               sh-7711    [005] ....   343.760713: __coresight_disable: 
====search path end==== source_3
               sh-7717    [005] ....   344.172353: __coresight_disable: 
====search path start==== source_4
               sh-7717    [005] ....   344.172381: __coresight_disable: 
====search path end==== source_4


Use IDR to store paths:

              sh-7156    [006] ....    223.294228: __coresight_disable: 
====search path start==== source_0
               sh-7156    [006] ....   223.294237: __coresight_disable: 
====search path end==== source_0
               sh-7162    [006] ....   223.690153: __coresight_disable: 
====search path start==== source_1
               sh-7162    [006] ....   223.690163: __coresight_disable: 
====search path end==== source_1
               sh-7168    [006] ....   224.110670: __coresight_disable: 
====search path start==== source_2
               sh-7168    [006] ....   224.110679: __coresight_disable: 
====search path end==== source_2
            <...>-7180    [006] ....   224.929315: __coresight_disable: 
====search path start==== source_3
            <...>-7180    [006] ....   224.929324: __coresight_disable: 
====search path end==== source_3
            <...>-7186    [006] ....   225.343617: __coresight_disable: 
====search path start==== source_4
            <...>-7186    [006] ....   225.343626: __coresight_disable: 
====search path end==== source_4

 From the log, Searching the path from the IDR takes about 9us for each 
source.
Searching the path from the list takes about 23 ~ 28us for the source. 
Use IDR saves much time.

Thanks

Jinlong Mao
>
> thanks,
>
> greg k-h
Mao Jinlong March 29, 2022, 1:56 p.m. UTC | #2
Hi Suzuki,

On 3/28/2022 4:33 PM, Suzuki K Poulose wrote:
> On 24/03/2022 14:23, Jinlong Mao wrote:
>> Hi Greg,
>>
>> Thanks for your review.
>>
>> On 3/24/2022 8:26 PM, Greg Kroah-Hartman wrote:
>>> On Thu, Mar 24, 2022 at 08:17:25PM +0800, Mao Jinlong wrote:
>>>> Use hash length of the source's device name to map to the pointer
>>>> of the enabled path. Using IDR will be more efficient than using
>>>> the list. And there could be other sources except STM and CPU etms
>>>> in the new HWs. It is better to maintain all the paths together.
>>>>
>>>> Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
>>>> ---
>>>>   drivers/hwtracing/coresight/coresight-core.c | 75 
>>>> +++++++-------------
>>>>   1 file changed, 26 insertions(+), 49 deletions(-)
>>> Your subject line is odd.  Please put back the driver subsystem in the
>>> subject line so that it makes more sense.
>> I will update the subject in next version.
>>>
>>> And how have you measured "more efficient"?
>>
>> Using IDR would be better than doing a sequential search as there 
>> will be much more device  in future.
>
> Where do we use sequential search now ? For non-CPU bound sources, yes
> we may need something. But CPU case is straight forward, and could be
> retained as it is. i.e., per-cpu list of paths.
>
We use list to store the paths for both ETM and non-CPU bound sources in 
patch below.

“[PATCH 01/10] coresight: add support to enable more coresight paths”

According to Mathieu's comments, IDR is used now.  So i added "Using IDR 
will be more efficient than using
the list" this message in my commit message. I think we need to use one 
mechanism to store ETM and
non-CPU bound sources.


Mathieu's comments:

So many TPDM and many ETMs...  That is definitely a reason to do better than a
sequential search.

If an IDR (or some other kind of mechanism) is used then we can use that to
store paths associated with ETMs as well.  That way everything works the same
way and access time is constant for any kind of source.

Thanks

Jinlong Mao

> Cheers
> Suzuki
>
>
>>
>>>
>>> thanks,
>>>
>>> greg k-h
>>
>> Thanks
>>
>> Jinlong Mao
>>
>
Mathieu Poirier March 29, 2022, 2:36 p.m. UTC | #3
On Tue, 29 Mar 2022 at 07:56, Jinlong Mao <quic_jinlmao@quicinc.com> wrote:
>
> Hi Suzuki,
>
> On 3/28/2022 4:33 PM, Suzuki K Poulose wrote:
> > On 24/03/2022 14:23, Jinlong Mao wrote:
> >> Hi Greg,
> >>
> >> Thanks for your review.
> >>
> >> On 3/24/2022 8:26 PM, Greg Kroah-Hartman wrote:
> >>> On Thu, Mar 24, 2022 at 08:17:25PM +0800, Mao Jinlong wrote:
> >>>> Use hash length of the source's device name to map to the pointer
> >>>> of the enabled path. Using IDR will be more efficient than using
> >>>> the list. And there could be other sources except STM and CPU etms
> >>>> in the new HWs. It is better to maintain all the paths together.
> >>>>
> >>>> Signed-off-by: Mao Jinlong <quic_jinlmao@quicinc.com>
> >>>> ---
> >>>>   drivers/hwtracing/coresight/coresight-core.c | 75
> >>>> +++++++-------------
> >>>>   1 file changed, 26 insertions(+), 49 deletions(-)
> >>> Your subject line is odd.  Please put back the driver subsystem in the
> >>> subject line so that it makes more sense.
> >> I will update the subject in next version.
> >>>
> >>> And how have you measured "more efficient"?
> >>
> >> Using IDR would be better than doing a sequential search as there
> >> will be much more device  in future.
> >
> > Where do we use sequential search now ? For non-CPU bound sources, yes
> > we may need something. But CPU case is straight forward, and could be
> > retained as it is. i.e., per-cpu list of paths.
> >
> We use list to store the paths for both ETM and non-CPU bound sources in
> patch below.
>
> “[PATCH 01/10] coresight: add support to enable more coresight paths”
>
> According to Mathieu's comments, IDR is used now.  So i added "Using IDR
> will be more efficient than using
> the list" this message in my commit message. I think we need to use one
> mechanism to store ETM and
> non-CPU bound sources.
>
>
> Mathieu's comments:
>
> So many TPDM and many ETMs...  That is definitely a reason to do better than a
> sequential search.
>
> If an IDR (or some other kind of mechanism) is used then we can use that to
> store paths associated with ETMs as well.  That way everything works the same
> way and access time is constant for any kind of source.

As per my last sentence above, the goal of  my comment was to simplify
things so that we don't have two different ways of managing sources.
But if that ends up causing more trouble than benefit then it should
be avoided.

>
> Thanks
>
> Jinlong Mao
>
> > Cheers
> > Suzuki
> >
> >
> >>
> >>>
> >>> thanks,
> >>>
> >>> greg k-h
> >>
> >> Thanks
> >>
> >> Jinlong Mao
> >>
> >
Mao Jinlong April 11, 2022, 2:55 a.m. UTC | #4
On 3/30/2022 5:05 PM, Suzuki K Poulose wrote:
> On 30/03/2022 03:10, Jinlong Mao wrote:
>>
>> On 3/29/2022 10:36 PM, Mathieu Poirier wrote:
>>> On Tue, 29 Mar 2022 at 07:56, Jinlong Mao<quic_jinlmao@quicinc.com>  
>>> wrote:
>>>> Hi Suzuki,
>>>>
>>>> On 3/28/2022 4:33 PM, Suzuki K Poulose wrote:
>>>>> On 24/03/2022 14:23, Jinlong Mao wrote:
>>>>>> Hi Greg,
>>>>>>
>>>>>> Thanks for your review.
>>>>>>
>>>>>> On 3/24/2022 8:26 PM, Greg Kroah-Hartman wrote:
>>>>>>> On Thu, Mar 24, 2022 at 08:17:25PM +0800, Mao Jinlong wrote:
>>>>>>>> Use hash length of the source's device name to map to the pointer
>>>>>>>> of the enabled path. Using IDR will be more efficient than using
>>>>>>>> the list. And there could be other sources except STM and CPU etms
>>>>>>>> in the new HWs. It is better to maintain all the paths together.
>>>>>>>>
>>>>>>>> Signed-off-by: Mao Jinlong<quic_jinlmao@quicinc.com>
>>>>>>>> ---
>>>>>>>>    drivers/hwtracing/coresight/coresight-core.c | 75
>>>>>>>> +++++++-------------
>>>>>>>>    1 file changed, 26 insertions(+), 49 deletions(-)
>>>>>>> Your subject line is odd.  Please put back the driver subsystem 
>>>>>>> in the
>>>>>>> subject line so that it makes more sense.
>>>>>> I will update the subject in next version.
>>>>>>> And how have you measured "more efficient"?
>>>>>> Using IDR would be better than doing a sequential search as there
>>>>>> will be much more device  in future.
>>>>> Where do we use sequential search now ? For non-CPU bound sources, 
>>>>> yes
>>>>> we may need something. But CPU case is straight forward, and could be
>>>>> retained as it is. i.e., per-cpu list of paths.
>>>>>
>>>> We use list to store the paths for both ETM and non-CPU bound 
>>>> sources in
>>>> patch below.
>>>>
>>>> “[PATCH 01/10] coresight: add support to enable more coresight paths”
>>>>
>>>> According to Mathieu's comments, IDR is used now.  So i added 
>>>> "Using IDR
>>>> will be more efficient than using
>>>> the list" this message in my commit message. I think we need to use 
>>>> one
>>>> mechanism to store ETM and
>>>> non-CPU bound sources.
>>>>
>>>>
>>>> Mathieu's comments:
>>>>
>>>> So many TPDM and many ETMs...  That is definitely a reason to do 
>>>> better than a
>>>> sequential search.
>>>>
>>>> If an IDR (or some other kind of mechanism) is used then we can use 
>>>> that to
>>>> store paths associated with ETMs as well.  That way everything 
>>>> works the same
>>>> way and access time is constant for any kind of source.
>>> As per my last sentence above, the goal of  my comment was to simplify
>>> things so that we don't have two different ways of managing sources.
>>> But if that ends up causing more trouble than benefit then it should
>>> be avoided.
>>
>> Hi Mathieu,
>>
>> I didn't see any disadvantage to use IDR to store both ETM source and 
>> non-CPU bound sources.
>>
>> Benefits:
>>
>>   * Only need to maintain one way of managing sources.
>>   * Less time to search the path
>
> My preference is to keep the ETM source paths per-CPU. For the reasons
> below :
>   - It is straight forward for an ETM. per_cpu(paths, cpu)
>   - It is faster than the IDR.
>   - Makes the debugging easier. Simply lookup the per_cpu variable.
>
> I agree that the IDR is required for the non ETM sources. And I am fine
> with that.
>
> Suzuki

Hi Suzuki,

I will address your comments in next version.

Could you please help to review other patches ?

Thanks

Jinlong Mao

>
>>
>> Thanks
>> Jinlong Mao
>>>> Thanks
>>>>
>>>> Jinlong Mao
>>>>
>>>>> Cheers
>>>>> Suzuki
>>>>>
>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> greg k-h
>>>>>> Thanks
>>>>>>
>>>>>> Jinlong Mao
>>>>>>
>>> _______________________________________________
>>> CoreSight mailing list --coresight@lists.linaro.org
>>> To unsubscribe send an email tocoresight-leave@lists.linaro.org
>