mbox series

[0/4] perf/dwc_pcie: Fix registration issue in multi PCIe controller instances

Message ID 20240731-dwc_pmu_fix-v1-0-ca47d153e5b2@quicinc.com
Headers show
Series perf/dwc_pcie: Fix registration issue in multi PCIe controller instances | expand

Message

Krishna chaitanya chundru July 31, 2024, 4:23 a.m. UTC
When there are multiple of instances of PCIe controllers, registration
to perf driver fails with this error. This is because of having same
bdf value for devices under two different controllers.

Update the logic to use sbdf which is a unique number in case of
multi instance also.

When the PCIe devices are discovered late, the driver can't find
the PCIe devices and returns in the init without registering with
the bus notifier. Due to that the devices which are discovered late
the driver can't register for this.

Register for bus notifier even if the device is not found in init.

Update the vendor table with QCOM PCIe vendorid to support QCOM devices.

Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
---
Krishna chaitanya chundru (4):
      perf/dwc_pcie: Fix registration issue in multi PCIe controller instances
      Documentation: dwc_pcie_pmu: Update bdf to sbdf
      perf/dwc_pcie: Always register for PCIe bus notifier
      perf/dwc_pcie: Add support for QCOM vendor devices

 Documentation/admin-guide/perf/dwc_pcie_pmu.rst | 16 +++++++--------
 drivers/perf/dwc_pcie_pmu.c                     | 27 +++++++++++++------------
 2 files changed, 22 insertions(+), 21 deletions(-)
---
base-commit: b236787b0da563e3bad0dab1b4b9a5bb54eabd39
change-id: 20240731-dwc_pmu_fix-3729bd3657fe
prerequisite-change-id: 20240728-mhi_runtime_pm-2383b74c71ed:v1

Best regards,

Comments

Yicong Yang Aug. 15, 2024, 1:40 p.m. UTC | #1
On 2024/7/31 12:23, Krishna chaitanya chundru wrote:
> When there are multiple of instances of PCIe controllers, registration
> to perf driver fails with this error.
> sysfs: cannot create duplicate filename '/devices/platform/dwc_pcie_pmu.0'
> CPU: 0 PID: 166 Comm: modprobe Not tainted 6.10.0-rc2-next-20240607-dirty
> Hardware name: Qualcomm SA8775P Ride (DT)
> Call trace:
>  dump_backtrace.part.8+0x98/0xf0
>  show_stack+0x14/0x1c
>  dump_stack_lvl+0x74/0x88
>  dump_stack+0x14/0x1c
>  sysfs_warn_dup+0x60/0x78
>  sysfs_create_dir_ns+0xe8/0x100
>  kobject_add_internal+0x94/0x224
>  kobject_add+0xa8/0x118
>  device_add+0x298/0x7b4
>  platform_device_add+0x1a0/0x228
>  platform_device_register_full+0x11c/0x148
>  dwc_pcie_register_dev+0x74/0xf0 [dwc_pcie_pmu]
>  dwc_pcie_pmu_init+0x7c/0x1000 [dwc_pcie_pmu]
>  do_one_initcall+0x58/0x1c0
>  do_init_module+0x58/0x208
>  load_module+0x1804/0x188c
>  __do_sys_init_module+0x18c/0x1f0
>  __arm64_sys_init_module+0x14/0x1c
>  invoke_syscall+0x40/0xf8
>  el0_svc_common.constprop.1+0x70/0xf4
>  do_el0_svc+0x18/0x20
>  el0_svc+0x28/0xb0
>  el0t_64_sync_handler+0x9c/0xc0
>  el0t_64_sync+0x160/0x164
> kobject: kobject_add_internal failed for dwc_pcie_pmu.0 with -EEXIST,
> don't try to register things with the same name in the same directory.
> 
> This is because of having same bdf value for devices under two different
> controllers.
> 
> Update the logic to use sbdf which is a unique number in case of
> multi instance also.
> 
> Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver")

Did you run into this on a QCOM platform with Patch 4/4 since there's
multiple PCIe domains?

> Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
> ---
>  drivers/perf/dwc_pcie_pmu.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/perf/dwc_pcie_pmu.c b/drivers/perf/dwc_pcie_pmu.c
> index c5e328f23841..c115348b8d53 100644
> --- a/drivers/perf/dwc_pcie_pmu.c
> +++ b/drivers/perf/dwc_pcie_pmu.c
> @@ -556,10 +556,10 @@ static int dwc_pcie_register_dev(struct pci_dev *pdev)
>  {
>  	struct platform_device *plat_dev;
>  	struct dwc_pcie_dev_info *dev_info;
> -	u32 bdf;
> +	u32 sbdf;
>  
> -	bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
> -	plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", bdf,
> +	sbdf = (pci_domain_nr(pdev->bus) << 16) | PCI_DEVID(pdev->bus->number, pdev->devfn);
> +	plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", sbdf,
>  						 pdev, sizeof(*pdev));
>  
>  	if (IS_ERR(plat_dev))
> @@ -611,15 +611,15 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
>  	struct pci_dev *pdev = plat_dev->dev.platform_data;
>  	struct dwc_pcie_pmu *pcie_pmu;
>  	char *name;
> -	u32 bdf, val;
> +	u32 sbdf, val;
>  	u16 vsec;
>  	int ret;
>  
>  	vsec = pci_find_vsec_capability(pdev, pdev->vendor,
>  					DWC_PCIE_VSEC_RAS_DES_ID);
>  	pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val);
> -	bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
> -	name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", bdf);
> +	sbdf = (pci_domain_nr(pdev->bus) << 16) | PCI_DEVID(pdev->bus->number, pdev->devfn);

sbdf is also registerd as the id of the platform device in platform_device_register_data() above,
can we use it directly here without encoding it again?

Thanks.

> +	name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", sbdf);
>  	if (!name)
>  		return -ENOMEM;
>  
> @@ -650,7 +650,7 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
>  	ret = cpuhp_state_add_instance(dwc_pcie_pmu_hp_state,
>  				       &pcie_pmu->cpuhp_node);
>  	if (ret) {
> -		pci_err(pdev, "Error %d registering hotplug @%x\n", ret, bdf);
> +		pci_err(pdev, "Error %d registering hotplug @%x\n", ret, sbdf);
>  		return ret;
>  	}
>  
> @@ -663,7 +663,7 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
>  
>  	ret = perf_pmu_register(&pcie_pmu->pmu, name, -1);
>  	if (ret) {
> -		pci_err(pdev, "Error %d registering PMU @%x\n", ret, bdf);
> +		pci_err(pdev, "Error %d registering PMU @%x\n", ret, sbdf);
>  		return ret;
>  	}
>  	ret = devm_add_action_or_reset(&plat_dev->dev, dwc_pcie_unregister_pmu,
>
Yicong Yang Aug. 15, 2024, 1:49 p.m. UTC | #2
On 2024/7/31 12:23, Krishna chaitanya chundru wrote:
> When the PCIe devices are discovered late, the driver can't find
> the PCIe devices and returns in the init without registering with
> the bus notifier. Due to that the devices which are discovered late
> the driver can't register for this.
> 
> Register for bus notifier even if the device is not found in init.
> 
> Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver")
> Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
> ---
>  drivers/perf/dwc_pcie_pmu.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/perf/dwc_pcie_pmu.c b/drivers/perf/dwc_pcie_pmu.c
> index c115348b8d53..aa1010b44bcb 100644
> --- a/drivers/perf/dwc_pcie_pmu.c
> +++ b/drivers/perf/dwc_pcie_pmu.c
> @@ -741,8 +741,6 @@ static int __init dwc_pcie_pmu_init(void)
>  
>  		found = true;
>  	}
> -	if (!found)
> -		return -ENODEV;
>  
>  	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>  				      "perf/dwc_pcie_pmu:online",
> @@ -753,9 +751,11 @@ static int __init dwc_pcie_pmu_init(void)
>  
>  	dwc_pcie_pmu_hp_state = ret;
>  
> -	ret = platform_driver_register(&dwc_pcie_pmu_driver);
> -	if (ret)
> -		goto platform_driver_register_err;
> +	if (!found) {
> +		ret = platform_driver_register(&dwc_pcie_pmu_driver);
> +		if (ret)
> +			goto platform_driver_register_err;
> +	}
> 

This doesn't match the commit.

If any device is found at this stage, we cannot use them since you don't
register a driver.

>  	ret = bus_register_notifier(&pci_bus_type, &dwc_pcie_pmu_nb);
>  	if (ret)
>
Krishna chaitanya chundru Aug. 16, 2024, 3:41 a.m. UTC | #3
On 8/15/2024 7:10 PM, Yicong Yang wrote:
> On 2024/7/31 12:23, Krishna chaitanya chundru wrote:
>> When there are multiple of instances of PCIe controllers, registration
>> to perf driver fails with this error.
>> sysfs: cannot create duplicate filename '/devices/platform/dwc_pcie_pmu.0'
>> CPU: 0 PID: 166 Comm: modprobe Not tainted 6.10.0-rc2-next-20240607-dirty
>> Hardware name: Qualcomm SA8775P Ride (DT)
>> Call trace:
>>   dump_backtrace.part.8+0x98/0xf0
>>   show_stack+0x14/0x1c
>>   dump_stack_lvl+0x74/0x88
>>   dump_stack+0x14/0x1c
>>   sysfs_warn_dup+0x60/0x78
>>   sysfs_create_dir_ns+0xe8/0x100
>>   kobject_add_internal+0x94/0x224
>>   kobject_add+0xa8/0x118
>>   device_add+0x298/0x7b4
>>   platform_device_add+0x1a0/0x228
>>   platform_device_register_full+0x11c/0x148
>>   dwc_pcie_register_dev+0x74/0xf0 [dwc_pcie_pmu]
>>   dwc_pcie_pmu_init+0x7c/0x1000 [dwc_pcie_pmu]
>>   do_one_initcall+0x58/0x1c0
>>   do_init_module+0x58/0x208
>>   load_module+0x1804/0x188c
>>   __do_sys_init_module+0x18c/0x1f0
>>   __arm64_sys_init_module+0x14/0x1c
>>   invoke_syscall+0x40/0xf8
>>   el0_svc_common.constprop.1+0x70/0xf4
>>   do_el0_svc+0x18/0x20
>>   el0_svc+0x28/0xb0
>>   el0t_64_sync_handler+0x9c/0xc0
>>   el0t_64_sync+0x160/0x164
>> kobject: kobject_add_internal failed for dwc_pcie_pmu.0 with -EEXIST,
>> don't try to register things with the same name in the same directory.
>>
>> This is because of having same bdf value for devices under two different
>> controllers.
>>
>> Update the logic to use sbdf which is a unique number in case of
>> multi instance also.
>>
>> Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver")
> 
> Did you run into this on a QCOM platform with Patch 4/4 since there's
> multiple PCIe domains?
> 
Yes we ran this in QCOM platform where it has multiple PCIe instances.
>> Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
>> ---
>>   drivers/perf/dwc_pcie_pmu.c | 16 ++++++++--------
>>   1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/perf/dwc_pcie_pmu.c b/drivers/perf/dwc_pcie_pmu.c
>> index c5e328f23841..c115348b8d53 100644
>> --- a/drivers/perf/dwc_pcie_pmu.c
>> +++ b/drivers/perf/dwc_pcie_pmu.c
>> @@ -556,10 +556,10 @@ static int dwc_pcie_register_dev(struct pci_dev *pdev)
>>   {
>>   	struct platform_device *plat_dev;
>>   	struct dwc_pcie_dev_info *dev_info;
>> -	u32 bdf;
>> +	u32 sbdf;
>>   
>> -	bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
>> -	plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", bdf,
>> +	sbdf = (pci_domain_nr(pdev->bus) << 16) | PCI_DEVID(pdev->bus->number, pdev->devfn);
>> +	plat_dev = platform_device_register_data(NULL, "dwc_pcie_pmu", sbdf,
>>   						 pdev, sizeof(*pdev));
>>   
>>   	if (IS_ERR(plat_dev))
>> @@ -611,15 +611,15 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
>>   	struct pci_dev *pdev = plat_dev->dev.platform_data;
>>   	struct dwc_pcie_pmu *pcie_pmu;
>>   	char *name;
>> -	u32 bdf, val;
>> +	u32 sbdf, val;
>>   	u16 vsec;
>>   	int ret;
>>   
>>   	vsec = pci_find_vsec_capability(pdev, pdev->vendor,
>>   					DWC_PCIE_VSEC_RAS_DES_ID);
>>   	pci_read_config_dword(pdev, vsec + PCI_VNDR_HEADER, &val);
>> -	bdf = PCI_DEVID(pdev->bus->number, pdev->devfn);
>> -	name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", bdf);
>> +	sbdf = (pci_domain_nr(pdev->bus) << 16) | PCI_DEVID(pdev->bus->number, pdev->devfn);
> 
> sbdf is also registerd as the id of the platform device in platform_device_register_data() above,
> can we use it directly here without encoding it again?
> 
> Thanks.
> 
ack.

- Krishna chaitanya.
>> +	name = devm_kasprintf(&plat_dev->dev, GFP_KERNEL, "dwc_rootport_%x", sbdf);
>>   	if (!name)
>>   		return -ENOMEM;
>>   
>> @@ -650,7 +650,7 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
>>   	ret = cpuhp_state_add_instance(dwc_pcie_pmu_hp_state,
>>   				       &pcie_pmu->cpuhp_node);
>>   	if (ret) {
>> -		pci_err(pdev, "Error %d registering hotplug @%x\n", ret, bdf);
>> +		pci_err(pdev, "Error %d registering hotplug @%x\n", ret, sbdf);
>>   		return ret;
>>   	}
>>   
>> @@ -663,7 +663,7 @@ static int dwc_pcie_pmu_probe(struct platform_device *plat_dev)
>>   
>>   	ret = perf_pmu_register(&pcie_pmu->pmu, name, -1);
>>   	if (ret) {
>> -		pci_err(pdev, "Error %d registering PMU @%x\n", ret, bdf);
>> +		pci_err(pdev, "Error %d registering PMU @%x\n", ret, sbdf);
>>   		return ret;
>>   	}
>>   	ret = devm_add_action_or_reset(&plat_dev->dev, dwc_pcie_unregister_pmu,
>>
Krishna chaitanya chundru Aug. 16, 2024, 3:51 a.m. UTC | #4
On 8/15/2024 7:19 PM, Yicong Yang wrote:
> On 2024/7/31 12:23, Krishna chaitanya chundru wrote:
>> When the PCIe devices are discovered late, the driver can't find
>> the PCIe devices and returns in the init without registering with
>> the bus notifier. Due to that the devices which are discovered late
>> the driver can't register for this.
>>
>> Register for bus notifier even if the device is not found in init.
>>
>> Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver")
>> Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
>> ---
>>   drivers/perf/dwc_pcie_pmu.c | 10 +++++-----
>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/perf/dwc_pcie_pmu.c b/drivers/perf/dwc_pcie_pmu.c
>> index c115348b8d53..aa1010b44bcb 100644
>> --- a/drivers/perf/dwc_pcie_pmu.c
>> +++ b/drivers/perf/dwc_pcie_pmu.c
>> @@ -741,8 +741,6 @@ static int __init dwc_pcie_pmu_init(void)
>>   
>>   		found = true;
>>   	}
>> -	if (!found)
>> -		return -ENODEV;
>>   
>>   	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>>   				      "perf/dwc_pcie_pmu:online",
>> @@ -753,9 +751,11 @@ static int __init dwc_pcie_pmu_init(void)
>>   
>>   	dwc_pcie_pmu_hp_state = ret;
>>   
>> -	ret = platform_driver_register(&dwc_pcie_pmu_driver);
>> -	if (ret)
>> -		goto platform_driver_register_err;
>> +	if (!found) {
>> +		ret = platform_driver_register(&dwc_pcie_pmu_driver);
>> +		if (ret)
>> +			goto platform_driver_register_err;
>> +	}
>>
> 
> This doesn't match the commit.
> > If any device is found at this stage, we cannot use them since you don't
> register a driver.
> 
If the device is not found then only we are skipping platform driver
register otherwise driver will be registering with the platform driver.

- Krishna Chaitanya.
>>   	ret = bus_register_notifier(&pci_bus_type, &dwc_pcie_pmu_nb);
>>   	if (ret)
>>
Yicong Yang Aug. 16, 2024, 7:42 a.m. UTC | #5
On 2024/8/16 11:51, Krishna Chaitanya Chundru wrote:
> 
> 
> On 8/15/2024 7:19 PM, Yicong Yang wrote:
>> On 2024/7/31 12:23, Krishna chaitanya chundru wrote:
>>> When the PCIe devices are discovered late, the driver can't find
>>> the PCIe devices and returns in the init without registering with
>>> the bus notifier. Due to that the devices which are discovered late
>>> the driver can't register for this.
>>>
>>> Register for bus notifier even if the device is not found in init.
>>>
>>> Fixes: af9597adc2f1 ("drivers/perf: add DesignWare PCIe PMU driver")
>>> Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
>>> ---
>>>   drivers/perf/dwc_pcie_pmu.c | 10 +++++-----
>>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/perf/dwc_pcie_pmu.c b/drivers/perf/dwc_pcie_pmu.c
>>> index c115348b8d53..aa1010b44bcb 100644
>>> --- a/drivers/perf/dwc_pcie_pmu.c
>>> +++ b/drivers/perf/dwc_pcie_pmu.c
>>> @@ -741,8 +741,6 @@ static int __init dwc_pcie_pmu_init(void)
>>>             found = true;
>>>       }
>>> -    if (!found)
>>> -        return -ENODEV;
>>>         ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
>>>                         "perf/dwc_pcie_pmu:online",
>>> @@ -753,9 +751,11 @@ static int __init dwc_pcie_pmu_init(void)
>>>         dwc_pcie_pmu_hp_state = ret;
>>>   -    ret = platform_driver_register(&dwc_pcie_pmu_driver);
>>> -    if (ret)
>>> -        goto platform_driver_register_err;
>>> +    if (!found) {
>>> +        ret = platform_driver_register(&dwc_pcie_pmu_driver);
>>> +        if (ret)
>>> +            goto platform_driver_register_err;
>>> +    }
>>>
>>
>> This doesn't match the commit.
>> > If any device is found at this stage, we cannot use them since you don't
>> register a driver.
>>
> If the device is not found then only we are skipping platform driver
> register otherwise driver will be registering with the platform driver.
> 

think about the case that devices already discovered before module init.
without the change here we'll register both the platform devices and driver
but with the change here we'll only register the platform devices without
the related driver to probe them.

Try to register the driver and notifier unconditionally will solve the issue.
It'll probe the device and register the PMU if later device is added by
the bus notifier.

Thanks.

> - Krishna Chaitanya.
>>>       ret = bus_register_notifier(&pci_bus_type, &dwc_pcie_pmu_nb);
>>>       if (ret)
>>>
> 
> .