[v9,0/7] PCI: mediatek: Add new generation controller support

Message ID	20210324030510.29177-1-jianjun.wang@mediatek.com
Headers	Return-Path: <devicetree-owner@kernel.org>
	From: Jianjun Wang <jianjun.wang@mediatek.com>
	To: Bjorn Helgaas <bhelgaas@google.com>, Rob Herring <robh+dt@kernel.org>,
	    Marc Zyngier <maz@kernel.org>, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	    Ryder Lee <ryder.lee@mediatek.com>
	CC: Philipp Zabel <p.zabel@pengutronix.de>, Matthias Brugger <matthias.bgg@gmail.com>,
	    <linux-pci@vger.kernel.org>, <linux-mediatek@lists.infradead.org>,
	    <devicetree@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	    <linux-arm-kernel@lists.infradead.org>, Jianjun Wang <jianjun.wang@mediatek.com>,
	    <youlin.pei@mediatek.com>, <chuanjia.liu@mediatek.com>, <qizhong.cheng@mediatek.com>,
	    <sin_jieyang@mediatek.com>, <drinkcat@chromium.org>, <Rex-BC.Chen@mediatek.com>,
	    <anson.chuang@mediatek.com>, Krzysztof Wilczyński <kw@linux.com>,
	    Pali Rohár <pali@kernel.org>
	Subject: [v9,0/7] PCI: mediatek: Add new generation controller support
	Date: Wed, 24 Mar 2021 11:05:03 +0800
	Message-ID: <20210324030510.29177-1-jianjun.wang@mediatek.com>
	MIME-Version: 1.0
	Content-Type: text/plain
	Precedence: bulk

Jianjun Wang (王建军) March 24, 2021, 3:05 a.m. UTC

This patch series adds pcie-mediatek-gen3.c and a dt-bindings file to
support the new-generation PCIe controller.

Changes in v9:
1. Use mtk_pcie_parse_port() to get the hw resources;
2. Remove unnecessary logs;
3. Add local IRQ enable status save/restore instead of
   the enable/disable callbacks for suspend/resume;
4. Fix typos.

Changes in v8:
1. Add irq_lock to protect IRQ register access;
2. Mask all INTx interrupts when starting up the port;
3. Remove activate/deactivate callbacks from bottom_domain_ops;
4. Add unmask/mask callbacks in mtk_msi_bottom_irq_chip;
5. Add property information for reg-names.

Changes in v7:
1. Split the driver patch to core PCIe, INTx, MSI and PM patches;
2. Reshape MSI init and handle flow, use msi_bottom_domain to cover all sets;
3. Replace readl/writel with their relaxed version;
4. Add MSI description in binding document;
5. Add pl_250m clock in binding document.

Changes in v6:
1. Export pci_pio_to_address() to support compiling as kernel module;
2. Replace usleep_range(100 * 1000, 120 * 1000) with msleep(100);
3. Replace dev_notice with dev_err;
4. Fix MSI get hwirq flow;
5. Fix warning for possible recursive locking in mtk_pcie_set_affinity.

Changes in v5:
1. Remove unused macros
2. Modify the config read/write callbacks, set the config byte field
   in TLP header and use pci_generic_config_read32/write32
   to access the config space
3. Fix the translation window settings so that both MEM and IO regions
   work properly
4. Fix typos

Changes in v4:
1. Fix PCIe power up/down flow
2. Use "mac" and "phy" for reset names
3. Add clock names
4. Fix the variable types

Changes in v3:
1. Remove standard property in binding document
2. Return error number when get_optional* API throws an error
3. Use the bulk clk APIs

Changes in v2:
1. Fix the typo in the dt-bindings patch
2. Remove the unnecessary properties in binding document
3. Dispose of the IRQ mappings of the MSI top domain on IRQ teardown

Jianjun Wang (7):
  dt-bindings: PCI: mediatek-gen3: Add YAML schema
  PCI: Export pci_pio_to_address() for module use
  PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192
  PCI: mediatek-gen3: Add INTx support
  PCI: mediatek-gen3: Add MSI support
  PCI: mediatek-gen3: Add system PM support
  MAINTAINERS: Add Jianjun Wang as MediaTek PCI co-maintainer

 .../bindings/pci/mediatek-pcie-gen3.yaml      |  181 +++
 MAINTAINERS                                   |    1 +
 drivers/pci/controller/Kconfig                |   13 +
 drivers/pci/controller/Makefile               |    1 +
 drivers/pci/controller/pcie-mediatek-gen3.c   | 1025 +++++++++++++++++
 drivers/pci/pci.c                             |    1 +
 6 files changed, 1222 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/mediatek-pcie-gen3.yaml
 create mode 100644 drivers/pci/controller/pcie-mediatek-gen3.c

Pali Rohár March 24, 2021, 9:09 a.m. UTC | #1

On Wednesday 24 March 2021 11:05:05 Jianjun Wang wrote:
> This interface will be used by PCI host drivers for PIO translation,
> export it to support compiling those drivers as kernel modules.
> 
> Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com>
> ---
>  drivers/pci/pci.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 16a17215f633..12bba221c9f2 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4052,6 +4052,7 @@ phys_addr_t pci_pio_to_address(unsigned long pio)
>  
>  	return address;
>  }
> +EXPORT_SYMBOL(pci_pio_to_address);

Hello! I'm not sure EXPORT_SYMBOL is correct here, because the file has a
GPL-2.0 header. Shouldn't only EXPORT_SYMBOL_GPL be used in this case?
Maybe other people know what is correct?

>  
>  unsigned long __weak pci_address_to_pio(phys_addr_t address)
>  {
> -- 
> 2.25.1
>

Pali Rohár March 27, 2021, 7:28 p.m. UTC | #2

On Wednesday 24 March 2021 11:05:08 Jianjun Wang wrote:
> +static void mtk_pcie_msi_handler(struct mtk_pcie_port *port, int set_idx)
> +{
> +	struct mtk_msi_set *msi_set = &port->msi_sets[set_idx];
> +	unsigned long msi_enable, msi_status;
> +	unsigned int virq;
> +	irq_hw_number_t bit, hwirq;
> +
> +	msi_enable = readl_relaxed(msi_set->base + PCIE_MSI_SET_ENABLE_OFFSET);
> +
> +	do {
> +		msi_status = readl_relaxed(msi_set->base +
> +					   PCIE_MSI_SET_STATUS_OFFSET);
> +		msi_status &= msi_enable;
> +		if (!msi_status)
> +			break;
> +
> +		for_each_set_bit(bit, &msi_status, PCIE_MSI_IRQS_PER_SET) {
> +			hwirq = bit + set_idx * PCIE_MSI_IRQS_PER_SET;
> +			virq = irq_find_mapping(port->msi_bottom_domain, hwirq);
> +			generic_handle_irq(virq);
> +		}
> +	} while (true);

Hello!

Just a question: can't this while loop block the processing of other
interrupts?

I have done tests with different HW (aardvark) but with the same
while(true) loop logic. One XHCI PCIe controller was sending MSI
interrupts too fast, and the interrupt handler with this while(true)
logic got stuck in an infinite loop. During one IRQ it kept calling
generic_handle_irq() endlessly, as the HW kept feeding new MSI hwirqs
into the status register.

But this is different HW, so it may behave differently and need not
cause the above issue.

I have just spotted the same code pattern for processing MSI interrupts...

> +}
> +
>  static void mtk_pcie_irq_handler(struct irq_desc *desc)
>  {
>  	struct mtk_pcie_port *port = irq_desc_get_handler_data(desc);
> @@ -405,6 +673,14 @@ static void mtk_pcie_irq_handler(struct irq_desc *desc)
>  		generic_handle_irq(virq);
>  	}
>  
> +	irq_bit = PCIE_MSI_SHIFT;
> +	for_each_set_bit_from(irq_bit, &status, PCIE_MSI_SET_NUM +
> +			      PCIE_MSI_SHIFT) {
> +		mtk_pcie_msi_handler(port, irq_bit - PCIE_MSI_SHIFT);
> +
> +		writel_relaxed(BIT(irq_bit), port->base + PCIE_INT_STATUS_REG);
> +	}
> +
>  	chained_irq_exit(irqchip, desc);
>  }
>  
> -- 
> 2.25.1
>

Marc Zyngier March 27, 2021, 7:44 p.m. UTC | #3

On Sat, 27 Mar 2021 19:28:37 +0000,
Pali Rohár <pali@kernel.org> wrote:
> 
> On Wednesday 24 March 2021 11:05:08 Jianjun Wang wrote:
> > +static void mtk_pcie_msi_handler(struct mtk_pcie_port *port, int set_idx)
> > +{
> > +	struct mtk_msi_set *msi_set = &port->msi_sets[set_idx];
> > +	unsigned long msi_enable, msi_status;
> > +	unsigned int virq;
> > +	irq_hw_number_t bit, hwirq;
> > +
> > +	msi_enable = readl_relaxed(msi_set->base + PCIE_MSI_SET_ENABLE_OFFSET);
> > +
> > +	do {
> > +		msi_status = readl_relaxed(msi_set->base +
> > +					   PCIE_MSI_SET_STATUS_OFFSET);
> > +		msi_status &= msi_enable;
> > +		if (!msi_status)
> > +			break;
> > +
> > +		for_each_set_bit(bit, &msi_status, PCIE_MSI_IRQS_PER_SET) {
> > +			hwirq = bit + set_idx * PCIE_MSI_IRQS_PER_SET;
> > +			virq = irq_find_mapping(port->msi_bottom_domain, hwirq);
> > +			generic_handle_irq(virq);
> > +		}
> > +	} while (true);
> 
> Hello!
> 
> Just a question: can't this while loop block the processing of other
> interrupts?

This is a level interrupt. You don't have much choice but to handle it
immediately, although an alternative would be to mask it and deal with
it in a thread. And since Linux doesn't deal with interrupt priority,
a screaming interrupt is never a good thing.

> I have done tests with different HW (aardvark) but with the same
> while(true) loop logic. One XHCI PCIe controller was sending MSI
> interrupts too fast, and the interrupt handler with this while(true)
> logic got stuck in an infinite loop. During one IRQ it kept calling
> generic_handle_irq() endlessly, as the HW kept feeding new MSI hwirqs
> into the status register.

Define "too fast". If something in the system is able to program the
XHCI device in such a way that it causes a screaming interrupt, that's
the place to look for problems, and probably not in the interrupt
handling itself, which does what it is supposed to do.

> But this is different HW, so it may behave differently and need not
> cause the above issue.
> 
> I have just spotted the same code pattern for processing MSI interrupts...

This is a common pattern that you will find in pretty much any
interrupt handling/demuxing, and is done this way when the cost of
taking the exception is high compared to that of handling it.

Which is pretty much any of the badly designed, level-driving,
DW-inspired, sorry excuse for MSI implementations that are popular on
low-end ARM SoCs.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

Pali Rohár March 27, 2021, 8:29 p.m. UTC | #4

On Saturday 27 March 2021 19:44:30 Marc Zyngier wrote:
> On Sat, 27 Mar 2021 19:28:37 +0000,
> Pali Rohár <pali@kernel.org> wrote:
> > 
> > On Wednesday 24 March 2021 11:05:08 Jianjun Wang wrote:
> > > +static void mtk_pcie_msi_handler(struct mtk_pcie_port *port, int set_idx)
> > > +{
> > > +	struct mtk_msi_set *msi_set = &port->msi_sets[set_idx];
> > > +	unsigned long msi_enable, msi_status;
> > > +	unsigned int virq;
> > > +	irq_hw_number_t bit, hwirq;
> > > +
> > > +	msi_enable = readl_relaxed(msi_set->base + PCIE_MSI_SET_ENABLE_OFFSET);
> > > +
> > > +	do {
> > > +		msi_status = readl_relaxed(msi_set->base +
> > > +					   PCIE_MSI_SET_STATUS_OFFSET);
> > > +		msi_status &= msi_enable;
> > > +		if (!msi_status)
> > > +			break;
> > > +
> > > +		for_each_set_bit(bit, &msi_status, PCIE_MSI_IRQS_PER_SET) {
> > > +			hwirq = bit + set_idx * PCIE_MSI_IRQS_PER_SET;
> > > +			virq = irq_find_mapping(port->msi_bottom_domain, hwirq);
> > > +			generic_handle_irq(virq);
> > > +		}
> > > +	} while (true);
> > 
> > Hello!
> > 
> > Just a question: can't this while loop block the processing of other
> > interrupts?
> 
> This is a level interrupt. You don't have much choice but to handle it
> immediately, although an alternative would be to mask it and deal with
> it in a thread. And since Linux doesn't deal with interrupt priority,
> a screaming interrupt is never a good thing.

I see. Something like "interrupt priority" (which does not exist?) would
be needed to handle it.

> > I have done tests with different HW (aardvark) but with the same
> > while(true) loop logic. One XHCI PCIe controller was sending MSI
> > interrupts too fast, and the interrupt handler with this while(true)
> > logic got stuck in an infinite loop. During one IRQ it kept calling
> > generic_handle_irq() endlessly, as the HW kept feeding new MSI hwirqs
> > into the status register.
> 
> Define "too fast".

Fast: the next interrupt arrives before the handler checks whether the
while(true) loop should stop.

> If something in the system is able to program the
> XHCI device in such a way that it causes a screaming interrupt, that's
> the place to look for problems, and probably not in the interrupt
> handling itself, which does what it is supposed to do.
> 
> > But this is different HW, so it may behave differently and need not
> > cause the above issue.
> > 
> > I have just spotted the same code pattern for processing MSI interrupts...
> 
> This is a common pattern that you will find in pretty much any
> interrupt handling/demuxing, and is done this way when the cost of
> taking the exception is high compared to that of handling it.

And wouldn't it help if the while(true) loop were replaced by a loop with an
upper limit on iterations? Or by just a single iteration?

> Which is pretty much any of the badly designed, level-driving,
> DW-inspired, sorry excuse for MSI implementations that are popular on
> low-end ARM SoCs.

OK. Thank you for the information!

> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

Marc Zyngier March 27, 2021, 9:45 p.m. UTC | #5

On Sat, 27 Mar 2021 20:29:04 +0000,
Pali Rohár <pali@kernel.org> wrote:
> 
> On Saturday 27 March 2021 19:44:30 Marc Zyngier wrote:
> > On Sat, 27 Mar 2021 19:28:37 +0000,
> > Pali Rohár <pali@kernel.org> wrote:
> > > 
> > > On Wednesday 24 March 2021 11:05:08 Jianjun Wang wrote:
> > > > +static void mtk_pcie_msi_handler(struct mtk_pcie_port *port, int set_idx)
> > > > +{
> > > > +	struct mtk_msi_set *msi_set = &port->msi_sets[set_idx];
> > > > +	unsigned long msi_enable, msi_status;
> > > > +	unsigned int virq;
> > > > +	irq_hw_number_t bit, hwirq;
> > > > +
> > > > +	msi_enable = readl_relaxed(msi_set->base + PCIE_MSI_SET_ENABLE_OFFSET);
> > > > +
> > > > +	do {
> > > > +		msi_status = readl_relaxed(msi_set->base +
> > > > +					   PCIE_MSI_SET_STATUS_OFFSET);
> > > > +		msi_status &= msi_enable;
> > > > +		if (!msi_status)
> > > > +			break;
> > > > +
> > > > +		for_each_set_bit(bit, &msi_status, PCIE_MSI_IRQS_PER_SET) {
> > > > +			hwirq = bit + set_idx * PCIE_MSI_IRQS_PER_SET;
> > > > +			virq = irq_find_mapping(port->msi_bottom_domain, hwirq);
> > > > +			generic_handle_irq(virq);
> > > > +		}
> > > > +	} while (true);
> > > 
> > > Hello!
> > > 
> > > Just a question: can't this while loop block the processing of other
> > > interrupts?
> > 
> > This is a level interrupt. You don't have much choice but to handle it
> > immediately, although an alternative would be to mask it and deal with
> > it in a thread. And since Linux doesn't deal with interrupt priority,
> > a screaming interrupt is never a good thing.
> 
> I see. Something like "interrupt priority" (which does not exist?) would
> be needed to handle it.

Interrupt priorities definitely exist, but Linux doesn't use
them. Furthermore, this wouldn't be relevant here, as you get a bunch
of MSIs multiplexed onto a single one. Where would you apply the
priority?
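
As an illustration of the multiplexing described here, the following userspace C sketch models how several MSI sets are demultiplexed behind one parent interrupt, using the same `hwirq = bit + set_idx * PCIE_MSI_IRQS_PER_SET` arithmetic as the quoted handler. The sizes (32 vectors per set, 8 sets), the `fake_` names and the array-backed "registers" are assumptions made for the example; they are not stated anywhere in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes, for illustration only; the thread does not state them. */
#define FAKE_MSI_IRQS_PER_SET 32u
#define FAKE_MSI_SET_NUM      8u

/* Same arithmetic as the quoted handler: one flat hwirq space across sets. */
static unsigned int fake_msi_hwirq(unsigned int set_idx, unsigned int bit)
{
	return bit + set_idx * FAKE_MSI_IRQS_PER_SET;
}

/*
 * Walk every set behind the single parent interrupt and collect the hwirqs
 * that are both pending and enabled. Returns the number of entries written
 * to out[] (at most cap).
 */
static size_t fake_msi_collect(const uint32_t status[FAKE_MSI_SET_NUM],
			       const uint32_t enable[FAKE_MSI_SET_NUM],
			       unsigned int *out, size_t cap)
{
	size_t n = 0;

	for (unsigned int set = 0; set < FAKE_MSI_SET_NUM; set++) {
		uint32_t pending = status[set] & enable[set];

		for (unsigned int bit = 0; bit < FAKE_MSI_IRQS_PER_SET; bit++) {
			if (!(pending & (UINT32_C(1) << bit)))
				continue;
			if (n == cap)
				return n;
			out[n++] = fake_msi_hwirq(set, bit);
		}
	}
	return n;
}
```

In this model every pending vector surfaces through the same parent line and is visited in plain set/bit order, so there is no per-vector point at which a hardware priority could be applied.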

> 
> > > I have done tests with different HW (aardvark) but with the same
> > > while(true) loop logic. One XHCI PCIe controller was sending MSI
> > > interrupts too fast, and the interrupt handler with this while(true)
> > > logic got stuck in an infinite loop. During one IRQ it kept calling
> > > generic_handle_irq() endlessly, as the HW kept feeding new MSI hwirqs
> > > into the status register.
> > 
> > Define "too fast".
> 
> Fast: the next interrupt arrives before the handler checks whether the
> while(true) loop should stop.

That's definitely not something you can easily fix at the interrupt
handling level. You need to prevent this from happening. That's
usually the result of a misprogramming or a HW bug.

> > If something in the system is able to program the
> > XHCI device in such a way that it causes a screaming interrupt, that's
> > the place to look for problems, and probably not in the interrupt
> > handling itself, which does what it is supposed to do.
> > 
> > > But this is different HW, so it may behave differently and need not
> > > cause the above issue.
> > > 
> > > I have just spotted the same code pattern for processing MSI interrupts...
> > 
> > This is a common pattern that you will find in pretty much any
> > interrupt handling/demuxing, and is done this way when the cost of
> > taking the exception is high compared to that of handling it.
> 
> And wouldn't it help if the while(true) loop were replaced by a loop with an
> upper limit on iterations? Or by just a single iteration?

That wouldn't change much: you would still have the interrupt being
pending, and it would fire again at the earliest opportunity.

At best, the root interrupt controller is able to present you with
another interrupt before forcing you to deal with the one you have
ignored again. But you cannot rely on that either.

And to be honest, other interrupts are only a part of the problem you
are describing. With a screaming interrupt, you can't execute
userspace. This is as bad as it gets.
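
The point about bounding the loop can be sketched with a toy userspace model in C. Everything here is hypothetical (the `fake_` names, the single status bit, the "refill" counter standing in for a device that re-raises its MSI as soon as it is serviced); it is not the MediaTek or aardvark code, only a way to count what a bounded drain loop changes.

```c
#include <limits.h> /* ULONG_MAX, for an effectively unbounded run */

/* Toy model of a level-triggered MSI status register (hypothetical). */
struct fake_msi_src {
	unsigned long status;  /* pending bits; non-zero keeps the line asserted */
	unsigned long refills; /* times the device will re-raise bit 0 */
};

struct fake_sim_result {
	unsigned long handler_entries; /* times the top-level handler ran */
	unsigned long handled;         /* individual MSIs serviced */
	unsigned long user_steps;      /* turns left over for anything else */
};

/* Service bit 0; a "screaming" device immediately raises it again. */
static void fake_service(struct fake_msi_src *src, struct fake_sim_result *res)
{
	src->status &= ~1UL;
	res->handled++;
	if (src->refills) {
		src->refills--;
		src->status |= 1UL;
	}
}

/* Drain loop with an upper bound on passes instead of while (true). */
static void fake_bounded_handler(struct fake_msi_src *src,
				 struct fake_sim_result *res,
				 unsigned long max_passes)
{
	res->handler_entries++;
	for (unsigned long pass = 0; pass < max_passes; pass++) {
		if (!src->status)
			break;
		fake_service(src, res);
	}
}

/*
 * Each turn of the outer loop is one opportunity for the CPU to do work.
 * While the level line is asserted the handler wins the turn; only turns
 * with the line deasserted would count as user_steps.
 */
static struct fake_sim_result fake_simulate(unsigned long max_passes,
					    unsigned long refills)
{
	struct fake_msi_src src = { .status = 1UL, .refills = refills };
	struct fake_sim_result res = { 0, 0, 0 };

	if (!max_passes)
		max_passes = 1; /* a zero bound would never make progress */

	while (src.status || src.refills) {
		if (src.status)
			fake_bounded_handler(&src, &res, max_passes);
		else
			res.user_steps++;
	}
	return res;
}
```

With 1000 re-raises, `fake_simulate(4, 1000)` enters the handler 251 times and `fake_simulate(ULONG_MAX, 1000)` enters it once; both service 1001 interrupts and both leave `user_steps` at 0. In this model the bound only changes how often the handler is re-entered, not how much time is left for anything else while the source keeps screaming.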

	M.

-- 
Without deviation from the norm, progress is not possible.

Lorenzo Pieralisi April 13, 2021, 9:53 a.m. UTC | #6

On Wed, Mar 24, 2021 at 10:09:42AM +0100, Pali Rohár wrote:
> On Wednesday 24 March 2021 11:05:05 Jianjun Wang wrote:
> > This interface will be used by PCI host drivers for PIO translation,
> > export it to support compiling those drivers as kernel modules.
> > 
> > Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com>
> > ---
> >  drivers/pci/pci.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 16a17215f633..12bba221c9f2 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -4052,6 +4052,7 @@ phys_addr_t pci_pio_to_address(unsigned long pio)
> >  
> >  	return address;
> >  }
> > +EXPORT_SYMBOL(pci_pio_to_address);
> 
> Hello! I'm not sure EXPORT_SYMBOL is correct here, because the file has a
> GPL-2.0 header. Shouldn't only EXPORT_SYMBOL_GPL be used in this case?
> Maybe other people know what is correct?

I think this should be EXPORT_SYMBOL_GPL(), I can make this change
but this requires Bjorn's ACK to go upstream (Bjorn, it is my fault,
it was assigned to me on patchwork, now updated, please have a look).

Thanks,
Lorenzo

> >  
> >  unsigned long __weak pci_address_to_pio(phys_addr_t address)
> >  {
> > -- 
> > 2.25.1
> >

Bjorn Helgaas April 16, 2021, 7:21 p.m. UTC | #7

On Wed, Mar 24, 2021 at 11:05:03AM +0800, Jianjun Wang wrote:
> This patch series adds pcie-mediatek-gen3.c and a dt-bindings file to
> support the new-generation PCIe controller.

Incidental: b4 doesn't work on this thread, I suspect because the
usual subject line format is:

  [PATCH v9 0/7]

instead of:

  [v9,0/7]

For b4 info, see https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/README.rst
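
To make the difference between the two subject formats concrete, here is a small C sketch that checks whether a subject starts with a bracketed tag containing the word PATCH. It only illustrates the convention: the helper name `has_patch_tag` is made up for this example, and this is not how b4 itself parses subjects.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the subject begins with a "[...]" tag that contains the
 * word "PATCH", as in "[PATCH v9 0/7] ...". Hypothetical helper for
 * illustration; not taken from b4.
 */
static bool has_patch_tag(const char *subject)
{
	const char *close;
	size_t tag_len;

	if (!subject || subject[0] != '[')
		return false;

	close = strchr(subject, ']');
	if (!close)
		return false;

	/* Search only inside the brackets, not in the rest of the subject. */
	tag_len = (size_t)(close - subject) - 1;
	for (size_t i = 0; i + 5 <= tag_len; i++) {
		if (memcmp(subject + 1 + i, "PATCH", 5) == 0)
			return true;
	}
	return false;
}
```

Under this check a subject such as "[PATCH v9 0/7] PCI: mediatek: ..." passes, while "[v9,0/7] PCI: mediatek: ..." does not, because its bracketed tag carries only the version and patch count.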

Bjorn Helgaas April 16, 2021, 7:24 p.m. UTC | #8

On Tue, Apr 13, 2021 at 10:53:05AM +0100, Lorenzo Pieralisi wrote:
> On Wed, Mar 24, 2021 at 10:09:42AM +0100, Pali Rohár wrote:
> > On Wednesday 24 March 2021 11:05:05 Jianjun Wang wrote:
> > > This interface will be used by PCI host drivers for PIO translation,
> > > export it to support compiling those drivers as kernel modules.
> > > 
> > > Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com>
> > > ---
> > >  drivers/pci/pci.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 16a17215f633..12bba221c9f2 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -4052,6 +4052,7 @@ phys_addr_t pci_pio_to_address(unsigned long pio)
> > >  
> > >  	return address;
> > >  }
> > > +EXPORT_SYMBOL(pci_pio_to_address);
> > 
> > Hello! I'm not sure EXPORT_SYMBOL is correct here, because the file has a
> > GPL-2.0 header. Shouldn't only EXPORT_SYMBOL_GPL be used in this case?
> > Maybe other people know what is correct?
> 
> I think this should be EXPORT_SYMBOL_GPL(), I can make this change
> but this requires Bjorn's ACK to go upstream (Bjorn, it is my fault,
> it was assigned to me on patchwork, now updated, please have a look).

Yep, looks good to me, and I agree it should be EXPORT_SYMBOL_GPL().

Acked-by: Bjorn Helgaas <bhelgaas@google.com>


> > >  
> > >  unsigned long __weak pci_address_to_pio(phys_addr_t address)
> > >  {
> > > -- 
> > > 2.25.1
> > >

Lorenzo Pieralisi April 19, 2021, 10:44 a.m. UTC | #9

On Fri, Apr 16, 2021 at 02:21:00PM -0500, Bjorn Helgaas wrote:
> On Wed, Mar 24, 2021 at 11:05:03AM +0800, Jianjun Wang wrote:
> > This patch series adds pcie-mediatek-gen3.c and a dt-bindings file to
> > support the new-generation PCIe controller.
> 
> Incidental: b4 doesn't work on this thread, I suspect because the
> usual subject line format is:
> 
>   [PATCH v9 0/7]
> 
> instead of:
> 
>   [v9,0/7]
> 
> For b4 info, see https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/README.rst

Jianjun will update the series accordingly (and please add to v10 the
review tags you received).

Lorenzo

Jianjun Wang (王建军) April 20, 2021, 2:05 a.m. UTC | #10

On Mon, 2021-04-19 at 11:44 +0100, Lorenzo Pieralisi wrote:
> On Fri, Apr 16, 2021 at 02:21:00PM -0500, Bjorn Helgaas wrote:
> > On Wed, Mar 24, 2021 at 11:05:03AM +0800, Jianjun Wang wrote:
> > > This patch series adds pcie-mediatek-gen3.c and a dt-bindings file to
> > > support the new-generation PCIe controller.
> > 
> > Incidental: b4 doesn't work on this thread, I suspect because the
> > usual subject line format is:
> > 
> >   [PATCH v9 0/7]
> > 
> > instead of:
> > 
> >   [v9,0/7]
> > 
> > For b4 info, see https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/README.rst
> 
> Jianjun will update the series accordingly (and please add to v10 the
> review tags you received).
> 
> Lorenzo

Yes, I will update this series in v10 to fix the subject line format and
use EXPORT_SYMBOL_GPL(), thanks for your comments.

Thanks.
