diff mbox series

[v8,3/7] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192

Message ID 20210224061132.26526-4-jianjun.wang@mediatek.com
State New
Headers show
Series [v8,1/7] dt-bindings: PCI: mediatek-gen3: Add YAML schema | expand

Commit Message

Jianjun Wang (王建军) Feb. 24, 2021, 6:11 a.m. UTC
MediaTek's PCIe host controller has three generation HWs, the new
generation HW is an individual bridge, it supports Gen3 speed and
compatible with Gen2, Gen1 speed.

Add support for new Gen3 controller which can be found on MT8192.

Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
---
 drivers/pci/controller/Kconfig              |  13 +
 drivers/pci/controller/Makefile             |   1 +
 drivers/pci/controller/pcie-mediatek-gen3.c | 457 ++++++++++++++++++++
 3 files changed, 471 insertions(+)
 create mode 100644 drivers/pci/controller/pcie-mediatek-gen3.c

Comments

Jianjun Wang (王建军) Feb. 25, 2021, 3:07 a.m. UTC | #1
Hi Krzysztof,

Thanks for your review, I will fix these at next version.

Thanks.

On Wed, 2021-02-24 at 14:36 +0100, Krzysztof Wilczyński wrote:
> Hi Jianjun,

> 

> Thank you for all the work here!

> 

> [...]

> > + * struct mtk_pcie_port - PCIe port information

> > + * @dev: pointer to PCIe device

> > + * @base: IO mapped register base

> > + * @reg_base: Physical register base

> > + * @mac_reset: mac reset control

> > + * @phy_reset: phy reset control

> > + * @phy: PHY controller block

> > + * @clks: PCIe clocks

> > + * @num_clks: PCIe clocks count for this port

> 

> It would be "MAC" and "PHY" in the above.

> 

> [...]

> > + * mtk_pcie_config_tlp_header

> > + * @bus: PCI bus to query

> > + * @devfn: device/function number

> > + * @where: offset in config space

> > + * @size: data size in TLP header

> > + *

> > + * Set byte enable field and device information in configuration TLP header.

> 

> The kernel-doc above might be missing brief function description.  See

> the following for more concrete example:

> 

>   https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation

> 

> [...]

> > +static int mtk_pcie_set_trans_table(struct mtk_pcie_port *port,

> > +				    resource_size_t cpu_addr,

> > +				    resource_size_t pci_addr,

> > +				    resource_size_t size,

> > +				    unsigned long type, int num)

> > +{

> > +	void __iomem *table;

> > +	u32 val;

> > +

> > +	if (num >= PCIE_MAX_TRANS_TABLES) {

> > +		dev_err(port->dev, "not enough translate table[%d] for addr: %#llx, limited to [%d]\n",

> 

> The wording of this error message is a little confusing.

> 

> > +			num, (unsigned long long) cpu_addr,

> 

> No space between the bracket and the variable name.

> 

> [...]

> > +	err = phy_init(port->phy);

> > +	if (err) {

> > +		dev_err(dev, "failed to initialize PCIe phy\n");

> > +		goto err_phy_init;

> > +	}

> > +

> > +	err = phy_power_on(port->phy);

> > +	if (err) {

> > +		dev_err(dev, "failed to power on PCIe phy\n");

> > +		goto err_phy_on;

> > +	}

> [...]

> 

> It would be "PHY" in the error messages above.

> 

> [...]

> > +	if (err) {

> > +		dev_err(dev, "clock init failed\n");

> > +		goto err_clk_init;

> > +	}

> [...]

> 

> A nitpick, so feel free to ignore it, of course.  What about "failed to

> initialize clock" to keep the style consistent.

> 

> [...]

> > +	err = mtk_pcie_startup_port(port);

> > +	if (err) {

> > +		dev_err(dev, "PCIe startup failed\n");

> [...]

> 

> Also a nitpick.  What about "failed to bring PCIe link up"?

> 

> Krzysztof
Pali Rohár March 11, 2021, 12:38 p.m. UTC | #2
On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote:
> +static int mtk_pcie_startup_port(struct mtk_pcie_port *port)

> +{

...
> +

> +	/* Delay 100ms to wait the reference clocks become stable */

> +	msleep(100);

> +

> +	/* De-assert PERST# signal */

> +	val &= ~PCIE_PE_RSTB;

> +	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);


Hello! This is a new driver which introduce yet another custom timeout
prior PERST# signal for PCIe card is de-asserted. Timeouts for other
drivers I collected in older email [2].

Please look at my email [1] about PCIe Warm Reset if you have any clue
about it. Lorenzo and Rob already expressed that this timeout should not
be driver specific. But nobody was able to "decode" and "understand"
PCIe spec yet about these timeouts.

> +

> +	/* Check if the link is up or not */

> +	err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val,

> +				 !!(val & PCIE_PORT_LINKUP), 20,

> +				 50 * USEC_PER_MSEC);


IIRC, you need to wait at least 100ms after de-asserting PERST# signal
as it is required by PCIe specs and also because experiments proved that
some Compex wifi cards (e.g. WLE900VX) are not detected if you do not
wait this minimal time.

> +	if (err) {

> +		val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG);

> +		dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val);

> +		return err;

> +	}


[1] - https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/
[2] - https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/
Jianjun Wang (王建军) March 13, 2021, 7:43 a.m. UTC | #3
On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote:
> On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote:

> > +static int mtk_pcie_startup_port(struct mtk_pcie_port *port)

> > +{

> ...

> > +

> > +	/* Delay 100ms to wait the reference clocks become stable */

> > +	msleep(100);

> > +

> > +	/* De-assert PERST# signal */

> > +	val &= ~PCIE_PE_RSTB;

> > +	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);

> 

> Hello! This is a new driver which introduce yet another custom timeout

> prior PERST# signal for PCIe card is de-asserted. Timeouts for other

> drivers I collected in older email [2].

> 

> Please look at my email [1] about PCIe Warm Reset if you have any clue

> about it. Lorenzo and Rob already expressed that this timeout should not

> be driver specific. But nobody was able to "decode" and "understand"

> PCIe spec yet about these timeouts.


Hi Pali,

I think this is more like a platform specific timeout, which is used to
wait for the reference clocks to become stable and finish the reset flow
of HW blocks.

Here is the steps to start a link training in this HW:

1. Assert all reset signals which including the transaction layer, PIPE
interface and internal bus interface;

2. De-assert reset signals except the PERST#, this will make the
physical layer active and start to output the reference clock, but the
EP device remains in the reset state.
   Before releasing the PERST# signal, the HW blocks needs at least 10ms
to finish the reset flow, and ref-clk needs about 30us to become stable.

3. De-assert PERST# signal, wait LTSSM enter L0 state.

This 100ms timeout is reference to TPVPERL in the PCIe CEM spec. Since
we are in the kernel stage, the power supply has already stabled, this
timeout may not take that long.

> > +

> > +	/* Check if the link is up or not */

> > +	err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val,

> > +				 !!(val & PCIE_PORT_LINKUP), 20,

> > +				 50 * USEC_PER_MSEC);

> 

> IIRC, you need to wait at least 100ms after de-asserting PERST# signal

> as it is required by PCIe specs and also because experiments proved that

> some Compex wifi cards (e.g. WLE900VX) are not detected if you do not

> wait this minimal time.


Yes, this should be 100ms, I will fix it at next version, thanks for
your review.

Thanks.
> 

> > +	if (err) {

> > +		val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG);

> > +		dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val);

> > +		return err;

> > +	}

> 

> [1] - https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/

> [2] - https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/
Pali Rohár March 18, 2021, 12:02 a.m. UTC | #4
On Saturday 13 March 2021 15:43:14 Jianjun Wang wrote:
> On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote:

> > On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote:

> > > +static int mtk_pcie_startup_port(struct mtk_pcie_port *port)

> > > +{

> > ...

> > > +

> > > +	/* Delay 100ms to wait the reference clocks become stable */

> > > +	msleep(100);

> > > +

> > > +	/* De-assert PERST# signal */

> > > +	val &= ~PCIE_PE_RSTB;

> > > +	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);

> > 

> > Hello! This is a new driver which introduce yet another custom timeout

> > prior PERST# signal for PCIe card is de-asserted. Timeouts for other

> > drivers I collected in older email [2].

> > 

> > Please look at my email [1] about PCIe Warm Reset if you have any clue

> > about it. Lorenzo and Rob already expressed that this timeout should not

> > be driver specific. But nobody was able to "decode" and "understand"

> > PCIe spec yet about these timeouts.

> 

> Hi Pali,

> 

> I think this is more like a platform specific timeout, which is used to

> wait for the reference clocks to become stable and finish the reset flow

> of HW blocks.

> 

> Here is the steps to start a link training in this HW:

> 

> 1. Assert all reset signals which including the transaction layer, PIPE

> interface and internal bus interface;

> 

> 2. De-assert reset signals except the PERST#, this will make the

> physical layer active and start to output the reference clock, but the

> EP device remains in the reset state.

>    Before releasing the PERST# signal, the HW blocks needs at least 10ms

> to finish the reset flow, and ref-clk needs about 30us to become stable.

> 

> 3. De-assert PERST# signal, wait LTSSM enter L0 state.

> 

> This 100ms timeout is reference to TPVPERL in the PCIe CEM spec. Since

> we are in the kernel stage, the power supply has already stabled, this

> timeout may not take that long.


I think that this is not platform specific timeout or platform specific
steps. This matches generic steps as defined in PCIe CEM spec, section
2.2.1. Initial Power-Up (G3 to S0).

What is platform specific is just how to achieve these steps.

Am I right?

...

TPVPERL is one of my timeout candidates as minimal required timeout for
Warm Reset. I have wrote it in email:

https://lore.kernel.org/linux-pci/20200430082245.xblvb7xeamm4e336@pali/

But I'm not sure as specially in none diagram is described just warm
reset as defined in mPCIe CEM (3.2.4.3. PERST# Signal).

...

Anyway, I would suggest to define constants for those timeouts. I guess
that in future we could be able to define "generic" timeout constants
which would not be in private driver section, but in some common header
file.

> > > +

> > > +	/* Check if the link is up or not */

> > > +	err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val,

> > > +				 !!(val & PCIE_PORT_LINKUP), 20,

> > > +				 50 * USEC_PER_MSEC);

> > 

> > IIRC, you need to wait at least 100ms after de-asserting PERST# signal

> > as it is required by PCIe specs and also because experiments proved that

> > some Compex wifi cards (e.g. WLE900VX) are not detected if you do not

> > wait this minimal time.

> 

> Yes, this should be 100ms, I will fix it at next version, thanks for

> your review.


In past Bjorn suggested to use msleep(PCI_PM_D3COLD_WAIT); macro for
this step during reviewing aardvark driver.

https://lore.kernel.org/linux-pci/20190426161050.GA189964@google.com/

And next iteration used this PCI_PM_D3COLD_WAIT macro instead of 100:

https://lore.kernel.org/linux-pci/20190522213351.21366-2-repk@triplefau.lt/

> Thanks.

> > 

> > > +	if (err) {

> > > +		val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG);

> > > +		dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val);

> > > +		return err;

> > > +	}

> > 

> > [1] - https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/

> > [2] - https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/

>
Jianjun Wang (王建军) March 18, 2021, 5:48 a.m. UTC | #5
On Thu, 2021-03-18 at 01:02 +0100, Pali Rohár wrote:
> On Saturday 13 March 2021 15:43:14 Jianjun Wang wrote:

> > On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote:

> > > On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote:

> > > > +static int mtk_pcie_startup_port(struct mtk_pcie_port *port)

> > > > +{

> > > ...

> > > > +

> > > > +	/* Delay 100ms to wait the reference clocks become stable */

> > > > +	msleep(100);

> > > > +

> > > > +	/* De-assert PERST# signal */

> > > > +	val &= ~PCIE_PE_RSTB;

> > > > +	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);

> > > 

> > > Hello! This is a new driver which introduce yet another custom timeout

> > > prior PERST# signal for PCIe card is de-asserted. Timeouts for other

> > > drivers I collected in older email [2].

> > > 

> > > Please look at my email [1] about PCIe Warm Reset if you have any clue

> > > about it. Lorenzo and Rob already expressed that this timeout should not

> > > be driver specific. But nobody was able to "decode" and "understand"

> > > PCIe spec yet about these timeouts.

> > 

> > Hi Pali,

> > 

> > I think this is more like a platform specific timeout, which is used to

> > wait for the reference clocks to become stable and finish the reset flow

> > of HW blocks.

> > 

> > Here is the steps to start a link training in this HW:

> > 

> > 1. Assert all reset signals which including the transaction layer, PIPE

> > interface and internal bus interface;

> > 

> > 2. De-assert reset signals except the PERST#, this will make the

> > physical layer active and start to output the reference clock, but the

> > EP device remains in the reset state.

> >    Before releasing the PERST# signal, the HW blocks needs at least 10ms

> > to finish the reset flow, and ref-clk needs about 30us to become stable.

> > 

> > 3. De-assert PERST# signal, wait LTSSM enter L0 state.

> > 

> > This 100ms timeout is reference to TPVPERL in the PCIe CEM spec. Since

> > we are in the kernel stage, the power supply has already stabled, this

> > timeout may not take that long.

> 

> I think that this is not platform specific timeout or platform specific

> steps. This matches generic steps as defined in PCIe CEM spec, section

> 2.2.1. Initial Power-Up (G3 to S0).

> 

> What is platform specific is just how to achieve these steps.

> 

> Am I right?

> 

> ...

> 

> TPVPERL is one of my timeout candidates as minimal required timeout for

> Warm Reset. I have wrote it in email:

> 

> https://lore.kernel.org/linux-pci/20200430082245.xblvb7xeamm4e336@pali/

> 

> But I'm not sure as specially in none diagram is described just warm

> reset as defined in mPCIe CEM (3.2.4.3. PERST# Signal).

> 

> ...

> 

> Anyway, I would suggest to define constants for those timeouts. I guess

> that in future we could be able to define "generic" timeout constants

> which would not be in private driver section, but in some common header

> file.


I agree with this, but I'm not sure if we really need that long time in
the kernel stage, because the power supply has already stable and it's
really impact the boot time, especially when the platform have multi
ports and not connect any EP device, we need to wait 200ms for each port
when system bootup.

For this PCIe controller driver, I would like to change the timeout
value to 10ms to comply with the HW design, and save some boot time.

> 

> > > > +

> > > > +	/* Check if the link is up or not */

> > > > +	err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val,

> > > > +				 !!(val & PCIE_PORT_LINKUP), 20,

> > > > +				 50 * USEC_PER_MSEC);

> > > 

> > > IIRC, you need to wait at least 100ms after de-asserting PERST# signal

> > > as it is required by PCIe specs and also because experiments proved that

> > > some Compex wifi cards (e.g. WLE900VX) are not detected if you do not

> > > wait this minimal time.

> > 

> > Yes, this should be 100ms, I will fix it at next version, thanks for

> > your review.

> 

> In past Bjorn suggested to use msleep(PCI_PM_D3COLD_WAIT); macro for

> this step during reviewing aardvark driver.

> 

> https://lore.kernel.org/linux-pci/20190426161050.GA189964@google.com/

> 

> And next iteration used this PCI_PM_D3COLD_WAIT macro instead of 100:

> 

> https://lore.kernel.org/linux-pci/20190522213351.21366-2-repk@triplefau.lt/


Sure, I will use PCI_PM_D3COLD_WAIT macro instead in the next version.

Thanks.

> 

> > Thanks.

> > > 

> > > > +	if (err) {

> > > > +		val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG);

> > > > +		dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val);

> > > > +		return err;

> > > > +	}

> > > 

> > > [1] - https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/

> > > [2] - https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/

> >
Pali Rohár March 19, 2021, 6:53 p.m. UTC | #6
On Thursday 18 March 2021 13:48:07 Jianjun Wang wrote:
> On Thu, 2021-03-18 at 01:02 +0100, Pali Rohár wrote:

> > On Saturday 13 March 2021 15:43:14 Jianjun Wang wrote:

> > > On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote:

> > > > On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote:

> > > > > +static int mtk_pcie_startup_port(struct mtk_pcie_port *port)

> > > > > +{

> > > > ...

> > > > > +

> > > > > +	/* Delay 100ms to wait the reference clocks become stable */

> > > > > +	msleep(100);

> > > > > +

> > > > > +	/* De-assert PERST# signal */

> > > > > +	val &= ~PCIE_PE_RSTB;

> > > > > +	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);

> > > > 

> > > > Hello! This is a new driver which introduce yet another custom timeout

> > > > prior PERST# signal for PCIe card is de-asserted. Timeouts for other

> > > > drivers I collected in older email [2].

> > > > 

> > > > Please look at my email [1] about PCIe Warm Reset if you have any clue

> > > > about it. Lorenzo and Rob already expressed that this timeout should not

> > > > be driver specific. But nobody was able to "decode" and "understand"

> > > > PCIe spec yet about these timeouts.

> > > 

> > > Hi Pali,

> > > 

> > > I think this is more like a platform specific timeout, which is used to

> > > wait for the reference clocks to become stable and finish the reset flow

> > > of HW blocks.

> > > 

> > > Here is the steps to start a link training in this HW:

> > > 

> > > 1. Assert all reset signals which including the transaction layer, PIPE

> > > interface and internal bus interface;

> > > 

> > > 2. De-assert reset signals except the PERST#, this will make the

> > > physical layer active and start to output the reference clock, but the

> > > EP device remains in the reset state.

> > >    Before releasing the PERST# signal, the HW blocks needs at least 10ms

> > > to finish the reset flow, and ref-clk needs about 30us to become stable.

> > > 

> > > 3. De-assert PERST# signal, wait LTSSM enter L0 state.

> > > 

> > > This 100ms timeout is reference to TPVPERL in the PCIe CEM spec. Since

> > > we are in the kernel stage, the power supply has already stabled, this

> > > timeout may not take that long.

> > 

> > I think that this is not platform specific timeout or platform specific

> > steps. This matches generic steps as defined in PCIe CEM spec, section

> > 2.2.1. Initial Power-Up (G3 to S0).

> > 

> > What is platform specific is just how to achieve these steps.

> > 

> > Am I right?

> > 

> > ...

> > 

> > TPVPERL is one of my timeout candidates as minimal required timeout for

> > Warm Reset. I have wrote it in email:

> > 

> > https://lore.kernel.org/linux-pci/20200430082245.xblvb7xeamm4e336@pali/

> > 

> > But I'm not sure as specially in none diagram is described just warm

> > reset as defined in mPCIe CEM (3.2.4.3. PERST# Signal).

> > 

> > ...

> > 

> > Anyway, I would suggest to define constants for those timeouts. I guess

> > that in future we could be able to define "generic" timeout constants

> > which would not be in private driver section, but in some common header

> > file.

> 

> I agree with this, but I'm not sure if we really need that long time in

> the kernel stage, because the power supply has already stable and it's

> really impact the boot time, especially when the platform have multi

> ports and not connect any EP device, we need to wait 200ms for each port

> when system bootup.


Ports are independent. So you can initialize them in parallel, right?

If you initialize each port in separate worker then during msleep calls
kernel can schedule other kernel thread to run and so it does not
increase boot time. While pcie is sleeping kernel can do other things.
So the result is that whole boot time is not increased, just reordered.

> For this PCIe controller driver, I would like to change the timeout

> value to 10ms to comply with the HW design, and save some boot time.


In case you can connect _any_ PCIe card to your HW then you cannot
decrease or change timeouts required by PCIe specs. Otherwise there can
be a card which would not be initialized correctly.

I'm debugging driver for aardvark PCIe controller and I see that Compex
cards really needs these timeouts, otherwise link is down and card
cannot be detected.

So I guess that there can be also other cards which requires other
timeouts as specified in PCIe specs.

> > 

> > > > > +

> > > > > +	/* Check if the link is up or not */

> > > > > +	err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val,

> > > > > +				 !!(val & PCIE_PORT_LINKUP), 20,

> > > > > +				 50 * USEC_PER_MSEC);

> > > > 

> > > > IIRC, you need to wait at least 100ms after de-asserting PERST# signal

> > > > as it is required by PCIe specs and also because experiments proved that

> > > > some Compex wifi cards (e.g. WLE900VX) are not detected if you do not

> > > > wait this minimal time.

> > > 

> > > Yes, this should be 100ms, I will fix it at next version, thanks for

> > > your review.

> > 

> > In past Bjorn suggested to use msleep(PCI_PM_D3COLD_WAIT); macro for

> > this step during reviewing aardvark driver.

> > 

> > https://lore.kernel.org/linux-pci/20190426161050.GA189964@google.com/

> > 

> > And next iteration used this PCI_PM_D3COLD_WAIT macro instead of 100:

> > 

> > https://lore.kernel.org/linux-pci/20190522213351.21366-2-repk@triplefau.lt/

> 

> Sure, I will use PCI_PM_D3COLD_WAIT macro instead in the next version.

> 

> Thanks.

> 

> > 

> > > Thanks.

> > > > 

> > > > > +	if (err) {

> > > > > +		val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG);

> > > > > +		dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val);

> > > > > +		return err;

> > > > > +	}

> > > > 

> > > > [1] - https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/

> > > > [2] - https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/

> > > 

>
Jianjun Wang (王建军) March 23, 2021, 1:31 a.m. UTC | #7
On Fri, 2021-03-19 at 19:53 +0100, Pali Rohár wrote:
> On Thursday 18 March 2021 13:48:07 Jianjun Wang wrote:

> > On Thu, 2021-03-18 at 01:02 +0100, Pali Rohár wrote:

> > > On Saturday 13 March 2021 15:43:14 Jianjun Wang wrote:

> > > > On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote:

> > > > > On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote:

> > > > > > +static int mtk_pcie_startup_port(struct mtk_pcie_port *port)

> > > > > > +{

> > > > > ...

> > > > > > +

> > > > > > +	/* Delay 100ms to wait the reference clocks become stable */

> > > > > > +	msleep(100);

> > > > > > +

> > > > > > +	/* De-assert PERST# signal */

> > > > > > +	val &= ~PCIE_PE_RSTB;

> > > > > > +	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);

> > > > > 

> > > > > Hello! This is a new driver which introduce yet another custom timeout

> > > > > prior PERST# signal for PCIe card is de-asserted. Timeouts for other

> > > > > drivers I collected in older email [2].

> > > > > 

> > > > > Please look at my email [1] about PCIe Warm Reset if you have any clue

> > > > > about it. Lorenzo and Rob already expressed that this timeout should not

> > > > > be driver specific. But nobody was able to "decode" and "understand"

> > > > > PCIe spec yet about these timeouts.

> > > > 

> > > > Hi Pali,

> > > > 

> > > > I think this is more like a platform specific timeout, which is used to

> > > > wait for the reference clocks to become stable and finish the reset flow

> > > > of HW blocks.

> > > > 

> > > > Here is the steps to start a link training in this HW:

> > > > 

> > > > 1. Assert all reset signals which including the transaction layer, PIPE

> > > > interface and internal bus interface;

> > > > 

> > > > 2. De-assert reset signals except the PERST#, this will make the

> > > > physical layer active and start to output the reference clock, but the

> > > > EP device remains in the reset state.

> > > >    Before releasing the PERST# signal, the HW blocks needs at least 10ms

> > > > to finish the reset flow, and ref-clk needs about 30us to become stable.

> > > > 

> > > > 3. De-assert PERST# signal, wait LTSSM enter L0 state.

> > > > 

> > > > This 100ms timeout is reference to TPVPERL in the PCIe CEM spec. Since

> > > > we are in the kernel stage, the power supply has already stabled, this

> > > > timeout may not take that long.

> > > 

> > > I think that this is not platform specific timeout or platform specific

> > > steps. This matches generic steps as defined in PCIe CEM spec, section

> > > 2.2.1. Initial Power-Up (G3 to S0).

> > > 

> > > What is platform specific is just how to achieve these steps.

> > > 

> > > Am I right?

> > > 

> > > ...

> > > 

> > > TPVPERL is one of my timeout candidates as minimal required timeout for

> > > Warm Reset. I have wrote it in email:

> > > 

> > > https://lore.kernel.org/linux-pci/20200430082245.xblvb7xeamm4e336@pali/

> > > 

> > > But I'm not sure as specially in none diagram is described just warm

> > > reset as defined in mPCIe CEM (3.2.4.3. PERST# Signal).

> > > 

> > > ...

> > > 

> > > Anyway, I would suggest to define constants for those timeouts. I guess

> > > that in future we could be able to define "generic" timeout constants

> > > which would not be in private driver section, but in some common header

> > > file.

> > 

> > I agree with this, but I'm not sure if we really need that long time in

> > the kernel stage, because the power supply has already stable and it's

> > really impact the boot time, especially when the platform have multi

> > ports and not connect any EP device, we need to wait 200ms for each port

> > when system bootup.

> 

> Ports are independent. So you can initialize them in parallel, right?

> 

> If you initialize each port in separate worker then during msleep calls

> kernel can schedule other kernel thread to run and so it does not

> increase boot time. While pcie is sleeping kernel can do other things.

> So the result is that whole boot time is not increased, just reordered.

> 

> > For this PCIe controller driver, I would like to change the timeout

> > value to 10ms to comply with the HW design, and save some boot time.

> 

> In case you can connect _any_ PCIe card to your HW then you cannot

> decrease or change timeouts required by PCIe specs. Otherwise there can

> be a card which would not be initialized correctly.

> 

> I'm debugging driver for aardvark PCIe controller and I see that Compex

> cards really needs these timeouts, otherwise link is down and card

> cannot be detected.

> 

> So I guess that there can be also other cards which requires other

> timeouts as specified in PCIe specs.


OK, I'll keep this timeout value. 

One more question, is there any chance that we can put this linkup flow
to a more "standard" way, such as drivers provides the ops of the PERST#
pin and let the framework to decide how to start a link training, or we
just use macro to replace this timeout value in the future?

Thanks.

> 

> > > 

> > > > > > +

> > > > > > +	/* Check if the link is up or not */

> > > > > > +	err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val,

> > > > > > +				 !!(val & PCIE_PORT_LINKUP), 20,

> > > > > > +				 50 * USEC_PER_MSEC);

> > > > > 

> > > > > IIRC, you need to wait at least 100ms after de-asserting PERST# signal

> > > > > as it is required by PCIe specs and also because experiments proved that

> > > > > some Compex wifi cards (e.g. WLE900VX) are not detected if you do not

> > > > > wait this minimal time.

> > > > 

> > > > Yes, this should be 100ms, I will fix it at next version, thanks for

> > > > your review.

> > > 

> > > In past Bjorn suggested to use msleep(PCI_PM_D3COLD_WAIT); macro for

> > > this step during reviewing aardvark driver.

> > > 

> > > https://lore.kernel.org/linux-pci/20190426161050.GA189964@google.com/

> > > 

> > > And next iteration used this PCI_PM_D3COLD_WAIT macro instead of 100:

> > > 

> > > https://lore.kernel.org/linux-pci/20190522213351.21366-2-repk@triplefau.lt/

> > 

> > Sure, I will use PCI_PM_D3COLD_WAIT macro instead in the next version.

> > 

> > Thanks.

> > 

> > > 

> > > > Thanks.

> > > > > 

> > > > > > +	if (err) {

> > > > > > +		val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG);

> > > > > > +		dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val);

> > > > > > +		return err;

> > > > > > +	}

> > > > > 

> > > > > [1] - https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/

> > > > > [2] - https://lore.kernel.org/linux-pci/20200424092546.25p3hdtkehohe3xw@pali/

> > > > 

> > 

> 

> _______________________________________________

> Linux-mediatek mailing list

> Linux-mediatek@lists.infradead.org

> http://lists.infradead.org/mailman/listinfo/linux-mediatek
Pali Rohár March 23, 2021, 2:51 p.m. UTC | #8
On Tuesday 23 March 2021 09:31:34 Jianjun Wang wrote:
> One more question, is there any chance that we can put this linkup flow

> to a more "standard" way, such as drivers provides the ops of the PERST#

> pin and let the framework to decide how to start a link training, or we

> just use macro to replace this timeout value in the future?


This is something about which I was thinking that could be useful for
pci-aardvark.c driver. But I was not sure if some other driver can
benefit from such "framework". But now I see that your driver is another
candidate which can benefit from it.

Currently there is no such "framework" in kernel and the hardest part
would be to design it.

Having this API would allow kernel to implement and export PCIe Warm
Reset (which is done via PERST# signal) and easily extend Amey's reset
patches to export also Warm Reset via sysfs.

But to implement this framework and using it for reset we first need to
answer questions which I have sent in email:
https://lore.kernel.org/linux-pci/20210310110535.zh4pnn4vpmvzwl5q@pali/

Bjorn, Alex: any opinion about PERST#?

Also see Enrico's email, where confirmed that there are platforms which
shares one PERST# signal for more endpoint cards:
https://lore.kernel.org/linux-pci/1da0fa2c-8056-9ae8-6ce4-ab645317772d@metux.net/
Pali Rohár March 29, 2021, 10:58 p.m. UTC | #9
On Thursday 18 March 2021 13:48:07 Jianjun Wang wrote:
> On Thu, 2021-03-18 at 01:02 +0100, Pali Rohár wrote:

> > On Saturday 13 March 2021 15:43:14 Jianjun Wang wrote:

> > > On Thu, 2021-03-11 at 13:38 +0100, Pali Rohár wrote:

> > > > On Wednesday 24 February 2021 14:11:28 Jianjun Wang wrote:

> > > > > +

> > > > > +	/* Check if the link is up or not */

> > > > > +	err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val,

> > > > > +				 !!(val & PCIE_PORT_LINKUP), 20,

> > > > > +				 50 * USEC_PER_MSEC);

> > > > 

> > > > IIRC, you need to wait at least 100ms after de-asserting PERST# signal

> > > > as it is required by PCIe specs and also because experiments proved that

> > > > some Compex wifi cards (e.g. WLE900VX) are not detected if you do not

> > > > wait this minimal time.

> > > 

> > > Yes, this should be 100ms, I will fix it at next version, thanks for

> > > your review.

> > 

> > In past Bjorn suggested to use msleep(PCI_PM_D3COLD_WAIT); macro for

> > this step during reviewing aardvark driver.

> > 

> > https://lore.kernel.org/linux-pci/20190426161050.GA189964@google.com/

> > 

> > And next iteration used this PCI_PM_D3COLD_WAIT macro instead of 100:

> > 

> > https://lore.kernel.org/linux-pci/20190522213351.21366-2-repk@triplefau.lt/

> 

> Sure, I will use PCI_PM_D3COLD_WAIT macro instead in the next version.

> 

> Thanks.


Anyway, now I found out that kernel has functions for this waiting:
pcie_wait_for_link_delay() and pcie_wait_for_link()

Function is called from pci_bridge_wait_for_secondary_bus().

But in current form it is not usable for native controller drivers.

This looks like another candidate for code de-duplication or providing
"framework".


Lorenzo, as maintainer of native controller drivers, do you have some
ideas about providing "framework", common functions or something for
avoiding to implement same code patterns in every native controller
driver, which is de-facto standard PCIe codepath? Including a way how to
export PERST# reset gpio?
diff mbox series

Patch

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 64e2f5e379aa..b242b17025b3 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -242,6 +242,19 @@  config PCIE_MEDIATEK
 	  Say Y here if you want to enable PCIe controller support on
 	  MediaTek SoCs.
 
+config PCIE_MEDIATEK_GEN3
+	tristate "MediaTek Gen3 PCIe controller"
+	depends on ARCH_MEDIATEK || COMPILE_TEST
+	depends on PCI_MSI_IRQ_DOMAIN
+	help
+	  Adds support for PCIe Gen3 MAC controller for MediaTek SoCs.
+	  This PCIe controller is compatible with Gen3, Gen2 and Gen1 speed,
+	  and support up to 256 MSI interrupt numbers for
+	  multi-function devices.
+
+	  Say Y here if you want to enable Gen3 PCIe controller support on
+	  MediaTek SoCs.
+
 config PCIE_TANGO_SMP8759
 	bool "Tango SMP8759 PCIe controller (DANGEROUS)"
 	depends on ARCH_TANGO && PCI_MSI && OF
diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
index 04c6edc285c5..df5d77d72a9d 100644
--- a/drivers/pci/controller/Makefile
+++ b/drivers/pci/controller/Makefile
@@ -27,6 +27,7 @@  obj-$(CONFIG_PCIE_ROCKCHIP) += pcie-rockchip.o
 obj-$(CONFIG_PCIE_ROCKCHIP_EP) += pcie-rockchip-ep.o
 obj-$(CONFIG_PCIE_ROCKCHIP_HOST) += pcie-rockchip-host.o
 obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
+obj-$(CONFIG_PCIE_MEDIATEK_GEN3) += pcie-mediatek-gen3.o
 obj-$(CONFIG_PCIE_TANGO_SMP8759) += pcie-tango.o
 obj-$(CONFIG_VMD) += vmd.o
 obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o
diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
new file mode 100644
index 000000000000..c602beb9afec
--- /dev/null
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -0,0 +1,457 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * MediaTek PCIe host controller driver.
+ *
+ * Copyright (c) 2020 MediaTek Inc.
+ * Author: Jianjun Wang <jianjun.wang@mediatek.com>
+ */
+
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/iopoll.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/phy/phy.h>
+#include <linux/platform_device.h>
+#include <linux/pm_domain.h>
+#include <linux/pm_runtime.h>
+#include <linux/reset.h>
+
+#include "../pci.h"
+
+#define PCIE_SETTING_REG		0x80
+#define PCIE_PCI_IDS_1			0x9c
+#define PCI_CLASS(class)		(class << 8)
+#define PCIE_RC_MODE			BIT(0)
+
+#define PCIE_CFGNUM_REG			0x140
+#define PCIE_CFG_DEVFN(devfn)		((devfn) & GENMASK(7, 0))
+#define PCIE_CFG_BUS(bus)		(((bus) << 8) & GENMASK(15, 8))
+#define PCIE_CFG_BYTE_EN(bytes)		(((bytes) << 16) & GENMASK(19, 16))
+#define PCIE_CFG_FORCE_BYTE_EN		BIT(20)
+#define PCIE_CFG_OFFSET_ADDR		0x1000
+#define PCIE_CFG_HEADER(bus, devfn) \
+	(PCIE_CFG_BUS(bus) | PCIE_CFG_DEVFN(devfn))
+
+#define PCIE_RST_CTRL_REG		0x148
+#define PCIE_MAC_RSTB			BIT(0)
+#define PCIE_PHY_RSTB			BIT(1)
+#define PCIE_BRG_RSTB			BIT(2)
+#define PCIE_PE_RSTB			BIT(3)
+
+#define PCIE_LTSSM_STATUS_REG		0x150
+
+#define PCIE_LINK_STATUS_REG		0x154
+#define PCIE_PORT_LINKUP		BIT(8)
+
+#define PCIE_TRANS_TABLE_BASE_REG	0x800
+#define PCIE_ATR_SRC_ADDR_MSB_OFFSET	0x4
+#define PCIE_ATR_TRSL_ADDR_LSB_OFFSET	0x8
+#define PCIE_ATR_TRSL_ADDR_MSB_OFFSET	0xc
+#define PCIE_ATR_TRSL_PARAM_OFFSET	0x10
+#define PCIE_ATR_TLB_SET_OFFSET		0x20
+
+#define PCIE_MAX_TRANS_TABLES		8
+#define PCIE_ATR_EN			BIT(0)
+#define PCIE_ATR_SIZE(size) \
+	(((((size) - 1) << 1) & GENMASK(6, 1)) | PCIE_ATR_EN)
+#define PCIE_ATR_ID(id)			((id) & GENMASK(3, 0))
+#define PCIE_ATR_TYPE_MEM		PCIE_ATR_ID(0)
+#define PCIE_ATR_TYPE_IO		PCIE_ATR_ID(1)
+#define PCIE_ATR_TLP_TYPE(type)		(((type) << 16) & GENMASK(18, 16))
+#define PCIE_ATR_TLP_TYPE_MEM		PCIE_ATR_TLP_TYPE(0)
+#define PCIE_ATR_TLP_TYPE_IO		PCIE_ATR_TLP_TYPE(2)
+
+/**
+ * struct mtk_pcie_port - PCIe port information
+ * @dev: pointer to PCIe device
+ * @base: IO mapped register base
+ * @reg_base: Physical register base
+ * @mac_reset: mac reset control
+ * @phy_reset: phy reset control
+ * @phy: PHY controller block
+ * @clks: PCIe clocks
+ * @num_clks: PCIe clocks count for this port
+ */
+struct mtk_pcie_port {
+	struct device *dev;
+	void __iomem *base;
+	phys_addr_t reg_base;
+	struct reset_control *mac_reset;
+	struct reset_control *phy_reset;
+	struct phy *phy;
+	struct clk_bulk_data *clks;
+	int num_clks;
+};
+
+/**
+ * mtk_pcie_config_tlp_header
+ * @bus: PCI bus to query
+ * @devfn: device/function number
+ * @where: offset in config space
+ * @size: data size in TLP header
+ *
+ * Set byte enable field and device information in configuration TLP header.
+ */
+static void mtk_pcie_config_tlp_header(struct pci_bus *bus, unsigned int devfn,
+					int where, int size)
+{
+	struct mtk_pcie_port *port = bus->sysdata;
+	int bytes;
+	u32 val;
+
+	bytes = (GENMASK(size - 1, 0) & 0xf) << (where & 0x3);
+
+	val = PCIE_CFG_FORCE_BYTE_EN | PCIE_CFG_BYTE_EN(bytes) |
+	      PCIE_CFG_HEADER(bus->number, devfn);
+
+	writel_relaxed(val, port->base + PCIE_CFGNUM_REG);
+}
+
+static void __iomem *mtk_pcie_map_bus(struct pci_bus *bus, unsigned int devfn,
+				      int where)
+{
+	struct mtk_pcie_port *port = bus->sysdata;
+
+	return port->base + PCIE_CFG_OFFSET_ADDR + where;
+}
+
+static int mtk_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
+				int where, int size, u32 *val)
+{
+	mtk_pcie_config_tlp_header(bus, devfn, where, size);
+
+	return pci_generic_config_read32(bus, devfn, where, size, val);
+}
+
+static int mtk_pcie_config_write(struct pci_bus *bus, unsigned int devfn,
+				 int where, int size, u32 val)
+{
+	mtk_pcie_config_tlp_header(bus, devfn, where, size);
+
+	if (size <= 2)
+		val <<= (where & 0x3) * 8;
+
+	return pci_generic_config_write32(bus, devfn, where, 4, val);
+}
+
+static struct pci_ops mtk_pcie_ops = {
+	.map_bus = mtk_pcie_map_bus,
+	.read  = mtk_pcie_config_read,
+	.write = mtk_pcie_config_write,
+};
+
+static int mtk_pcie_set_trans_table(struct mtk_pcie_port *port,
+				    resource_size_t cpu_addr,
+				    resource_size_t pci_addr,
+				    resource_size_t size,
+				    unsigned long type, int num)
+{
+	void __iomem *table;
+	u32 val;
+
+	if (num >= PCIE_MAX_TRANS_TABLES) {
+		dev_err(port->dev, "not enough translate table[%d] for addr: %#llx, limited to [%d]\n",
+			num, (unsigned long long) cpu_addr,
+			PCIE_MAX_TRANS_TABLES);
+		return -ENODEV;
+	}
+
+	table = port->base + PCIE_TRANS_TABLE_BASE_REG +
+		num * PCIE_ATR_TLB_SET_OFFSET;
+
+	writel_relaxed(lower_32_bits(cpu_addr) | PCIE_ATR_SIZE(fls(size) - 1),
+		       table);
+	writel_relaxed(upper_32_bits(cpu_addr),
+		       table + PCIE_ATR_SRC_ADDR_MSB_OFFSET);
+	writel_relaxed(lower_32_bits(pci_addr),
+		       table + PCIE_ATR_TRSL_ADDR_LSB_OFFSET);
+	writel_relaxed(upper_32_bits(pci_addr),
+		       table + PCIE_ATR_TRSL_ADDR_MSB_OFFSET);
+
+	if (type == IORESOURCE_IO)
+		val = PCIE_ATR_TYPE_IO | PCIE_ATR_TLP_TYPE_IO;
+	else
+		val = PCIE_ATR_TYPE_MEM | PCIE_ATR_TLP_TYPE_MEM;
+
+	writel_relaxed(val, table + PCIE_ATR_TRSL_PARAM_OFFSET);
+
+	return 0;
+}
+
+static int mtk_pcie_startup_port(struct mtk_pcie_port *port)
+{
+	struct resource_entry *entry;
+	struct pci_host_bridge *host = pci_host_bridge_from_priv(port);
+	unsigned int table_index = 0;
+	int err;
+	u32 val;
+
+	/* Set as RC mode */
+	val = readl_relaxed(port->base + PCIE_SETTING_REG);
+	val |= PCIE_RC_MODE;
+	writel_relaxed(val, port->base + PCIE_SETTING_REG);
+
+	/* Set class code */
+	val = readl_relaxed(port->base + PCIE_PCI_IDS_1);
+	val &= ~GENMASK(31, 8);
+	val |= PCI_CLASS(PCI_CLASS_BRIDGE_PCI << 8);
+	writel_relaxed(val, port->base + PCIE_PCI_IDS_1);
+
+	/* Assert all reset signals */
+	val = readl_relaxed(port->base + PCIE_RST_CTRL_REG);
+	val |= PCIE_MAC_RSTB | PCIE_PHY_RSTB | PCIE_BRG_RSTB | PCIE_PE_RSTB;
+	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);
+
+	/* De-assert reset signals */
+	val &= ~(PCIE_MAC_RSTB | PCIE_PHY_RSTB | PCIE_BRG_RSTB);
+	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);
+
+	/* Delay 100ms to wait the reference clocks become stable */
+	msleep(100);
+
+	/* De-assert PERST# signal */
+	val &= ~PCIE_PE_RSTB;
+	writel_relaxed(val, port->base + PCIE_RST_CTRL_REG);
+
+	/* Check if the link is up or not */
+	err = readl_poll_timeout(port->base + PCIE_LINK_STATUS_REG, val,
+				 !!(val & PCIE_PORT_LINKUP), 20,
+				 50 * USEC_PER_MSEC);
+	if (err) {
+		val = readl_relaxed(port->base + PCIE_LTSSM_STATUS_REG);
+		dev_err(port->dev, "PCIe link down, ltssm reg val: %#x\n", val);
+		return err;
+	}
+
+	/* Set PCIe translation windows */
+	resource_list_for_each_entry(entry, &host->windows) {
+		struct resource *res = entry->res;
+		unsigned long type = resource_type(res);
+		resource_size_t cpu_addr;
+		resource_size_t pci_addr;
+		resource_size_t size;
+		const char *range_type;
+
+		if (type == IORESOURCE_IO) {
+			cpu_addr = pci_pio_to_address(res->start);
+			range_type = "IO";
+		} else if (type == IORESOURCE_MEM) {
+			cpu_addr = res->start;
+			range_type = "MEM";
+		} else {
+			continue;
+		}
+
+		pci_addr = res->start - entry->offset;
+		size = resource_size(res);
+		err = mtk_pcie_set_trans_table(port, cpu_addr, pci_addr, size,
+					       type, table_index);
+		if (err)
+			return err;
+
+		dev_dbg(port->dev, "set %s trans window[%d]: cpu_addr = %#llx, pci_addr = %#llx, size = %#llx\n",
+			range_type, table_index, (unsigned long long) cpu_addr,
+			(unsigned long long) pci_addr,
+			(unsigned long long) size);
+
+		table_index++;
+	}
+
+	return 0;
+}
+
+static int mtk_pcie_clk_init(struct mtk_pcie_port *port)
+{
+	int ret;
+
+	port->num_clks = devm_clk_bulk_get_all(port->dev, &port->clks);
+	if (port->num_clks < 0) {
+		dev_err(port->dev, "failed to get PCIe clock\n");
+		return port->num_clks;
+	}
+
+	ret = clk_bulk_prepare_enable(port->num_clks, port->clks);
+	if (ret) {
+		dev_err(port->dev, "failed to enable PCIe clocks\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+static int mtk_pcie_power_up(struct mtk_pcie_port *port)
+{
+	struct device *dev = port->dev;
+	int err;
+
+	port->phy_reset = devm_reset_control_get_optional_exclusive(dev, "phy");
+	if (IS_ERR(port->phy_reset))
+		return PTR_ERR(port->phy_reset);
+
+	/* PHY power on and enable pipe clock */
+	port->phy = devm_phy_optional_get(dev, "pcie-phy");
+	if (IS_ERR(port->phy))
+		return PTR_ERR(port->phy);
+
+	reset_control_deassert(port->phy_reset);
+
+	err = phy_init(port->phy);
+	if (err) {
+		dev_err(dev, "failed to initialize PCIe phy\n");
+		goto err_phy_init;
+	}
+
+	err = phy_power_on(port->phy);
+	if (err) {
+		dev_err(dev, "failed to power on PCIe phy\n");
+		goto err_phy_on;
+	}
+
+	port->mac_reset = devm_reset_control_get_optional_exclusive(dev, "mac");
+	if (IS_ERR(port->mac_reset)) {
+		err = PTR_ERR(port->mac_reset);
+		goto err_mac_rst;
+	}
+
+	reset_control_deassert(port->mac_reset);
+
+	/* MAC power on and enable transaction layer clocks */
+	pm_runtime_enable(dev);
+	pm_runtime_get_sync(dev);
+
+	err = mtk_pcie_clk_init(port);
+	if (err) {
+		dev_err(dev, "clock init failed\n");
+		goto err_clk_init;
+	}
+
+	return 0;
+
+err_clk_init:
+	pm_runtime_put_sync(dev);
+	pm_runtime_disable(dev);
+	reset_control_assert(port->mac_reset);
+err_mac_rst:
+	phy_power_off(port->phy);
+err_phy_on:
+	phy_exit(port->phy);
+err_phy_init:
+	reset_control_assert(port->phy_reset);
+
+	return err;
+}
+
+static void mtk_pcie_power_down(struct mtk_pcie_port *port)
+{
+	clk_bulk_disable_unprepare(port->num_clks, port->clks);
+
+	pm_runtime_put_sync(port->dev);
+	pm_runtime_disable(port->dev);
+	reset_control_assert(port->mac_reset);
+
+	phy_power_off(port->phy);
+	phy_exit(port->phy);
+	reset_control_assert(port->phy_reset);
+}
+
+static int mtk_pcie_setup(struct mtk_pcie_port *port)
+{
+	struct device *dev = port->dev;
+	struct platform_device *pdev = to_platform_device(dev);
+	struct resource *regs;
+	int err;
+
+	regs = platform_get_resource_byname(pdev, IORESOURCE_MEM, "pcie-mac");
+	port->base = devm_ioremap_resource(dev, regs);
+	if (IS_ERR(port->base)) {
+		dev_err(dev, "failed to map register base\n");
+		return PTR_ERR(port->base);
+	}
+
+	port->reg_base = regs->start;
+
+	/* Don't touch the hardware registers before power up */
+	err = mtk_pcie_power_up(port);
+	if (err)
+		return err;
+
+	/* Try link up */
+	err = mtk_pcie_startup_port(port);
+	if (err) {
+		dev_err(dev, "PCIe startup failed\n");
+		goto err_setup;
+	}
+
+	return 0;
+
+err_setup:
+	mtk_pcie_power_down(port);
+
+	return err;
+}
+
+static int mtk_pcie_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct mtk_pcie_port *port;
+	struct pci_host_bridge *host;
+	int err;
+
+	host = devm_pci_alloc_host_bridge(dev, sizeof(*port));
+	if (!host)
+		return -ENOMEM;
+
+	port = pci_host_bridge_priv(host);
+
+	port->dev = dev;
+	platform_set_drvdata(pdev, port);
+
+	err = mtk_pcie_setup(port);
+	if (err)
+		return err;
+
+	host->ops = &mtk_pcie_ops;
+	host->sysdata = port;
+
+	err = pci_host_probe(host);
+	if (err) {
+		mtk_pcie_power_down(port);
+		return err;
+	}
+
+	return 0;
+}
+
+static int mtk_pcie_remove(struct platform_device *pdev)
+{
+	struct mtk_pcie_port *port = platform_get_drvdata(pdev);
+	struct pci_host_bridge *host = pci_host_bridge_from_priv(port);
+
+	pci_lock_rescan_remove();
+	pci_stop_root_bus(host->bus);
+	pci_remove_root_bus(host->bus);
+	pci_unlock_rescan_remove();
+
+	mtk_pcie_power_down(port);
+
+	return 0;
+}
+
+static const struct of_device_id mtk_pcie_of_match[] = {
+	{ .compatible = "mediatek,mt8192-pcie" },
+	{},
+};
+
+static struct platform_driver mtk_pcie_driver = {
+	.probe = mtk_pcie_probe,
+	.remove = mtk_pcie_remove,
+	.driver = {
+		.name = "mtk-pcie",
+		.of_match_table = mtk_pcie_of_match,
+	},
+};
+
+module_platform_driver(mtk_pcie_driver);
+MODULE_LICENSE("GPL v2");