Message ID | 20210815042525.36878-1-alyssa@rosenzweig.io |
---|---|
Headers | show |
Series | Add PCI driver for the Apple M1 | expand |
On Sun, 15 Aug 2021 05:25:25 +0100, Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote: > > Add a driver for the PCIe controller found in Apple system-on-chips, > particularly the Apple M1. This driver exposes the internal bus used for > the USB type-A ports, Ethernet, Wi-Fi, and Bluetooth. This patch brings > up the USB type-A ports and Ethernet. Bringing up the radios requires > interfacing with the System Management Coprocessor via Apple's > mailboxes, so that's left to a future patch. > > In this state, the driver consists of two major parts: hardware > initialization and MSI handling. The hardware initialization is derived > from Mark Kettenis's U-Boot patches which in turn is derived from > Corellium's patches for the hardware. The rest of the driver is derived > from Marc Zyngier's driver for the hardware. This really needs to be split into multiple patches: - PCI probing - Clock management - MSI implementation A couple of comments below: > > Co-developed-by: Stan Skowronek <stan@corellium.com> > Signed-off-by: Stan Skowronek <stan@corellium.com> > Co-developed-by: Marc Zyngier <maz@kernel.org> > Signed-off-by: Marc Zyngier <maz@kernel.org> > Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> > --- > MAINTAINERS | 1 + > drivers/pci/controller/Kconfig | 13 + > drivers/pci/controller/Makefile | 1 + > drivers/pci/controller/pcie-apple.c | 466 ++++++++++++++++++++++++++++ > 4 files changed, 481 insertions(+) > create mode 100644 drivers/pci/controller/pcie-apple.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index d7d2e1d1e2f2..f13f65a844f7 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -1274,6 +1274,7 @@ M: Alyssa Rosenzweig <alyssa@rosenzweig.io> > L: linux-pci@vger.kernel.org > S: Maintained > F: Documentation/devicetree/bindings/pci/apple,pcie.yaml > +F: drivers/pci/controller/pcie-apple.c > > APPLE SMC DRIVER > M: Henrik Rydberg <rydberg@bitmath.org> > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig > index 326f7d13024f..881a6a81f3e2 100644 > --- a/drivers/pci/controller/Kconfig > +++ b/drivers/pci/controller/Kconfig > @@ -312,6 +312,19 @@ config PCIE_HISI_ERR > Say Y here if you want error handling support > for the PCIe controller's errors on HiSilicon HIP SoCs > > +config PCIE_APPLE > + tristate "Apple PCIe controller" > + depends on ARCH_APPLE || COMPILE_TEST > + depends on OF > + depends on PCI_MSI_IRQ_DOMAIN > + depends on GPIOLIB > + help > + Say Y here if you want to enable PCIe controller support on Apple > + system-on-chips, like the Apple M1. This is required for the USB > + type-A ports, Ethernet, Wi-Fi, and Bluetooth. > + > + If unsure, say Y if you have an Apple Silicon system. > + > source "drivers/pci/controller/dwc/Kconfig" > source "drivers/pci/controller/mobiveil/Kconfig" > source "drivers/pci/controller/cadence/Kconfig" > diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile > index aaf30b3dcc14..f9d40bad932c 100644 > --- a/drivers/pci/controller/Makefile > +++ b/drivers/pci/controller/Makefile > @@ -37,6 +37,7 @@ obj-$(CONFIG_VMD) += vmd.o > obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o > obj-$(CONFIG_PCI_LOONGSON) += pci-loongson.o > obj-$(CONFIG_PCIE_HISI_ERR) += pcie-hisi-error.o > +obj-$(CONFIG_PCIE_APPLE) += pcie-apple.o > # pcie-hisi.o quirks are needed even without CONFIG_PCIE_DW > obj-y += dwc/ > obj-y += mobiveil/ > diff --git a/drivers/pci/controller/pcie-apple.c b/drivers/pci/controller/pcie-apple.c > new file mode 100644 > index 000000000000..08088e06460f > --- /dev/null > +++ b/drivers/pci/controller/pcie-apple.c > @@ -0,0 +1,466 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * PCIe host bridge driver for Apple system-on-chips. > + * > + * The HW is ECAM compliant, so once the controller is initialized, the driver > + * mostly only needs MSI handling. Initialization requires enabling power and > + * clocks, along with a number of register pokes. > + * > + * Copyright (C) 2021 Google LLC > + * Copyright (C) 2021 Corellium LLC > + * Copyright (C) 2021 Mark Kettenis <kettenis@openbsd.org> > + * Copyright (C) 2021 Alyssa Rosenzweig <alyssa@rosenzweig.io> > + * Author: Marc Zyngier <maz@kernel.org> > + */ > + > +#include <linux/kernel.h> > +#include <linux/irqdomain.h> > +#include <linux/module.h> > +#include <linux/msi.h> > +#include <linux/of_irq.h> > +#include <linux/pci-ecam.h> > +#include <linux/iopoll.h> > +#include <linux/gpio/consumer.h> > + > +#define CORE_RC_PHYIF_CTL 0x00024 > +#define CORE_RC_PHYIF_CTL_RUN BIT(0) > +#define CORE_RC_PHYIF_STAT 0x00028 > +#define CORE_RC_PHYIF_STAT_REFCLK BIT(4) > +#define CORE_RC_CTL 0x00050 > +#define CORE_RC_CTL_RUN BIT(0) > +#define CORE_RC_STAT 0x00058 > +#define CORE_RC_STAT_READY BIT(0) > +#define CORE_FABRIC_STAT 0x04000 > +#define CORE_FABRIC_STAT_MASK 0x001F001F > +#define CORE_PHY_CTL 0x80000 > +#define CORE_PHY_CTL_CLK0REQ BIT(0) > +#define CORE_PHY_CTL_CLK1REQ BIT(1) > +#define CORE_PHY_CTL_CLK0ACK BIT(2) > +#define CORE_PHY_CTL_CLK1ACK BIT(3) > +#define CORE_PHY_CTL_RESET BIT(7) > +#define CORE_LANE_CFG(port) (0x84000 + 0x4000 * (port)) > +#define CORE_LANE_CFG_REFCLK0REQ BIT(0) > +#define CORE_LANE_CFG_REFCLK1 BIT(1) > +#define CORE_LANE_CFG_REFCLK0ACK BIT(2) > +#define CORE_LANE_CFG_REFCLKEN (BIT(9) | BIT(10)) > +#define CORE_LANE_CTL(port) (0x84004 + 0x4000 * (port)) > +#define CORE_LANE_CTL_CFGACC BIT(15) > + > +#define PORT_LTSSMCTL 0x00080 > +#define PORT_LTSSMCTL_START BIT(0) > +#define PORT_INTSTAT 0x00100 > +#define PORT_INT_TUNNEL_ERR BIT(31) > +#define PORT_INT_CPL_TIMEOUT BIT(23) > +#define PORT_INT_RID2SID_MAPERR BIT(22) > +#define PORT_INT_CPL_ABORT BIT(21) > +#define PORT_INT_MSI_BAD_DATA BIT(19) > +#define PORT_INT_MSI_ERR BIT(18) > +#define PORT_INT_REQADDR_GT32 BIT(17) > +#define PORT_INT_AF_TIMEOUT BIT(15) > +#define PORT_INT_LINK_DOWN BIT(14) > +#define PORT_INT_LINK_UP BIT(12) > +#define PORT_INT_LINK_BWMGMT BIT(11) > +#define PORT_INT_AER_MASK (15 << 4) > +#define PORT_INT_PORT_ERR BIT(4) > +#define PORT_INT_INTx(i) BIT(i) > +#define PORT_INT_INTxALL 15 > +#define PORT_INTMSK 0x00104 > +#define PORT_INTMSKSET 0x00108 > +#define PORT_INTMSKCLR 0x0010c > +#define PORT_MSICFG 0x00124 > +#define PORT_MSICFG_EN BIT(0) > +#define PORT_MSICFG_L2MSINUM_SHIFT 4 > +#define PORT_MSIBASE 0x00128 > +#define PORT_MSIBASE_1_SHIFT 16 > +#define PORT_MSIADDR 0x00168 > +#define PORT_LINKSTS 0x00208 > +#define PORT_LINKSTS_UP BIT(0) > +#define PORT_LINKSTS_BUSY BIT(2) > +#define PORT_LINKCMDSTS 0x00210 > +#define PORT_OUTS_NPREQS 0x00284 > +#define PORT_OUTS_NPREQS_REQ BIT(24) > +#define PORT_OUTS_NPREQS_CPL BIT(16) > +#define PORT_RXWR_FIFO 0x00288 > +#define PORT_RXWR_FIFO_HDR GENMASK(15, 10) > +#define PORT_RXWR_FIFO_DATA GENMASK(9, 0) > +#define PORT_RXRD_FIFO 0x0028C > +#define PORT_RXRD_FIFO_REQ GENMASK(6, 0) > +#define PORT_OUTS_CPLS 0x00290 > +#define PORT_OUTS_CPLS_SHRD GENMASK(14, 8) > +#define PORT_OUTS_CPLS_WAIT GENMASK(6, 0) > +#define PORT_APPCLK 0x00800 > +#define PORT_APPCLK_EN BIT(0) > +#define PORT_APPCLK_CGDIS BIT(8) > +#define PORT_STATUS 0x00804 > +#define PORT_STATUS_READY BIT(0) > +#define PORT_REFCLK 0x00810 > +#define PORT_REFCLK_EN BIT(0) > +#define PORT_REFCLK_CGDIS BIT(8) > +#define PORT_PERST 0x00814 > +#define PORT_PERST_OFF BIT(0) > +#define PORT_RID2SID(i16) (0x00828 + 4 * (i16)) > +#define PORT_RID2SID_VALID BIT(31) > +#define PORT_RID2SID_SID_SHIFT 16 > +#define PORT_RID2SID_BUS_SHIFT 8 > +#define PORT_RID2SID_DEV_SHIFT 3 > +#define PORT_RID2SID_FUNC_SHIFT 0 > +#define PORT_OUTS_PREQS_HDR 0x00980 > +#define PORT_OUTS_PREQS_HDR_MASK GENMASK(9, 0) > +#define PORT_OUTS_PREQS_DATA 0x00984 > +#define PORT_OUTS_PREQS_DATA_MASK GENMASK(15, 0) > +#define PORT_TUNCTRL 0x00988 > +#define PORT_TUNCTRL_PERST_ON BIT(0) > +#define PORT_TUNCTRL_PERST_ACK_REQ BIT(1) > +#define PORT_TUNSTAT 0x0098c > +#define PORT_TUNSTAT_PERST_ON BIT(0) > +#define PORT_TUNSTAT_PERST_ACK_PEND BIT(1) > +#define PORT_PREFMEM_ENABLE 0x00994 > + > +/* The doorbell address is "well known" */ > +#define DOORBELL_ADDR 0xfffff000 > + > +/* The hardware exposes 3 ports. Port 0 (WiFi and Bluetooth) is special, as it > + * is power-gated by SMC to facilitate rfkill. > + */ > +enum apple_pcie_port { > + APPLE_PCIE_PORT_RADIO = 0, > + APPLE_PCIE_PORT_USB = 1, > + APPLE_PCIE_PORT_ETHERNET = 2, This really doesn't belong in a low-level PCIe controller driver. The ports should be purely generic. > + APPLE_PCIE_NUM_PORTS > +}; > + > +struct apple_pcie { > + u32 msi_base; > + u32 nvecs; > + struct mutex lock; > + struct device *dev; > + struct irq_domain *domain; > + unsigned long *bitmap; > + void __iomem *rc; > +}; > + > +static inline void rmwl(u32 clr, u32 set, void __iomem *addr) > +{ > + writel((readl(addr) & ~clr) | set, addr); Please use relaxed accessors. If the barriers are actually needed, please document what you are ordering against. This applies throughout the patch. This also begs the question: can this be called concurrently? > +} > + > +static void apple_msi_top_irq_mask(struct irq_data *d) > +{ > + pci_msi_mask_irq(d); > + irq_chip_mask_parent(d); > +} > + > +static void apple_msi_top_irq_unmask(struct irq_data *d) > +{ > + pci_msi_unmask_irq(d); > + irq_chip_unmask_parent(d); > +} > + > +static void apple_msi_top_irq_eoi(struct irq_data *d) > +{ > + irq_chip_eoi_parent(d); > +} Drop this and use the irq_chip_eoi_parent() directly in the irqchip. This was only here for debug. > + > +static struct irq_chip apple_msi_top_chip = { > + .name = "PCIe MSI", > + .irq_mask = apple_msi_top_irq_mask, > + .irq_unmask = apple_msi_top_irq_unmask, > + .irq_eoi = apple_msi_top_irq_eoi, > + .irq_set_affinity = irq_chip_set_affinity_parent, > + .irq_set_type = irq_chip_set_type_parent, > +}; > + > +static void apple_msi_compose_msg(struct irq_data *data, struct msi_msg *msg) > +{ > + msg->address_hi = 0; > + msg->address_lo = DOORBELL_ADDR; What was never clear is whether this doorbell address really is always at this location, or whether it is actually programmed by *something*. In any case, please use the lower_32bit/upper_32bit helpers, just in case we can move the address somewhere else. > + msg->data = data->hwirq; > +} > + > +static struct irq_chip apple_msi_bottom_chip = { > + .name = "MSI", > + .irq_mask = irq_chip_mask_parent, > + .irq_unmask = irq_chip_unmask_parent, > + .irq_set_affinity = irq_chip_set_affinity_parent, > + .irq_eoi = irq_chip_eoi_parent, > + .irq_set_affinity = irq_chip_set_affinity_parent, > + .irq_set_type = irq_chip_set_type_parent, > + .irq_compose_msi_msg = apple_msi_compose_msg, > +}; > + > +static int apple_msi_domain_alloc(struct irq_domain *domain, unsigned int virq, > + unsigned int nr_irqs, void *args) > +{ > + struct apple_pcie *pcie = domain->host_data; > + struct irq_fwspec fwspec; > + unsigned int i; > + int ret, hwirq; > + > + mutex_lock(&pcie->lock); > + > + hwirq = bitmap_find_free_region(pcie->bitmap, pcie->nvecs, > + order_base_2(nr_irqs)); > + > + mutex_unlock(&pcie->lock); > + > + if (hwirq < 0) > + return -ENOSPC; > + > + fwspec.fwnode = domain->parent->fwnode; > + fwspec.param_count = 3; > + fwspec.param[0] = 0; > + fwspec.param[1] = hwirq + pcie->msi_base; > + fwspec.param[2] = IRQ_TYPE_EDGE_RISING; > + > + ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, &fwspec); > + if (ret) > + return ret; > + > + for (i = 0; i < nr_irqs; i++) { > + irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i, > + &apple_msi_bottom_chip, > + domain->host_data); > + } > + > + return 0; > +} > + > +static void apple_msi_domain_free(struct irq_domain *domain, unsigned int virq, > + unsigned int nr_irqs) > +{ > + struct irq_data *d = irq_domain_get_irq_data(domain, virq); > + struct apple_pcie *pcie = domain->host_data; > + > + mutex_lock(&pcie->lock); > + > + bitmap_release_region(pcie->bitmap, d->hwirq, order_base_2(nr_irqs)); > + > + mutex_unlock(&pcie->lock); > +} > + > +static const struct irq_domain_ops apple_msi_domain_ops = { > + .alloc = apple_msi_domain_alloc, > + .free = apple_msi_domain_free, > +}; > + > +static struct msi_domain_info apple_msi_info = { > + .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS | > + MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX), > + .chip = &apple_msi_top_chip, > +}; > + > +static int apple_pcie_setup_refclk(void __iomem *rc, > + void __iomem *port, > + unsigned int idx) > +{ > + u32 stat; > + int res; > + > + res = readl_poll_timeout(rc + CORE_RC_PHYIF_STAT, stat, > + stat & CORE_RC_PHYIF_STAT_REFCLK, 100, 50000); > + if (res < 0) > + return res; > + > + rmwl(0, CORE_LANE_CTL_CFGACC, rc + CORE_LANE_CTL(idx)); > + rmwl(0, CORE_LANE_CFG_REFCLK0REQ, rc + CORE_LANE_CFG(idx)); > + > + res = readl_poll_timeout(rc + CORE_LANE_CFG(idx), stat, > + stat & CORE_LANE_CFG_REFCLK0ACK, 100, 50000); > + if (res < 0) > + return res; > + > + rmwl(0, CORE_LANE_CFG_REFCLK1, rc + CORE_LANE_CFG(idx)); > + res = readl_poll_timeout(rc + CORE_LANE_CFG(idx), stat, > + stat & CORE_LANE_CFG_REFCLK1, 100, 50000); > + > + if (res < 0) > + return res; > + > + rmwl(CORE_LANE_CTL_CFGACC, 0, rc + CORE_LANE_CTL(idx)); > + udelay(1); > + rmwl(0, CORE_LANE_CFG_REFCLKEN, rc + CORE_LANE_CFG(idx)); > + > + rmwl(0, PORT_REFCLK_EN, port + PORT_REFCLK); > + > + return 0; > +} > + > +static int apple_pcie_setup_port(struct apple_pcie *pcie, unsigned int i) > +{ > + struct fwnode_handle *fwnode = dev_fwnode(pcie->dev); > + void __iomem *port; > + struct gpio_desc *reset; > + uint32_t stat; > + int ret; > + > + port = devm_of_iomap(pcie->dev, to_of_node(fwnode), i + 3, NULL); > + > + if (IS_ERR(port)) > + return -ENODEV; > + > + reset = devm_gpiod_get_index(pcie->dev, "reset", i, 0); > + if (IS_ERR(reset)) > + return PTR_ERR(reset); > + > + gpiod_direction_output(reset, 0); > + > + rmwl(0, PORT_APPCLK_EN, port + PORT_APPCLK); > + > + ret = apple_pcie_setup_refclk(pcie->rc, port, i); > + if (ret < 0) > + return ret; > + > + rmwl(0, PORT_PERST_OFF, port + PORT_PERST); > + gpiod_set_value(reset, 1); > + > + ret = readl_poll_timeout(port + PORT_STATUS, stat, > + stat & PORT_STATUS_READY, 100, 250000); > + if (ret < 0) { > + dev_err(pcie->dev, "port %u ready wait timeout\n", i); > + return ret; > + } > + > + rmwl(PORT_REFCLK_CGDIS, 0, port + PORT_REFCLK); > + rmwl(PORT_APPCLK_CGDIS, 0, port + PORT_APPCLK); > + > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > + !(stat & PORT_LINKSTS_BUSY), 100, 250000); > + if (ret < 0) { > + dev_err(pcie->dev, "port %u link not busy timeout\n", i); > + return ret; > + } > + > + writel(0xfb512fff, port + PORT_INTMSKSET); Magic. What is this for? > + > + writel(PORT_INT_LINK_UP | PORT_INT_LINK_DOWN | PORT_INT_AF_TIMEOUT | > + PORT_INT_REQADDR_GT32 | PORT_INT_MSI_ERR | > + PORT_INT_MSI_BAD_DATA | PORT_INT_CPL_ABORT | > + PORT_INT_CPL_TIMEOUT | (1 << 26), port + PORT_INTSTAT); > + > + usleep_range(5000, 10000); > + > + rmwl(0, PORT_LTSSMCTL_START, port + PORT_LTSSMCTL); > + > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > + stat & PORT_LINKSTS_UP, 100, 500000); > + if (ret < 0) { > + dev_err(pcie->dev, "port %u link up wait timeout\n", i); > + return ret; > + } I have the strong feeling that a lot of things in the above is to get an interrupt when the port reports an event. Why the polling then? > + > + writel(DOORBELL_ADDR, port + PORT_MSIADDR); > + writel(0, port + PORT_MSIBASE); So here you go, the MSI doorbell *is* configurable. Should it be placed somewhere else? Shouldn't it be configured before the link is up? > + writel((5 << PORT_MSICFG_L2MSINUM_SHIFT) | PORT_MSICFG_EN, > + port + PORT_MSICFG); Ah, that one actually makes sense (enables 32 MSIs for the port). > + > + return 0; > +} > + > +static int apple_msi_init(struct apple_pcie *pcie) > +{ > + struct fwnode_handle *fwnode = dev_fwnode(pcie->dev); > + struct device_node *parent_intc; > + struct irq_domain *parent; > + int ret, i; > + > + pcie->rc = devm_of_iomap(pcie->dev, to_of_node(fwnode), 1, NULL); > + > + if (IS_ERR(pcie->rc)) > + return -ENODEV; > + > + for (i = 0; i < APPLE_PCIE_NUM_PORTS; ++i) { > + /* Bringing up the radios requires SMC. Skip for now. */ > + if (i == APPLE_PCIE_PORT_RADIO) > + continue; Why? Shouldn't this be moved into the driver for the endpoint, rather than hardcoding something here which is likely to change from one system to another? If establishing the link actually requires talking to another part of the system, then it should be described in DT. > + > + ret = apple_pcie_setup_port(pcie, i); > + > + if (ret) { > + dev_err(pcie->dev, "Port %u setup fail: %d\n", i, ret); > + return ret; > + } > + } > + > + ret = of_property_read_u32_index(to_of_node(fwnode), "msi-interrupts", > + 0, &pcie->msi_base); > + if (ret) > + return ret; > + > + ret = of_property_read_u32_index(to_of_node(fwnode), "msi-interrupts", > + 1, &pcie->nvecs); > + if (ret) > + return ret; > + > + pcie->bitmap = devm_kcalloc(pcie->dev, BITS_TO_LONGS(pcie->nvecs), > + sizeof(long), GFP_KERNEL); > + if (!pcie->bitmap) > + return -ENOMEM; > + > + parent_intc = of_irq_find_parent(to_of_node(fwnode)); > + parent = irq_find_host(parent_intc); > + if (!parent_intc || !parent) { > + dev_err(pcie->dev, "failed to find parent domain\n"); > + return -ENXIO; > + } > + > + parent = irq_domain_create_hierarchy(parent, 0, pcie->nvecs, fwnode, > + &apple_msi_domain_ops, pcie); > + if (!parent) { > + dev_err(pcie->dev, "failed to create IRQ domain\n"); > + return -ENOMEM; > + } > + irq_domain_update_bus_token(parent, DOMAIN_BUS_NEXUS); > + > + pcie->domain = pci_msi_create_irq_domain(fwnode, &apple_msi_info, > + parent); > + if (!pcie->domain) { > + dev_err(pcie->dev, "failed to create MSI domain\n"); > + irq_domain_remove(parent); > + return -ENOMEM; > + } > + > + return 0; > +} > + > +static int apple_m1_pci_init(struct pci_config_window *cfg) > +{ > + struct device *dev = cfg->parent; > + struct apple_pcie *pcie; > + > + pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL); > + if (!pcie) > + return -ENOMEM; > + > + pcie->dev = dev; > + > + mutex_init(&pcie->lock); > + > + return apple_msi_init(pcie); > +} > + > +static const struct pci_ecam_ops apple_m1_cfg_ecam_ops = { > + .init = apple_m1_pci_init, > + .pci_ops = { > + .map_bus = pci_ecam_map_bus, > + .read = pci_generic_config_read, > + .write = pci_generic_config_write, > + } > +}; > + > +static const struct of_device_id apple_pci_of_match[] = { > + { .compatible = "apple,pcie", .data = &apple_m1_cfg_ecam_ops }, > + { }, > +}; > +MODULE_DEVICE_TABLE(of, gen_pci_of_match); > + > +static struct platform_driver apple_pci_driver = { > + .driver = { > + .name = "pcie-apple", > + .of_match_table = apple_pci_of_match, > + }, > + .probe = pci_host_common_probe, > + .remove = pci_host_common_remove, > +}; > +module_platform_driver(apple_pci_driver); > + > +MODULE_LICENSE("GPL v2"); Thanks, M.
On Sat, Aug 14, 2021 at 11:34 PM Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote: > > Add a driver for the PCIe controller found in Apple system-on-chips, > particularly the Apple M1. This driver exposes the internal bus used for > the USB type-A ports, Ethernet, Wi-Fi, and Bluetooth. This patch brings > up the USB type-A ports and Ethernet. Bringing up the radios requires > interfacing with the System Management Coprocessor via Apple's > mailboxes, so that's left to a future patch. > > In this state, the driver consists of two major parts: hardware > initialization and MSI handling. The hardware initialization is derived > from Mark Kettenis's U-Boot patches which in turn is derived from > Corellium's patches for the hardware. The rest of the driver is derived > from Marc Zyngier's driver for the hardware. > > Co-developed-by: Stan Skowronek <stan@corellium.com> > Signed-off-by: Stan Skowronek <stan@corellium.com> > Co-developed-by: Marc Zyngier <maz@kernel.org> > Signed-off-by: Marc Zyngier <maz@kernel.org> > Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> > --- > MAINTAINERS | 1 + > drivers/pci/controller/Kconfig | 13 + > drivers/pci/controller/Makefile | 1 + > drivers/pci/controller/pcie-apple.c | 466 ++++++++++++++++++++++++++++ > 4 files changed, 481 insertions(+) > create mode 100644 drivers/pci/controller/pcie-apple.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index d7d2e1d1e2f2..f13f65a844f7 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -1274,6 +1274,7 @@ M: Alyssa Rosenzweig <alyssa@rosenzweig.io> > L: linux-pci@vger.kernel.org > S: Maintained > F: Documentation/devicetree/bindings/pci/apple,pcie.yaml > +F: drivers/pci/controller/pcie-apple.c > > APPLE SMC DRIVER > M: Henrik Rydberg <rydberg@bitmath.org> > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig > index 326f7d13024f..881a6a81f3e2 100644 > --- a/drivers/pci/controller/Kconfig > +++ b/drivers/pci/controller/Kconfig > @@ -312,6 +312,19 @@ config PCIE_HISI_ERR > Say Y here if you want error handling support > for the PCIe controller's errors on HiSilicon HIP SoCs > > +config PCIE_APPLE > + tristate "Apple PCIe controller" > + depends on ARCH_APPLE || COMPILE_TEST > + depends on OF > + depends on PCI_MSI_IRQ_DOMAIN > + depends on GPIOLIB > + help > + Say Y here if you want to enable PCIe controller support on Apple > + system-on-chips, like the Apple M1. This is required for the USB > + type-A ports, Ethernet, Wi-Fi, and Bluetooth. > + > + If unsure, say Y if you have an Apple Silicon system. > + > source "drivers/pci/controller/dwc/Kconfig" > source "drivers/pci/controller/mobiveil/Kconfig" > source "drivers/pci/controller/cadence/Kconfig" > diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile > index aaf30b3dcc14..f9d40bad932c 100644 > --- a/drivers/pci/controller/Makefile > +++ b/drivers/pci/controller/Makefile > @@ -37,6 +37,7 @@ obj-$(CONFIG_VMD) += vmd.o > obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o > obj-$(CONFIG_PCI_LOONGSON) += pci-loongson.o > obj-$(CONFIG_PCIE_HISI_ERR) += pcie-hisi-error.o > +obj-$(CONFIG_PCIE_APPLE) += pcie-apple.o > # pcie-hisi.o quirks are needed even without CONFIG_PCIE_DW > obj-y += dwc/ > obj-y += mobiveil/ > diff --git a/drivers/pci/controller/pcie-apple.c b/drivers/pci/controller/pcie-apple.c > new file mode 100644 > index 000000000000..08088e06460f > --- /dev/null > +++ b/drivers/pci/controller/pcie-apple.c > @@ -0,0 +1,466 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * PCIe host bridge driver for Apple system-on-chips. > + * > + * The HW is ECAM compliant, so once the controller is initialized, the driver > + * mostly only needs MSI handling. Initialization requires enabling power and > + * clocks, along with a number of register pokes. > + * > + * Copyright (C) 2021 Google LLC > + * Copyright (C) 2021 Corellium LLC > + * Copyright (C) 2021 Mark Kettenis <kettenis@openbsd.org> > + * Copyright (C) 2021 Alyssa Rosenzweig <alyssa@rosenzweig.io> > + * Author: Marc Zyngier <maz@kernel.org> > + */ > + > +#include <linux/kernel.h> > +#include <linux/irqdomain.h> > +#include <linux/module.h> > +#include <linux/msi.h> > +#include <linux/of_irq.h> > +#include <linux/pci-ecam.h> > +#include <linux/iopoll.h> > +#include <linux/gpio/consumer.h> > + > +#define CORE_RC_PHYIF_CTL 0x00024 > +#define CORE_RC_PHYIF_CTL_RUN BIT(0) > +#define CORE_RC_PHYIF_STAT 0x00028 > +#define CORE_RC_PHYIF_STAT_REFCLK BIT(4) > +#define CORE_RC_CTL 0x00050 > +#define CORE_RC_CTL_RUN BIT(0) > +#define CORE_RC_STAT 0x00058 > +#define CORE_RC_STAT_READY BIT(0) > +#define CORE_FABRIC_STAT 0x04000 > +#define CORE_FABRIC_STAT_MASK 0x001F001F > +#define CORE_PHY_CTL 0x80000 > +#define CORE_PHY_CTL_CLK0REQ BIT(0) > +#define CORE_PHY_CTL_CLK1REQ BIT(1) > +#define CORE_PHY_CTL_CLK0ACK BIT(2) > +#define CORE_PHY_CTL_CLK1ACK BIT(3) > +#define CORE_PHY_CTL_RESET BIT(7) I was going to say these should be a phy driver perhaps, but they are unused. So for now, just drop them. > +#define CORE_LANE_CFG(port) (0x84000 + 0x4000 * (port)) > +#define CORE_LANE_CFG_REFCLK0REQ BIT(0) > +#define CORE_LANE_CFG_REFCLK1 BIT(1) > +#define CORE_LANE_CFG_REFCLK0ACK BIT(2) > +#define CORE_LANE_CFG_REFCLKEN (BIT(9) | BIT(10)) > +#define CORE_LANE_CTL(port) (0x84004 + 0x4000 * (port)) > +#define CORE_LANE_CTL_CFGACC BIT(15) > + > +#define PORT_LTSSMCTL 0x00080 > +#define PORT_LTSSMCTL_START BIT(0) > +#define PORT_INTSTAT 0x00100 > +#define PORT_INT_TUNNEL_ERR BIT(31) > +#define PORT_INT_CPL_TIMEOUT BIT(23) > +#define PORT_INT_RID2SID_MAPERR BIT(22) > +#define PORT_INT_CPL_ABORT BIT(21) > +#define PORT_INT_MSI_BAD_DATA BIT(19) > +#define PORT_INT_MSI_ERR BIT(18) > +#define PORT_INT_REQADDR_GT32 BIT(17) > +#define PORT_INT_AF_TIMEOUT BIT(15) > +#define PORT_INT_LINK_DOWN BIT(14) > +#define PORT_INT_LINK_UP BIT(12) > +#define PORT_INT_LINK_BWMGMT BIT(11) > +#define PORT_INT_AER_MASK (15 << 4) > +#define PORT_INT_PORT_ERR BIT(4) > +#define PORT_INT_INTx(i) BIT(i) > +#define PORT_INT_INTxALL 15 > +#define PORT_INTMSK 0x00104 > +#define PORT_INTMSKSET 0x00108 > +#define PORT_INTMSKCLR 0x0010c > +#define PORT_MSICFG 0x00124 > +#define PORT_MSICFG_EN BIT(0) > +#define PORT_MSICFG_L2MSINUM_SHIFT 4 > +#define PORT_MSIBASE 0x00128 > +#define PORT_MSIBASE_1_SHIFT 16 > +#define PORT_MSIADDR 0x00168 > +#define PORT_LINKSTS 0x00208 > +#define PORT_LINKSTS_UP BIT(0) > +#define PORT_LINKSTS_BUSY BIT(2) > +#define PORT_LINKCMDSTS 0x00210 > +#define PORT_OUTS_NPREQS 0x00284 > +#define PORT_OUTS_NPREQS_REQ BIT(24) > +#define PORT_OUTS_NPREQS_CPL BIT(16) > +#define PORT_RXWR_FIFO 0x00288 > +#define PORT_RXWR_FIFO_HDR GENMASK(15, 10) > +#define PORT_RXWR_FIFO_DATA GENMASK(9, 0) > +#define PORT_RXRD_FIFO 0x0028C > +#define PORT_RXRD_FIFO_REQ GENMASK(6, 0) > +#define PORT_OUTS_CPLS 0x00290 > +#define PORT_OUTS_CPLS_SHRD GENMASK(14, 8) > +#define PORT_OUTS_CPLS_WAIT GENMASK(6, 0) > +#define PORT_APPCLK 0x00800 > +#define PORT_APPCLK_EN BIT(0) > +#define PORT_APPCLK_CGDIS BIT(8) > +#define PORT_STATUS 0x00804 > +#define PORT_STATUS_READY BIT(0) > +#define PORT_REFCLK 0x00810 > +#define PORT_REFCLK_EN BIT(0) > +#define PORT_REFCLK_CGDIS BIT(8) > +#define PORT_PERST 0x00814 > +#define PORT_PERST_OFF BIT(0) > +#define PORT_RID2SID(i16) (0x00828 + 4 * (i16)) > +#define PORT_RID2SID_VALID BIT(31) > +#define PORT_RID2SID_SID_SHIFT 16 > +#define PORT_RID2SID_BUS_SHIFT 8 > +#define PORT_RID2SID_DEV_SHIFT 3 > +#define PORT_RID2SID_FUNC_SHIFT 0 > +#define PORT_OUTS_PREQS_HDR 0x00980 > +#define PORT_OUTS_PREQS_HDR_MASK GENMASK(9, 0) > +#define PORT_OUTS_PREQS_DATA 0x00984 > +#define PORT_OUTS_PREQS_DATA_MASK GENMASK(15, 0) > +#define PORT_TUNCTRL 0x00988 > +#define PORT_TUNCTRL_PERST_ON BIT(0) > +#define PORT_TUNCTRL_PERST_ACK_REQ BIT(1) > +#define PORT_TUNSTAT 0x0098c > +#define PORT_TUNSTAT_PERST_ON BIT(0) > +#define PORT_TUNSTAT_PERST_ACK_PEND BIT(1) > +#define PORT_PREFMEM_ENABLE 0x00994 > + > +/* The doorbell address is "well known" */ > +#define DOORBELL_ADDR 0xfffff000 > + > +/* The hardware exposes 3 ports. Port 0 (WiFi and Bluetooth) is special, as it > + * is power-gated by SMC to facilitate rfkill. > + */ > +enum apple_pcie_port { > + APPLE_PCIE_PORT_RADIO = 0, > + APPLE_PCIE_PORT_USB = 1, > + APPLE_PCIE_PORT_ETHERNET = 2, > + APPLE_PCIE_NUM_PORTS > +}; > + > +struct apple_pcie { > + u32 msi_base; > + u32 nvecs; > + struct mutex lock; > + struct device *dev; > + struct irq_domain *domain; > + unsigned long *bitmap; > + void __iomem *rc; > +}; > + > +static inline void rmwl(u32 clr, u32 set, void __iomem *addr) > +{ > + writel((readl(addr) & ~clr) | set, addr); > +} > + > +static void apple_msi_top_irq_mask(struct irq_data *d) > +{ > + pci_msi_mask_irq(d); > + irq_chip_mask_parent(d); > +} > + > +static void apple_msi_top_irq_unmask(struct irq_data *d) > +{ > + pci_msi_unmask_irq(d); > + irq_chip_unmask_parent(d); > +} > + > +static void apple_msi_top_irq_eoi(struct irq_data *d) > +{ > + irq_chip_eoi_parent(d); > +} > + > +static struct irq_chip apple_msi_top_chip = { > + .name = "PCIe MSI", > + .irq_mask = apple_msi_top_irq_mask, > + .irq_unmask = apple_msi_top_irq_unmask, > + .irq_eoi = apple_msi_top_irq_eoi, > + .irq_set_affinity = irq_chip_set_affinity_parent, > + .irq_set_type = irq_chip_set_type_parent, > +}; > + > +static void apple_msi_compose_msg(struct irq_data *data, struct msi_msg *msg) > +{ > + msg->address_hi = 0; > + msg->address_lo = DOORBELL_ADDR; > + msg->data = data->hwirq; > +} > + > +static struct irq_chip apple_msi_bottom_chip = { > + .name = "MSI", > + .irq_mask = irq_chip_mask_parent, > + .irq_unmask = irq_chip_unmask_parent, > + .irq_set_affinity = irq_chip_set_affinity_parent, > + .irq_eoi = irq_chip_eoi_parent, > + .irq_set_affinity = irq_chip_set_affinity_parent, > + .irq_set_type = irq_chip_set_type_parent, > + .irq_compose_msi_msg = apple_msi_compose_msg, > +}; > + > +static int apple_msi_domain_alloc(struct irq_domain *domain, unsigned int virq, > + unsigned int nr_irqs, void *args) > +{ > + struct apple_pcie *pcie = domain->host_data; > + struct irq_fwspec fwspec; > + unsigned int i; > + int ret, hwirq; > + > + mutex_lock(&pcie->lock); > + > + hwirq = bitmap_find_free_region(pcie->bitmap, pcie->nvecs, > + order_base_2(nr_irqs)); > + > + mutex_unlock(&pcie->lock); > + > + if (hwirq < 0) > + return -ENOSPC; > + > + fwspec.fwnode = domain->parent->fwnode; > + fwspec.param_count = 3; > + fwspec.param[0] = 0; > + fwspec.param[1] = hwirq + pcie->msi_base; > + fwspec.param[2] = IRQ_TYPE_EDGE_RISING; > + > + ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, &fwspec); > + if (ret) > + return ret; > + > + for (i = 0; i < nr_irqs; i++) { > + irq_domain_set_hwirq_and_chip(domain, virq + i, hwirq + i, > + &apple_msi_bottom_chip, > + domain->host_data); > + } > + > + return 0; > +} > + > +static void apple_msi_domain_free(struct irq_domain *domain, unsigned int virq, > + unsigned int nr_irqs) > +{ > + struct irq_data *d = irq_domain_get_irq_data(domain, virq); > + struct apple_pcie *pcie = domain->host_data; > + > + mutex_lock(&pcie->lock); > + > + bitmap_release_region(pcie->bitmap, d->hwirq, order_base_2(nr_irqs)); > + > + mutex_unlock(&pcie->lock); > +} > + > +static const struct irq_domain_ops apple_msi_domain_ops = { > + .alloc = apple_msi_domain_alloc, > + .free = apple_msi_domain_free, > +}; > + > +static struct msi_domain_info apple_msi_info = { > + .flags = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS | > + MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX), > + .chip = &apple_msi_top_chip, > +}; > + > +static int apple_pcie_setup_refclk(void __iomem *rc, > + void __iomem *port, > + unsigned int idx) > +{ > + u32 stat; > + int res; > + > + res = readl_poll_timeout(rc + CORE_RC_PHYIF_STAT, stat, > + stat & CORE_RC_PHYIF_STAT_REFCLK, 100, 50000); > + if (res < 0) > + return res; > + > + rmwl(0, CORE_LANE_CTL_CFGACC, rc + CORE_LANE_CTL(idx)); > + rmwl(0, CORE_LANE_CFG_REFCLK0REQ, rc + CORE_LANE_CFG(idx)); > + > + res = readl_poll_timeout(rc + CORE_LANE_CFG(idx), stat, > + stat & CORE_LANE_CFG_REFCLK0ACK, 100, 50000); > + if (res < 0) > + return res; > + > + rmwl(0, CORE_LANE_CFG_REFCLK1, rc + CORE_LANE_CFG(idx)); > + res = readl_poll_timeout(rc + CORE_LANE_CFG(idx), stat, > + stat & CORE_LANE_CFG_REFCLK1, 100, 50000); > + > + if (res < 0) > + return res; > + > + rmwl(CORE_LANE_CTL_CFGACC, 0, rc + CORE_LANE_CTL(idx)); > + udelay(1); > + rmwl(0, CORE_LANE_CFG_REFCLKEN, rc + CORE_LANE_CFG(idx)); > + > + rmwl(0, PORT_REFCLK_EN, port + PORT_REFCLK); > + > + return 0; > +} > + > +static int apple_pcie_setup_port(struct apple_pcie *pcie, unsigned int i) > +{ > + struct fwnode_handle *fwnode = dev_fwnode(pcie->dev); Doesn't look like you ever use the fwnode, just get the DT node pointer. Unless this driver is going to use ACPI someday (and ACPI changes how PCI is done), there's no point in using fwnode. > + void __iomem *port; > + struct gpio_desc *reset; > + uint32_t stat; > + int ret; > + > + port = devm_of_iomap(pcie->dev, to_of_node(fwnode), i + 3, NULL); It's preferred to use platform resource api and ioremap over DT functions. > + > + if (IS_ERR(port)) > + return -ENODEV; > + > + reset = devm_gpiod_get_index(pcie->dev, "reset", i, 0); > + if (IS_ERR(reset)) > + return PTR_ERR(reset); > + > + gpiod_direction_output(reset, 0); > + > + rmwl(0, PORT_APPCLK_EN, port + PORT_APPCLK); > + > + ret = apple_pcie_setup_refclk(pcie->rc, port, i); > + if (ret < 0) > + return ret; > + > + rmwl(0, PORT_PERST_OFF, port + PORT_PERST); > + gpiod_set_value(reset, 1); > + > + ret = readl_poll_timeout(port + PORT_STATUS, stat, > + stat & PORT_STATUS_READY, 100, 250000); > + if (ret < 0) { > + dev_err(pcie->dev, "port %u ready wait timeout\n", i); > + return ret; > + } > + > + rmwl(PORT_REFCLK_CGDIS, 0, port + PORT_REFCLK); > + rmwl(PORT_APPCLK_CGDIS, 0, port + PORT_APPCLK); > + > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > + !(stat & PORT_LINKSTS_BUSY), 100, 250000); > + if (ret < 0) { > + dev_err(pcie->dev, "port %u link not busy timeout\n", i); > + return ret; > + } > + > + writel(0xfb512fff, port + PORT_INTMSKSET); > + > + writel(PORT_INT_LINK_UP | PORT_INT_LINK_DOWN | PORT_INT_AF_TIMEOUT | > + PORT_INT_REQADDR_GT32 | PORT_INT_MSI_ERR | > + PORT_INT_MSI_BAD_DATA | PORT_INT_CPL_ABORT | > + PORT_INT_CPL_TIMEOUT | (1 << 26), port + PORT_INTSTAT); > + > + usleep_range(5000, 10000); > + > + rmwl(0, PORT_LTSSMCTL_START, port + PORT_LTSSMCTL); > + > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > + stat & PORT_LINKSTS_UP, 100, 500000); > + if (ret < 0) { > + dev_err(pcie->dev, "port %u link up wait timeout\n", i); > + return ret; > + } > + > + writel(DOORBELL_ADDR, port + PORT_MSIADDR); > + writel(0, port + PORT_MSIBASE); > + writel((5 << PORT_MSICFG_L2MSINUM_SHIFT) | PORT_MSICFG_EN, > + port + PORT_MSICFG); > + > + return 0; > +} > + > +static int apple_msi_init(struct apple_pcie *pcie) > +{ > + struct fwnode_handle *fwnode = dev_fwnode(pcie->dev); > + struct device_node *parent_intc; > + struct irq_domain *parent; > + int ret, i; > + > + pcie->rc = devm_of_iomap(pcie->dev, to_of_node(fwnode), 1, NULL); Use devm_platform_ioremap_resource instead. Rob
Hi Rob, Thanks for the review. > > +#define CORE_RC_PHYIF_CTL 0x00024 > > +#define CORE_RC_PHYIF_CTL_RUN BIT(0) > > +#define CORE_RC_PHYIF_STAT 0x00028 > > +#define CORE_RC_PHYIF_STAT_REFCLK BIT(4) > > +#define CORE_RC_CTL 0x00050 > > +#define CORE_RC_CTL_RUN BIT(0) > > +#define CORE_RC_STAT 0x00058 > > +#define CORE_RC_STAT_READY BIT(0) > > +#define CORE_FABRIC_STAT 0x04000 > > +#define CORE_FABRIC_STAT_MASK 0x001F001F > > +#define CORE_PHY_CTL 0x80000 > > +#define CORE_PHY_CTL_CLK0REQ BIT(0) > > +#define CORE_PHY_CTL_CLK1REQ BIT(1) > > +#define CORE_PHY_CTL_CLK0ACK BIT(2) > > +#define CORE_PHY_CTL_CLK1ACK BIT(3) > > +#define CORE_PHY_CTL_RESET BIT(7) > > I was going to say these should be a phy driver perhaps, but they are > unused. So for now, just drop them. Removed in v2. CORE_PHY_CTRL is used in the asahi linux bootloader (m1n1, shared between linux+uboot+bsd) to do early pcie bringup. They are indeed not used here, nor are they used in the uboot/bsd drivers. > > +static int apple_pcie_setup_port(struct apple_pcie *pcie, unsigned int i) > > +{ > > + struct fwnode_handle *fwnode = dev_fwnode(pcie->dev); > > Doesn't look like you ever use the fwnode, just get the DT node > pointer. Unless this driver is going to use ACPI someday (and ACPI > changes how PCI is done), there's no point in using fwnode. Dropped in v2. That was a copypaste fail splitting off apple_pcie_setup_port from apple_msi_init in an early revision. > It's preferred to use platform resource api and ioremap over DT functions. > ... > Use devm_platform_ioremap_resource instead. Done in v2. Thanks, Alyssa
> > +/* The hardware exposes 3 ports. Port 0 (WiFi and Bluetooth) is > > special, as it > > + * is power-gated by SMC to facilitate rfkill. > > + */ > > +enum apple_pcie_port { > > + APPLE_PCIE_PORT_RADIO = 0, > > + APPLE_PCIE_PORT_USB = 1, > > + APPLE_PCIE_PORT_ETHERNET = 2, > > + APPLE_PCIE_NUM_PORTS > > +}; > > This will also be used for the Thunderbolt ports, where this enum > won't apply at all. I could also see Apple changing the individual port > assignments in the future. I'd just remove it here and have this file be > a generic PCIe driver for Apple SoCs. Removed in v2. > As above, I don't think it makes sense to special-case anything for the > devices on the bus here. These controllers also have hot plug support, > so the radios don't have to be on by the time it's initialized. > We could also just turn them on in the bootloader for now. This should be fixed in v2 with Mark's device tree bindings.
Hi Marc, Thank you for the review. > This really needs to be split into multiple patches: > > - PCI probing > - Clock management > - MSI implementation Split up in v2. > > +enum apple_pcie_port { > > + APPLE_PCIE_PORT_RADIO = 0, > > + APPLE_PCIE_PORT_USB = 1, > > + APPLE_PCIE_PORT_ETHERNET = 2, > > This really doesn't belong in a low-level PCIe controller driver. The > ports should be purely generic. Fixed in v2. > Please use relaxed accessors. If the barriers are actually needed, > please document what you are ordering against. This applies throughout > the patch. Relaxed accessors are used throughout in v2... it Works For Me™ but no guarantees I didn't introduce a race... > This also begs the question: can this be called concurrently? I'm not sure. Sven, any idea how Apple devices are usually structured here? > > +static void apple_msi_top_irq_eoi(struct irq_data *d) > > +{ > > + irq_chip_eoi_parent(d); > > +} > > Drop this and use the irq_chip_eoi_parent() directly in the > irqchip. This was only here for debug. Done in v2. > In any case, please use the lower_32bit/upper_32bit helpers, just in > case we can move the address somewhere else. Done in v2. > > + writel(0xfb512fff, port + PORT_INTMSKSET); > > Magic. What is this for? Sven's explanation is the most likely. This magic value comes from Corellium via Mark; I assume it's written by macOS. > > + writel(PORT_INT_LINK_UP | PORT_INT_LINK_DOWN | PORT_INT_AF_TIMEOUT | > > + PORT_INT_REQADDR_GT32 | PORT_INT_MSI_ERR | > > + PORT_INT_MSI_BAD_DATA | PORT_INT_CPL_ABORT | > > + PORT_INT_CPL_TIMEOUT | (1 << 26), port + PORT_INTSTAT); > > + > > + usleep_range(5000, 10000); > > + > > + rmwl(0, PORT_LTSSMCTL_START, port + PORT_LTSSMCTL); > > + > > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > > + stat & PORT_LINKSTS_UP, 100, 500000); > > + if (ret < 0) { > > + dev_err(pcie->dev, "port %u link up wait timeout\n", i); > > + return ret; > > + } > > I have the strong feeling that a lot of things in the above is to get > an interrupt when the port reports an event. Why the polling then? I reordered the code to have the configuration after this happen before the START command as suggested (this works), and then removed the poll entirely (this also works?). It's possible the poll here was just a debug leftover in the original code. It's possible it's needed in the original but not needed in the interrupt-driven common code (if the link doesn't come up yet, nothing happens, so we don't have to block on it ourselves..) It's also possible I've introduced a race that we happen to win every time. Without specs, it's exceedingly hard to know which it is... The poll isn't what we want at any rate, so I've removed the poll in v2. But we may want extra interrupt handling code for v3. > > + > > + writel(DOORBELL_ADDR, port + PORT_MSIADDR); > > + writel(0, port + PORT_MSIBASE); > > So here you go, the MSI doorbell *is* configurable. Should it be > placed somewhere else? Shouldn't it be configured before the link is > up? Fixed in v2. > > + /* Bringing up the radios requires SMC. Skip for now. */ > > + if (i == APPLE_PCIE_PORT_RADIO) > > + continue; > > Why? Shouldn't this be moved into the driver for the endpoint, rather > than hardcoding something here which is likely to change from one > system to another? If establishing the link actually requires talking > to another part of the system, then it should be described in DT. Fixed in v2. Thanks, Alyssa
> > > +static int apple_pcie_setup_refclk(void __iomem *rc, > > > + void __iomem *port, > > > + unsigned int idx) > > > +{ > > > + u32 stat; > > > + int res; > > > + > > > + res = readl_poll_timeout(rc + CORE_RC_PHYIF_STAT, stat, > > > + stat & CORE_RC_PHYIF_STAT_REFCLK, 100, 50000); > > > + if (res < 0) > > > + return res; > > > + > > > + rmwl(0, CORE_LANE_CTL_CFGACC, rc + CORE_LANE_CTL(idx)); > > > + rmwl(0, CORE_LANE_CFG_REFCLK0REQ, rc + CORE_LANE_CFG(idx)); > > > + > > > + res = readl_poll_timeout(rc + CORE_LANE_CFG(idx), stat, > > > + stat & CORE_LANE_CFG_REFCLK0ACK, 100, 50000); > > > + if (res < 0) > > > + return res; > > > + > > > + rmwl(0, CORE_LANE_CFG_REFCLK1, rc + CORE_LANE_CFG(idx)); > > > + res = readl_poll_timeout(rc + CORE_LANE_CFG(idx), stat, > > > + stat & CORE_LANE_CFG_REFCLK1, 100, 50000); > > > + > > > + if (res < 0) > > > + return res; > > > + > > > + rmwl(CORE_LANE_CTL_CFGACC, 0, rc + CORE_LANE_CTL(idx)); > > > + udelay(1); > > > + rmwl(0, CORE_LANE_CFG_REFCLKEN, rc + CORE_LANE_CFG(idx)); > > > + > > > + rmwl(0, PORT_REFCLK_EN, port + PORT_REFCLK); > > > + > > > + return 0; > > > +} > > This really wants to be moved to its own clock driver, unless there is > a strong guarantee that the clock tree isn't shared with anything > else. I expect that parts of that clock tree will need to be > refcounted, and that's where having a real clock driver will help. > > I also have the strong feeling that the clock hierarchy will need to > be described in DT one way or another, if only to be able to cope with > future revisions of this block. Could be, but this is the most magical part of the driver (no docs...) and that means it's prohibitively difficult to design useful DT bindings... For whatever it's worth the Corellium code doesn't expose any DT bindings here, so maybe this hasn't changed since older Apple SoCs. Orthogonal to this magic code, we /do/ need DT bindings for the clock/power gates. At the moment, m1n1 enables the PCIe clock and leaves it on, so it's not urgent for this series. But in the future when we want fine grained power management, we'll need it modeled. Sven has a WIP clock gate driver and proposed bindings, which can be added to the PCIe DT later nonintrusively. That shouldn't block this series, however.
Hi Andy, Thanks for the review. > + depends on GPIOLIB > > It doesn’t seem to provide a GPIO. I thought that was needed to consume GPIOs, but it looks like other PCI drivers don't do it. Removed. > + if (IS_ERR(port)) > + return -ENODEV; > + > + reset = devm_gpiod_get_index(pcie->dev, "reset", i, 0); > > Use appropriate flag. > > > + if (IS_ERR(reset)) > + return PTR_ERR(reset); > + > + gpiod_direction_output(reset, 0); > > Ditto and remove this line. Fixed in v2, thank you. > + usleep_range(5000, 10000); > > Sleep of such length should be explained. Removed in v2. > + > > Redundant blank line Presumably fixed in v2. > + pcie->bitmap = devm_kcalloc(pcie->dev, > BITS_TO_LONGS(pcie->nvecs), > + sizeof(long), GFP_KERNEL); > > devm_bitmap_zalloc() Done in v2. > +static const struct of_device_id apple_pci_of_match[] = { > + { .compatible = "apple,pcie", .data = &apple_m1_cfg_ecam_ops }, > > > > + { }, > > No comma for termination entry Fixed in v2. BR, Alyssa
On Sun, Aug 15, 2021, at 18:49, Marc Zyngier wrote: > On Sun, 15 Aug 2021 13:33:14 +0100, > "Sven Peter" <sven@svenpeter.dev> wrote: > > > > > > > > On Sun, Aug 15, 2021, at 09:42, Marc Zyngier wrote: > > > On Sun, 15 Aug 2021 05:25:25 +0100, > > > Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote: > > [...] > > > > + > > > > +static int apple_pcie_setup_port(struct apple_pcie *pcie, unsigned int i) > > > > +{ > > > > + struct fwnode_handle *fwnode = dev_fwnode(pcie->dev); > > > > + void __iomem *port; > > > > + struct gpio_desc *reset; > > > > + uint32_t stat; > > > > + int ret; > > > > + > > > > + port = devm_of_iomap(pcie->dev, to_of_node(fwnode), i + 3, NULL); > > > > + > > > > + if (IS_ERR(port)) > > > > + return -ENODEV; > > > > + > > > > + reset = devm_gpiod_get_index(pcie->dev, "reset", i, 0); > > > > + if (IS_ERR(reset)) > > > > + return PTR_ERR(reset); > > > > + > > > > + gpiod_direction_output(reset, 0); > > > > + > > > > + rmwl(0, PORT_APPCLK_EN, port + PORT_APPCLK); > > > > + > > > > + ret = apple_pcie_setup_refclk(pcie->rc, port, i); > > > > + if (ret < 0) > > > > + return ret; > > > > + > > > > + rmwl(0, PORT_PERST_OFF, port + PORT_PERST); > > > > + gpiod_set_value(reset, 1); > > > > + > > > > + ret = readl_poll_timeout(port + PORT_STATUS, stat, > > > > + stat & PORT_STATUS_READY, 100, 250000); > > > > + if (ret < 0) { > > > > + dev_err(pcie->dev, "port %u ready wait timeout\n", i); > > > > + return ret; > > > > + } > > > > + > > > > + rmwl(PORT_REFCLK_CGDIS, 0, port + PORT_REFCLK); > > > > + rmwl(PORT_APPCLK_CGDIS, 0, port + PORT_APPCLK); > > > > + > > > > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > > > > + !(stat & PORT_LINKSTS_BUSY), 100, 250000); > > > > + if (ret < 0) { > > > > + dev_err(pcie->dev, "port %u link not busy timeout\n", i); > > > > + return ret; > > > > + } > > > > + > > > > + writel(0xfb512fff, port + PORT_INTMSKSET); > > > > > > Magic. What is this for? > > > > The magic comes from the original Corellium driver. It first masks everything > > except for the interrupts in the next line, then acks the interrupts it keeps > > enabled and then probably wants to wait for PORT_INT_LINK_UP (or any of the > > other interrupts which seem to indicate various error conditions) to fire but > > instead polls for PORT_LINKSTS_UP. > > > > > > > > > + > > > > + writel(PORT_INT_LINK_UP | PORT_INT_LINK_DOWN | PORT_INT_AF_TIMEOUT | > > > > + PORT_INT_REQADDR_GT32 | PORT_INT_MSI_ERR | > > > > + PORT_INT_MSI_BAD_DATA | PORT_INT_CPL_ABORT | > > > > + PORT_INT_CPL_TIMEOUT | (1 << 26), port + PORT_INTSTAT); > > > > + > > > > + usleep_range(5000, 10000); > > > > + > > > > + rmwl(0, PORT_LTSSMCTL_START, port + PORT_LTSSMCTL); > > > > + > > > > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > > > > + stat & PORT_LINKSTS_UP, 100, 500000); > > > > + if (ret < 0) { > > > > + dev_err(pcie->dev, "port %u link up wait timeout\n", i); > > > > + return ret; > > > > + } > > > > > > I have the strong feeling that a lot of things in the above is to get > > > an interrupt when the port reports an event. Why the polling then? > > > > > > I'm pretty sure this is true. The same registers are also used to setup > > and handle legacy interrupts. > > > > My current understanding is that PORT_INTSTAT is used to retrieve the fired > > interrupts and ack them, and PORT_INTMSK are the masked interrupts. > > And then PORT_INTMSKSET and PORT_INTMSKCLR can be used to manipulate > > individual bits of PORT_INTMSK with a single store. > > So this really should be modelled as an interrupt controller. If > someone comes up with a bit of a spec (though the bit assignment is > relatively clear), I can update the interrupt code to handle > that. After all, I moan enough at people writing horrible PCI driver > code, I might as well write an example myself and point them to it. Thanks for the offer! Unfortunately, what I've written above is almost everything I know about this hardware. If you tell me what additional details you need to know I can see what I'm able to figure out though and write a small summary. Thanks, Sven
On Mon, 16 Aug 2021 02:31:40 +0100, Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote: > > Hi Marc, > > Thank you for the review. I wouldn't call that a review. Only a cursory look and a quick mention of what I found really odd. Without specs, this thing is impossible to properly review. [...] > > Please use relaxed accessors. If the barriers are actually needed, > > please document what you are ordering against. This applies throughout > > the patch. > > Relaxed accessors are used throughout in v2... it Works For Me™ but no > guarantees I didn't introduce a race... That's not exactly what I wanted to read... You really need to make an informed decision on the need of barriers. If the MMIO write needs to be ordered after a main memory write (i.e. a memory write that is consumed by the device you are subsequently writing to), you then need a barrier. If, as I suspect, the device isn't DMA capable and doesn't require ordering with the rest of the memory accesses, then no barriers are required. > > > This also begs the question: can this be called concurrently? > > I'm not sure. Sven, any idea how Apple devices are usually > structured here? Nothing here is Apple specific. If you can get two CPUs to issue a RMW on the same register, this piece of code is broken. This code has an undocumented assumption of being single threaded, and it is pretty unclear whether this assumption holds or not. [...] > > > + writel(0xfb512fff, port + PORT_INTMSKSET); > > > > Magic. What is this for? > > Sven's explanation is the most likely. This magic value comes from > Corellium via Mark; I assume it's written by macOS. I really wish there was no magic values whatsoever, and I've resisted posting the PCIe support patch myself for this very reason. Frankly, this stuff doesn't give me the warm feeling that we know what we're doing. > > > > + writel(PORT_INT_LINK_UP | PORT_INT_LINK_DOWN | PORT_INT_AF_TIMEOUT | > > > + PORT_INT_REQADDR_GT32 | PORT_INT_MSI_ERR | > > > + PORT_INT_MSI_BAD_DATA | PORT_INT_CPL_ABORT | > > > + PORT_INT_CPL_TIMEOUT | (1 << 26), port + PORT_INTSTAT); > > > + > > > + usleep_range(5000, 10000); > > > + > > > + rmwl(0, PORT_LTSSMCTL_START, port + PORT_LTSSMCTL); > > > + > > > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > > > + stat & PORT_LINKSTS_UP, 100, 500000); > > > + if (ret < 0) { > > > + dev_err(pcie->dev, "port %u link up wait timeout\n", i); > > > + return ret; > > > + } > > > > I have the strong feeling that a lot of things in the above is to get > > an interrupt when the port reports an event. Why the polling then? > > I reordered the code to have the configuration after this happen > before the START command as suggested (this works), and then removed > the poll entirely (this also works?). It's possible the poll here > was just a debug leftover in the original code. What happens if the core PCI code probes the ports without the link being up yet? > It's possible it's needed in the original but not needed in the > interrupt-driven common code (if the link doesn't come up yet, > nothing happens, so we don't have to block on it ourselves..) > > It's also possible I've introduced a race that we happen to win every time. > > Without specs, it's exceedingly hard to know which it is... Indeed, and I hate this "finger in the air" approach. Specially when you need to trust your data to it. > The poll isn't what we want at any rate, so I've removed the poll in > v2. But we may want extra interrupt handling code for v3. Indeed. I need to rework the MSI patch anyway after the discussion with Rob, and I'll see what I can do for the rest of the event stuff. Thanks, M. -- Without deviation from the norm, progress is not possible.
On Mon, Aug 16, 2021 at 11:57 PM Marc Zyngier <maz@kernel.org> wrote: > On Mon, 16 Aug 2021 02:31:40 +0100, Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote: > > > > Please use relaxed accessors. If the barriers are actually needed, > > > please document what you are ordering against. This applies throughout > > > the patch. > > > > Relaxed accessors are used throughout in v2... it Works For Me™ but no > > guarantees I didn't introduce a race... > > That's not exactly what I wanted to read... You really need to make an > informed decision on the need of barriers. If the MMIO write needs to > be ordered after a main memory write (i.e. a memory write that is > consumed by the device you are subsequently writing to), you then need > a barrier. If, as I suspect, the device isn't DMA capable and doesn't > require ordering with the rest of the memory accesses, then no > barriers are required. My normal rule is to always use the normal accessors, and only use any special variants if this is either required for correct operation (e.g. heavy barriers on arm32 may call code that must not recursively use heavy barriers) or that you have proven to /both/ be correct and relevant for performance. IOW, don't use the relaxed accessors just because it isn't wrong in your driver, other developers may copy the code into a driver that can't do it. Arnd
On Mon, Aug 16, 2021, at 23:56, Marc Zyngier wrote: > [...] > > > > > + writel(0xfb512fff, port + PORT_INTMSKSET); > > > > > > Magic. What is this for? > > > > Sven's explanation is the most likely. This magic value comes from > > Corellium via Mark; I assume it's written by macOS. > > I really wish there was no magic values whatsoever, and I've resisted > posting the PCIe support patch myself for this very reason. Frankly, > this stuff doesn't give me the warm feeling that we know what we're > doing. I'm pretty sure we can get rid of most magic since we have register names for almost everything we need and since m1n1 does the really obscure black magic involving the PHY layer and those tunables thanks to Mark. As I mentioned earlier, all bits missing in 0xfb512fff are those used in the writel one line below. This line only keeps a set of interrupts unmasked and the next one acks exactly this set (which isn't correct, but that's what this code does). There only unknown interrupt here is BIT(26) but this whole sequence is like a cargo cult anyway right now since nothing checks for these interrupts. > > > > > > > + writel(PORT_INT_LINK_UP | PORT_INT_LINK_DOWN | PORT_INT_AF_TIMEOUT | > > > > + PORT_INT_REQADDR_GT32 | PORT_INT_MSI_ERR | > > > > + PORT_INT_MSI_BAD_DATA | PORT_INT_CPL_ABORT | > > > > + PORT_INT_CPL_TIMEOUT | (1 << 26), port + PORT_INTSTAT); > > > > + > > > > + usleep_range(5000, 10000); > > > > + > > > > + rmwl(0, PORT_LTSSMCTL_START, port + PORT_LTSSMCTL); > > > > + > > > > + ret = readl_poll_timeout(port + PORT_LINKSTS, stat, > > > > + stat & PORT_LINKSTS_UP, 100, 500000); > > > > + if (ret < 0) { > > > > + dev_err(pcie->dev, "port %u link up wait timeout\n", i); > > > > + return ret; > > > > + } > > > > > > I have the strong feeling that a lot of things in the above is to get > > > an interrupt when the port reports an event. Why the polling then? > > > > I reordered the code to have the configuration after this happen > > before the START command as suggested (this works), and then removed > > the poll entirely (this also works?). It's possible the poll here > > was just a debug leftover in the original code. > > What happens if the core PCI code probes the ports without the link > being up yet? We pretty much rely on everything being slow enough that the ports aren't probed if there's no readl_poll_timeout and no waiting for the link up interrupt. Now it doesn't take long for the link to be up after the LTSSM has been started, but it's still a few cycles and this is not a good idea. This needs to wait for the interrupt. I don't know yet what the first usleep_range is used for, but I'm willing to bet it's either waiting for another interrupt to fire or sometimes the link doesn't come up the first time and you just have to try again and the usleep prevents that. (I'm less inclined to bet on this one, but: this might be required for the first port with the WiFi/bluetooth radios which will never come up unless power has been enabled by talking to another co-processor first. That usleep_range might be a hack so that this code always runs after power has been enabled.) That same LTSSM retry flow is used for Thunderbolt hotplugging fwiw: Wait for the NHI layer to notify you, start the link training, wait for the link up interrupt (or the link down or error interrupt and just try link training again a few times), rescan the port. > > > It's possible it's needed in the original but not needed in the > > interrupt-driven common code (if the link doesn't come up yet, > > nothing happens, so we don't have to block on it ourselves..) > > > > It's also possible I've introduced a race that we happen to win every time. > > > > Without specs, it's exceedingly hard to know which it is... > > Indeed, and I hate this "finger in the air" approach. Specially when > you need to trust your data to it. > > > The poll isn't what we want at any rate, so I've removed the poll in > > v2. But we may want extra interrupt handling code for v3. > > Indeed. I need to rework the MSI patch anyway after the discussion > with Rob, and I'll see what I can do for the rest of the event stuff. Again, thanks a lot for this! As I said in the other mail, if you need any specific information about this hardware just let me know. I won't be able to give you an accurate spec but I can try to figure out most details you need. Thanks, Sven
On Tue, 17 Aug 2021 08:34:35 +0100, Arnd Bergmann <arnd@kernel.org> wrote: > > On Mon, Aug 16, 2021 at 11:57 PM Marc Zyngier <maz@kernel.org> wrote: > > On Mon, 16 Aug 2021 02:31:40 +0100, Alyssa Rosenzweig <alyssa@rosenzweig.io> wrote: > > > > > > Please use relaxed accessors. If the barriers are actually needed, > > > > please document what you are ordering against. This applies throughout > > > > the patch. > > > > > > Relaxed accessors are used throughout in v2... it Works For Me™ but no > > > guarantees I didn't introduce a race... > > > > That's not exactly what I wanted to read... You really need to make an > > informed decision on the need of barriers. If the MMIO write needs to > > be ordered after a main memory write (i.e. a memory write that is > > consumed by the device you are subsequently writing to), you then need > > a barrier. If, as I suspect, the device isn't DMA capable and doesn't > > require ordering with the rest of the memory accesses, then no > > barriers are required. > > My normal rule is to always use the normal accessors, and only use > any special variants if this is either required for correct operation > (e.g. heavy barriers on arm32 may call code that must not recursively > use heavy barriers) or that you have proven to /both/ be correct and > relevant for performance. IOW, don't use the relaxed accessors just > because it isn't wrong in your driver, other developers may copy the > code into a driver that can't do it. And that exactly the reason why I think we should *not* use heavy accessors if they are not required. I have little sympathy for blindly copied code, and spreading unnecessary barriers means that we cannot further reason about the actual ordering requirements. In other words, blanket use of heavy MMIO accessors to guarantee memory ordering is not dissimilar to reintroducing the BKL because we don't want people to worry about concurrency. M. -- Without deviation from the norm, progress is not possible.
On 15/08/2021 21.33, Sven Peter wrote: > The magic comes from the original Corellium driver. It first masks everything > except for the interrupts in the next line, then acks the interrupts it keeps > enabled and then probably wants to wait for PORT_INT_LINK_UP (or any of the > other interrupts which seem to indicate various error conditions) to fire but > instead polls for PORT_LINKSTS_UP. Let's not take any magic numbers from their drivers (or what macOS does, for that matter) without making an attempt to understand what they do, unless it becomes clear it's incomprehensible. This has already bit us in the past (the SError disable thing). -- Hector Martin (marcan@marcan.st) Public Key: https://mrcn.st/pub
> From: Hector Martin <marcan@marcan.st> > Date: Wed, 18 Aug 2021 20:43:48 +0900 > > On 15/08/2021 21.33, Sven Peter wrote: > > The magic comes from the original Corellium driver. It first masks > > everything except for the interrupts in the next line, then acks > > the interrupts it keeps enabled and then probably wants to wait > > for PORT_INT_LINK_UP (or any of the other interrupts which seem to > > indicate various error conditions) to fire but instead polls for > > PORT_LINKSTS_UP. > > Let's not take any magic numbers from their drivers (or what macOS does, > for that matter) without making an attempt to understand what they do, > unless it becomes clear it's incomprehensible. This has already bit us > in the past (the SError disable thing). The driver should really only unmask the interrupts it handles in its interrupt handler. We should know the meaning of those bits so using the appropriate symbolic names shouldn't be too difficult. Didn't delve into this yet since U-Boot doesn't do interrupts (so I don't touch the port interrupt registers there) and on OpenBSD I only implemented MSIs for now as all the integrated devices support MSIs just fine. I'll need to revisit this at some point to support the Thunderbolt ports.