Message ID | 20240314-pci-epf-rework-v1-5-6134e6c1d491@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | [01/11] PCI: qcom-ep: Disable resources unconditionally during PERST# assert | expand |
On Thu, Mar 14, 2024 at 08:53:44PM +0530, Manivannan Sadhasivam wrote: > To maintain uniformity across EPF drivers, let's move the DMA > initialization to EPC init callback. This will also allow us to deinit DMA > during PERST# assert in the further commits. > > For EPC drivers without PERST#, DMA deinit will only happen during driver > unbind. > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> > --- Reviewed-by: Niklas Cassel <cassel@kernel.org> For the record, I was debugging a problem related to EPF DMA recently and was dumping the DMA mask for the struct device of the epf driver. I was a bit confused to see it as 32-bits, even though the EPC driver has it set to 64-bits. The current code works, because e.g., pci_epf_test_write(), etc, does: struct device *dma_dev = epf->epc->dev.parent; dma_map_single(dma_dev, ...); but it also means that all EPF drivers will do this uglyness. However, if a EPF driver does e.g. dma_alloc_coherent(), and sends in the struct *device for the EPF, which is the most logical thing to do IMO, it will use the wrong DMA mask. Perhaps EPF or EPC code should make sure that the struct *device for the EPF will get the same DMA mask as epf->epc->dev.parent, so that EPF driver developer can use the struct *epf when calling e.g. dma_alloc_coherent(). (This is however not strictly related to this patch, but I thought that I should mention it.) Kind regards, Niklas > drivers/pci/endpoint/functions/pci-epf-mhi.c | 16 ++++++++-------- > drivers/pci/endpoint/functions/pci-epf-test.c | 12 ++++++------ > 2 files changed, 14 insertions(+), 14 deletions(-) > > diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c > index da894a9a447e..4e4300efd9d7 100644 > --- a/drivers/pci/endpoint/functions/pci-epf-mhi.c > +++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c > @@ -737,6 +737,14 @@ static int pci_epf_mhi_epc_init(struct pci_epf *epf) > if (!epf_mhi->epc_features) > return -ENODATA; > > + if (info->flags & MHI_EPF_USE_DMA) { > + ret = pci_epf_mhi_dma_init(epf_mhi); > + if (ret) { > + dev_err(dev, "Failed to initialize DMA: %d\n", ret); > + return ret; > + } > + } > + > return 0; > } > > @@ -749,14 +757,6 @@ static int pci_epf_mhi_link_up(struct pci_epf *epf) > struct device *dev = &epf->dev; > int ret; > > - if (info->flags & MHI_EPF_USE_DMA) { > - ret = pci_epf_mhi_dma_init(epf_mhi); > - if (ret) { > - dev_err(dev, "Failed to initialize DMA: %d\n", ret); > - return ret; > - } > - } > - > mhi_cntrl->mmio = epf_mhi->mmio; > mhi_cntrl->irq = epf_mhi->irq; > mhi_cntrl->mru = info->mru; > diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c > index 2fac36553633..8f1e0cb08814 100644 > --- a/drivers/pci/endpoint/functions/pci-epf-test.c > +++ b/drivers/pci/endpoint/functions/pci-epf-test.c > @@ -753,6 +753,12 @@ static int pci_epf_test_epc_init(struct pci_epf *epf) > bool msi_capable = true; > int ret; > > + epf_test->dma_supported = true; > + > + ret = pci_epf_test_init_dma_chan(epf_test); > + if (ret) > + epf_test->dma_supported = false; > + > epc_features = pci_epc_get_features(epc, epf->func_no, epf->vfunc_no); > if (epc_features) { > msix_capable = epc_features->msix_capable; > @@ -942,12 +948,6 @@ static int pci_epf_test_bind(struct pci_epf *epf) > if (ret) > return ret; > > - epf_test->dma_supported = true; > - > - ret = pci_epf_test_init_dma_chan(epf_test); > - if (ret) > - epf_test->dma_supported = false; > - > return 0; > } > > > -- > 2.25.1 >
On Fri, Mar 22, 2024 at 05:10:06PM +0100, Niklas Cassel wrote: > On Thu, Mar 14, 2024 at 08:53:44PM +0530, Manivannan Sadhasivam wrote: > > To maintain uniformity across EPF drivers, let's move the DMA > > initialization to EPC init callback. This will also allow us to deinit DMA > > during PERST# assert in the further commits. > > > > For EPC drivers without PERST#, DMA deinit will only happen during driver > > unbind. > > > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> > > --- > > Reviewed-by: Niklas Cassel <cassel@kernel.org> > > > For the record, I was debugging a problem related to EPF DMA recently > and was dumping the DMA mask for the struct device of the epf driver. > I was a bit confused to see it as 32-bits, even though the EPC driver > has it set to 64-bits. > > The current code works, because e.g., pci_epf_test_write(), etc, > does: > struct device *dma_dev = epf->epc->dev.parent; > dma_map_single(dma_dev, ...); > > but it also means that all EPF drivers will do this uglyness. > This ugliness is required as long as the dmaengine is associated only with the EPC. > > > However, if a EPF driver does e.g. > dma_alloc_coherent(), and sends in the struct *device for the EPF, > which is the most logical thing to do IMO, it will use the wrong DMA > mask. > > Perhaps EPF or EPC code should make sure that the struct *device > for the EPF will get the same DMA mask as epf->epc->dev.parent, > so that EPF driver developer can use the struct *epf when calling > e.g. dma_alloc_coherent(). > Makes sense. I think it can be done during bind() in the EPC core. Feel free to submit a patch if you like, otherwise I'll keep it in my todo list. - Mani
On Tue, Mar 26, 2024 at 12:05:42PM +0100, Niklas Cassel wrote: > On Tue, Mar 26, 2024 at 01:56:36PM +0530, Manivannan Sadhasivam wrote: > > On Fri, Mar 22, 2024 at 05:10:06PM +0100, Niklas Cassel wrote: > > > On Thu, Mar 14, 2024 at 08:53:44PM +0530, Manivannan Sadhasivam wrote: > > > > To maintain uniformity across EPF drivers, let's move the DMA > > > > initialization to EPC init callback. This will also allow us to deinit DMA > > > > during PERST# assert in the further commits. > > > > > > > > For EPC drivers without PERST#, DMA deinit will only happen during driver > > > > unbind. > > > > > > > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> > > > > --- > > > > > > Reviewed-by: Niklas Cassel <cassel@kernel.org> > > > > > > > > > For the record, I was debugging a problem related to EPF DMA recently > > > and was dumping the DMA mask for the struct device of the epf driver. > > > I was a bit confused to see it as 32-bits, even though the EPC driver > > > has it set to 64-bits. > > > > > > The current code works, because e.g., pci_epf_test_write(), etc, > > > does: > > > struct device *dma_dev = epf->epc->dev.parent; > > > dma_map_single(dma_dev, ...); > > > > > > but it also means that all EPF drivers will do this uglyness. > > > > > > > This ugliness is required as long as the dmaengine is associated only with the > > EPC. > > > > > > > > > > > However, if a EPF driver does e.g. > > > dma_alloc_coherent(), and sends in the struct *device for the EPF, > > > which is the most logical thing to do IMO, it will use the wrong DMA > > > mask. > > > > > > Perhaps EPF or EPC code should make sure that the struct *device > > > for the EPF will get the same DMA mask as epf->epc->dev.parent, > > > so that EPF driver developer can use the struct *epf when calling > > > e.g. dma_alloc_coherent(). > > > > > > > Makes sense. I think it can be done during bind() in the EPC core. Feel free to > > submit a patch if you like, otherwise I'll keep it in my todo list. > > So we still want to test: > -DMA API using the eDMA > -DMA API using the "dummy" memcpy dma-channel. > > However, it seems like both pci-epf-mhi.c and pci-epf-test.c > do either: > -Use DMA API > or > -Use memcpy_fromio()/memcpy_toio() instead of DMA API > > > To me, it seems like we should always be able to use > DMA API (using either a eDMA or "dummy" memcpy). > > I don't really see the need to have the path that does: > memcpy_fromio()/memcpy_toio(). > > I know that for DWC, when using memcpy (and this also > memcpy via DMA API), we need to map the address using > iATU first. > > But that could probably be done using another flag, > perhaps rename that flag FLAG_USE_DMA to NEEDS_MAP or > something. > (Such that we can change these drivers to only have a > code path that uses DMA API.) Looking at pci-epf-mhi.c, it seems to use names like: pci_epf_mhi_iatu_read() and pci_epf_mhi_edma_read(). This seems to be a very DWC focused naming. AFAICT, EPF drivers should work on any PCIe EP controller that implements the EPC API. Yes, I understand that it is only Qualcomm that uses this MHI interface/bus, but what is stopping Qualcomm from using a non-DWC based PCIe EP controller in an upcoming SoC? Surely that Qualcomm SoC could still implement the MHI interface/bus, so perhaps the naming in this EPF driver should use somewhat less "EPC vendor specific" function names? Kind regards, Niklas
On Tue, Mar 26, 2024 at 03:27:27PM +0100, Niklas Cassel wrote: > On Tue, Mar 26, 2024 at 12:05:42PM +0100, Niklas Cassel wrote: > > On Tue, Mar 26, 2024 at 01:56:36PM +0530, Manivannan Sadhasivam wrote: > > > On Fri, Mar 22, 2024 at 05:10:06PM +0100, Niklas Cassel wrote: > > > > On Thu, Mar 14, 2024 at 08:53:44PM +0530, Manivannan Sadhasivam wrote: > > > > > To maintain uniformity across EPF drivers, let's move the DMA > > > > > initialization to EPC init callback. This will also allow us to deinit DMA > > > > > during PERST# assert in the further commits. > > > > > > > > > > For EPC drivers without PERST#, DMA deinit will only happen during driver > > > > > unbind. > > > > > > > > > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> > > > > > --- > > > > > > > > Reviewed-by: Niklas Cassel <cassel@kernel.org> > > > > > > > > > > > > For the record, I was debugging a problem related to EPF DMA recently > > > > and was dumping the DMA mask for the struct device of the epf driver. > > > > I was a bit confused to see it as 32-bits, even though the EPC driver > > > > has it set to 64-bits. > > > > > > > > The current code works, because e.g., pci_epf_test_write(), etc, > > > > does: > > > > struct device *dma_dev = epf->epc->dev.parent; > > > > dma_map_single(dma_dev, ...); > > > > > > > > but it also means that all EPF drivers will do this uglyness. > > > > > > > > > > This ugliness is required as long as the dmaengine is associated only with the > > > EPC. > > > > > > > > > > > > > > > However, if a EPF driver does e.g. > > > > dma_alloc_coherent(), and sends in the struct *device for the EPF, > > > > which is the most logical thing to do IMO, it will use the wrong DMA > > > > mask. > > > > > > > > Perhaps EPF or EPC code should make sure that the struct *device > > > > for the EPF will get the same DMA mask as epf->epc->dev.parent, > > > > so that EPF driver developer can use the struct *epf when calling > > > > e.g. dma_alloc_coherent(). > > > > > > > > > > Makes sense. I think it can be done during bind() in the EPC core. Feel free to > > > submit a patch if you like, otherwise I'll keep it in my todo list. > > > > So we still want to test: > > -DMA API using the eDMA > > -DMA API using the "dummy" memcpy dma-channel. > > > > However, it seems like both pci-epf-mhi.c and pci-epf-test.c > > do either: > > -Use DMA API > > or > > -Use memcpy_fromio()/memcpy_toio() instead of DMA API > > > > > > To me, it seems like we should always be able to use > > DMA API (using either a eDMA or "dummy" memcpy). > > > > I don't really see the need to have the path that does: > > memcpy_fromio()/memcpy_toio(). > > > > I know that for DWC, when using memcpy (and this also > > memcpy via DMA API), we need to map the address using > > iATU first. > > > > But that could probably be done using another flag, > > perhaps rename that flag FLAG_USE_DMA to NEEDS_MAP or > > something. > > (Such that we can change these drivers to only have a > > code path that uses DMA API.) > > Looking at pci-epf-mhi.c, it seems to use names like: > pci_epf_mhi_iatu_read() and pci_epf_mhi_edma_read(). > > This seems to be a very DWC focused naming. > > AFAICT, EPF drivers should work on any PCIe EP controller that implements > the EPC API. > > Yes, I understand that it is only Qualcomm that uses this MHI interface/bus, > but what is stopping Qualcomm from using a non-DWC based PCIe EP controller > in an upcoming SoC? > > Surely that Qualcomm SoC could still implement the MHI interface/bus, > so perhaps the naming in this EPF driver should use somewhat less > "EPC vendor specific" function names? > Yeah, agree. This needs to be cleaned up. - Mani
+CC Vinod Hello Vinod, I didn't know the answer, so I chose the "call a friend option" ;) I hope that you can help me out :) If you take a look at drivers/pci/endpoint/functions/pci-epf-test.c https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L448-L471 You can see that the driver always does pci_epc_map_addr(), then it will either use: DMA API, e.g. dma_map_single() etc. or memcpy_fromio()/memcpy_toio() based on flag FLAG_USE_DMA. This flag is set via ioctl, so if we run: /usr/bin/pcitest -d the flag will be set, without the -d parameter the flag won't be set. If you look at how the DMA channel is requested: https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L224-L258 If will try to get a private DMA channel, if that fails, it will use the "dummy memcpy" DMA channel. If the FLAG_USE_DMA is set, the transfers itself will use: https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L139-L155 either dmaengine_prep_slave_single() or dmaengine_prep_dma_memcpy(), depending on if we are using "dummy memcpy" or not. If you take e.g. the DWC PCIe EP controller, it can have an embedded DMA controller on the PCIe controller, and we will try to detect it when initializing the PCIe EP controller using dw_pcie_edma_detect(): https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/controller/dwc/pcie-designware-ep.c#L759 For the PCIe EP controller that I am using, which have eDMA built-in, I noticed that if I do not enable the eDMA driver (# CONFIG_DW_EDMA is not set), I noticed that I can still run: /usr/bin/pcitest -d Which will use the "dummy memcpy" DMA channel. Yes, the performance is poor, but it still works, so it appears that the fallback code is working properly. If I enable the eDMA driver (CONFIG_DW_EDMA=y), I can run: /usr/bin/pcitest -d And the performance is good. So my question is: Is the "dummy memcpy" DMA channel always available? Because if it is, I think we could drop the path in the pci-epf-test.c driver which uses memcpy_fromio()/memcpy_toio() instead of DMA API. (Since just having a single path to do I/O in the driver would simplify the driver IMO.) I assume that the "dummy memcpy" DMA channel just uses memcpy_fromio() and memcpy_toio() under the hood, so I assume that using the memcpy_fromio()/ memcpy_toio/() is equivalent to using DMA API + dmaengine_prep_dma_memcpy(). Although it would be nice if we didn't need to have the two separate paths in pci_epf_test_data_transfer() (dmaengine_prep_slave_single() vs dmaengine_prep_dma_memcpy()) to support the "dummy memcpy" channel. But I guess that is not possible... I hope that you can bring some clarity Vinod. (Please read my replies to Mani below before you compose your email, as it does provide more insight to this mess.) Mani, I tried to reply to you inline below, with my limited understanding of how dmaengine works. On Wed, Mar 27, 2024 at 11:48:19AM +0530, Manivannan Sadhasivam wrote: > > So we still want to test: > > -DMA API using the eDMA > > -DMA API using the "dummy" memcpy dma-channel. > > > > IMO, the test driver should just test one form of data transfer. Either CPU > memcpy (using iATU or something similar) or DMA. But I think the motive behind > using DMA memcpy is that to support platforms that do not pass DMA slave > channels in devicetree. > > It is applicable to test driver but not to MHI driver since all DMA supported > MHI platforms will pass the DMA slave channels in devicetree. I don't understand how device tree is relevant here, e.g. qcom-ep.c specifies pcie_ep->pci.edma.nr_irqs = 1; https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/controller/dwc/pcie-qcom-ep.c#L818 which is sufficient for you to be able to probe/detect eDMA successfully, no need for anything in device tree at all. > > > However, it seems like both pci-epf-mhi.c and pci-epf-test.c > > do either: > > -Use DMA API > > or > > -Use memcpy_fromio()/memcpy_toio() instead of DMA API > > > > > > To me, it seems like we should always be able to use > > DMA API (using either a eDMA or "dummy" memcpy). > > > > No, there are platforms that don't support DMA at all. Like Qcom SDX55, so we > still need to do CPU memcpy. I assume that you mean the the PCIe controller used in SDX55 does not have the eDMA on the PCIe controller, so dw_pcie_edma_detect() will fail to detect any eDMA. That is fine no? I assume that this SoC will still able to use the "dummy" memcpy dma-channel? > > > I don't really see the need to have the path that does: > > memcpy_fromio()/memcpy_toio(). > > > > I know that for DWC, when using memcpy (and this also > > memcpy via DMA API), we need to map the address using > > iATU first. > > > > But that could probably be done using another flag, > > perhaps rename that flag FLAG_USE_DMA to NEEDS_MAP or > > something. > > (Such that we can change these drivers to only have a > > code path that uses DMA API.) > > (...and making sure that inheriting the DMA mask does > > not affect the DMA mask for DMA_MEMCPY.) I was wrong here, pci-epf-test always calls pci_epc_map_addr() regardless if FLAG_USE_DMA is set or not. (Even though this should be unnecessary when using the eDMA.) However, if we look at pci-epf-mhi.c we can see that it does NOT call pci_epc_map_addr() when using DMA API + dmaengine. Is it really safe to avoid pci_epc_map_addr() in all EPC controllers? I assume that it should be safe for all "real" DMA channels. We can see that it is not safe when using DMA API + "dummy" memcpy dma-channel. (That is why I was asking if we need a NEEDS_MAP, or MAP_NOT_NEEDED flag.) > > > > But perhaps I am missing something... and DMA_MEMCPY is > > not always available? Right now pci-epf-test driver has three ways: -DMA API + dmaengine dmaengine_prep_slave_single() -DMA API + dmaengine dmaengine_prep_dma_memcpy() -memcpy_toio()/memcpy_fromio(). pci-epf-mhi.c driver has two ways: -DMA API + dmaengine dmaengine_prep_slave_single() -memcpy_toio()/memcpy_fromio(). pci-epf-test.c: -Always calls pci_epc_map_addr() when using DMA API. pci-epf-mhi.c: -Never calls pci_epc_map_addr() when using DMA API. I honestly don't see any point of having three paths for pci-epf-test. Ideally I would want one, max two. If you think that: -DMA API + dmaengine dmaengine_prep_slave_single() + -memcpy_toio()/memcpy_fromio(). is more logical than: -DMA API + dmaengine dmaengine_prep_slave_single() + -DMA API + dmaengine dmaengine_prep_dma_memcpy() Then I think we should rip out the: -DMA API + dmaengine dmaengine_prep_dma_memcpy() it serves no purpose... if you don't have a "real" DMA channel, just run without the -d flag. Or, if you argue that the dmaengine_prep_dma_memcpy() is there to test the DMA API code (which I can't say that it does, since it doesn't use the exact same code path as a "real" DMA channel, see: https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L139-L155 so this argument is questionable). Put it under a --use_dummy_dma, and return failure by default if no "real" DMA channel is found. But even so, that would not address the pci-epf-test and pci-mhi-test inconsistency WRT pci_epc_map_addr(). I think if we rip out: -DMA API + dmaengine dmaengine_prep_dma_memcpy() we could also move the pci_epc_map_addr() so that it is only used for the memcpy_toio()/memcpy_fromio() path. (Or if we add a --use_dummy_dma, we can move the pci_epc_map_addr() to that path, and remove it from the dmaengine_prep_slave_single() path.) Kind regards, Niklas
On 27-03-24, 12:39, Niklas Cassel wrote: > +CC Vinod > > Hello Vinod, > > I didn't know the answer, so I chose the "call a friend option" ;) > I hope that you can help me out :) Anytime :-) > > > If you take a look at drivers/pci/endpoint/functions/pci-epf-test.c > https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L448-L471 > > You can see that the driver always does pci_epc_map_addr(), > then it will either use: > DMA API, e.g. dma_map_single() etc. > or > memcpy_fromio()/memcpy_toio() > > based on flag FLAG_USE_DMA. > > This flag is set via ioctl, so if we run: > /usr/bin/pcitest -d > the flag will be set, without the -d parameter the flag won't be set. > > > If you look at how the DMA channel is requested: > https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L224-L258 > > If will try to get a private DMA channel, if that fails, > it will use the "dummy memcpy" DMA channel. > > If the FLAG_USE_DMA is set, the transfers itself will use: > https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L139-L155 > either dmaengine_prep_slave_single() or dmaengine_prep_dma_memcpy(), > depending on if we are using "dummy memcpy" or not. > > > > If you take e.g. the DWC PCIe EP controller, it can have an embedded DMA > controller on the PCIe controller, and we will try to detect it when > initializing the PCIe EP controller using dw_pcie_edma_detect(): > https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/controller/dwc/pcie-designware-ep.c#L759 > > For the PCIe EP controller that I am using, which have eDMA built-in, > I noticed that if I do not enable the eDMA driver (# CONFIG_DW_EDMA is not > set), I noticed that I can still run: > /usr/bin/pcitest -d > > Which will use the "dummy memcpy" DMA channel. > Yes, the performance is poor, but it still works, so it appears that the > fallback code is working properly. > > > If I enable the eDMA driver (CONFIG_DW_EDMA=y), > I can run: > /usr/bin/pcitest -d > > And the performance is good. > > > So my question is: > Is the "dummy memcpy" DMA channel always available? That depends on the system, you may or maynot have such a system where you have a generic memcpy dma controller which can provide you with these channels > > Because if it is, I think we could drop the path in the pci-epf-test.c > driver which uses memcpy_fromio()/memcpy_toio() instead of DMA API. > (Since just having a single path to do I/O in the driver would simplify > the driver IMO.) > > I assume that the "dummy memcpy" DMA channel just uses memcpy_fromio() and > memcpy_toio() under the hood, so I assume that using the memcpy_fromio()/ > memcpy_toio/() is equivalent to using DMA API + dmaengine_prep_dma_memcpy(). > > Although it would be nice if we didn't need to have the two separate paths > in pci_epf_test_data_transfer() (dmaengine_prep_slave_single() vs > dmaengine_prep_dma_memcpy()) to support the "dummy memcpy" channel. > But I guess that is not possible... Based on my reading you might have this mechanism: - eDMA provides dmaengine_prep_slave_single() which transfers data from mem to pci ep device, so fasted - dmaengine_prep_dma_memcpy: This will copy the data but treat it as memory. I dont pci internals to figure out how both can work... So cant really make out why it is slowed - memcpy_xxx that is IO mem functions, so ofc they will be slowest I think the code is decent from fallback pov... chooses fastest path if available on a system > > > I hope that you can bring some clarity Vinod. > (Please read my replies to Mani below before you compose your email, > as it does provide more insight to this mess.) > > Mani, I tried to reply to you inline below, with my limited understanding > of how dmaengine works. > > > On Wed, Mar 27, 2024 at 11:48:19AM +0530, Manivannan Sadhasivam wrote: > > > So we still want to test: > > > -DMA API using the eDMA > > > -DMA API using the "dummy" memcpy dma-channel. > > > > > > > IMO, the test driver should just test one form of data transfer. Either CPU > > memcpy (using iATU or something similar) or DMA. But I think the motive behind > > using DMA memcpy is that to support platforms that do not pass DMA slave > > channels in devicetree. > > > > It is applicable to test driver but not to MHI driver since all DMA supported > > MHI platforms will pass the DMA slave channels in devicetree. > > I don't understand how device tree is relevant here, e.g. qcom-ep.c > specifies pcie_ep->pci.edma.nr_irqs = 1; > https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/controller/dwc/pcie-qcom-ep.c#L818 > which is sufficient for you to be able to probe/detect eDMA successfully, > no need for anything in device tree at all. > > > > > > > However, it seems like both pci-epf-mhi.c and pci-epf-test.c > > > do either: > > > -Use DMA API > > > or > > > -Use memcpy_fromio()/memcpy_toio() instead of DMA API > > > > > > > > > To me, it seems like we should always be able to use > > > DMA API (using either a eDMA or "dummy" memcpy). > > > > > > > No, there are platforms that don't support DMA at all. Like Qcom SDX55, so we > > still need to do CPU memcpy. > > I assume that you mean the the PCIe controller used in SDX55 does not > have the eDMA on the PCIe controller, so dw_pcie_edma_detect() will > fail to detect any eDMA. That is fine no? > > I assume that this SoC will still able to use the "dummy" memcpy dma-channel? > > > > > > > I don't really see the need to have the path that does: > > > memcpy_fromio()/memcpy_toio(). > > > > > > I know that for DWC, when using memcpy (and this also > > > memcpy via DMA API), we need to map the address using > > > iATU first. > > > > > > But that could probably be done using another flag, > > > perhaps rename that flag FLAG_USE_DMA to NEEDS_MAP or > > > something. > > > (Such that we can change these drivers to only have a > > > code path that uses DMA API.) > > > (...and making sure that inheriting the DMA mask does > > > not affect the DMA mask for DMA_MEMCPY.) > > I was wrong here, pci-epf-test always calls pci_epc_map_addr() > regardless if FLAG_USE_DMA is set or not. > > (Even though this should be unnecessary when using the eDMA.) > > However, if we look at pci-epf-mhi.c we can see that it does > NOT call pci_epc_map_addr() when using DMA API + dmaengine. > > Is it really safe to avoid pci_epc_map_addr() in all EPC controllers? > I assume that it should be safe for all "real" DMA channels. > We can see that it is not safe when using DMA API + "dummy" memcpy > dma-channel. (That is why I was asking if we need a NEEDS_MAP, or > MAP_NOT_NEEDED flag.) > > > > > > > > But perhaps I am missing something... and DMA_MEMCPY is > > > not always available? > > Right now pci-epf-test driver has three ways: > -DMA API + dmaengine dmaengine_prep_slave_single() > -DMA API + dmaengine dmaengine_prep_dma_memcpy() > -memcpy_toio()/memcpy_fromio(). > > pci-epf-mhi.c driver has two ways: > -DMA API + dmaengine dmaengine_prep_slave_single() > -memcpy_toio()/memcpy_fromio(). > > > pci-epf-test.c: > -Always calls pci_epc_map_addr() when using DMA API. > > pci-epf-mhi.c: > -Never calls pci_epc_map_addr() when using DMA API. > > > I honestly don't see any point of having three paths > for pci-epf-test. Ideally I would want one, max two. > > If you think that: > -DMA API + dmaengine dmaengine_prep_slave_single() > + > -memcpy_toio()/memcpy_fromio(). > > is more logical than: > -DMA API + dmaengine dmaengine_prep_slave_single() > + > -DMA API + dmaengine dmaengine_prep_dma_memcpy() > > Then I think we should rip out the: > -DMA API + dmaengine dmaengine_prep_dma_memcpy() > it serves no purpose... if you don't have a "real" DMA channel, > just run without the -d flag. > > Or, if you argue that the dmaengine_prep_dma_memcpy() is there > to test the DMA API code (which I can't say that it does, since > it doesn't use the exact same code path as a "real" DMA channel, see: > https://github.com/torvalds/linux/blob/v6.9-rc1/drivers/pci/endpoint/functions/pci-epf-test.c#L139-L155 > so this argument is questionable). > > Put it under a --use_dummy_dma, and return failure by default > if no "real" DMA channel is found. > > > But even so, that would not address the pci-epf-test and > pci-mhi-test inconsistency WRT pci_epc_map_addr(). > > I think if we rip out: > -DMA API + dmaengine dmaengine_prep_dma_memcpy() > we could also move the pci_epc_map_addr() so that it is > only used for the memcpy_toio()/memcpy_fromio() path. > > (Or if we add a --use_dummy_dma, we can move the pci_epc_map_addr() to > that path, and remove it from the dmaengine_prep_slave_single() path.) > > > Kind regards, > Niklas
On Fri, Mar 29, 2024 at 12:12:48AM +0530, Vinod Koul wrote: > On 27-03-24, 12:39, Niklas Cassel wrote: > > > > So my question is: > > Is the "dummy memcpy" DMA channel always available? > > That depends on the system, you may or maynot have such a system where > you have a generic memcpy dma controller which can provide you with > these channels I misunderstood DMA_MEMCPY then, I assumed that it was a "software emulated" DMA channel, which allowed the a driver to always use dmaengine + DMA API. It actually uses a real DMA controller. I don't have any DMA controller in the PCIe EP device tree node, but perhaps it can use any DMA controller that has been registered with dmaengine? > Based on my reading you might have this mechanism: > - eDMA provides dmaengine_prep_slave_single() which transfers data from > mem to pci ep device, so fasted > - dmaengine_prep_dma_memcpy: This will copy the data but treat it as > memory. I dont pci internals to figure out how both can work... So > cant really make out why it is slowed > - memcpy_xxx that is IO mem functions, so ofc they will be slowest > > I think the code is decent from fallback pov... chooses fastest path if > available on a system Indeed, it makes more sense to me now, thank you Vinod. > > I was wrong here, pci-epf-test always calls pci_epc_map_addr() > > regardless if FLAG_USE_DMA is set or not. > > > > (Even though this should be unnecessary when using the eDMA.) > > > > However, if we look at pci-epf-mhi.c we can see that it does > > NOT call pci_epc_map_addr() when using DMA API + dmaengine. > > > > Is it really safe to avoid pci_epc_map_addr() in all EPC controllers? > > I assume that it should be safe for all "real" DMA channels. > > We can see that it is not safe when using DMA API + "dummy" memcpy > > dma-channel. (That is why I was asking if we need a NEEDS_MAP, or > > MAP_NOT_NEEDED flag.) > > pci-epf-test.c: > > -Always calls pci_epc_map_addr() when using DMA API. > > > > pci-epf-mhi.c: > > -Never calls pci_epc_map_addr() when using DMA API. Mani, I still think that this part is inconsistent between PCI EPF drivers. Looking more at commit: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities") Adding Frank on CC, since he is the author of that commit. When the commit added support for eDMA to pci-epf-test, it added an extra parameter to pci_epf_test_data_transfer(), to pass the PCI/DMA address of the remote buffer, in addition to the already provided local physical address that pci_epc_map_addr() has mapped the PCI/DMA address to. So in the case of eDMA transfer, the pci_epc_map_addr() operation is still being performed, even though pci-epf-test never uses the result of the the mapping operation... This is just confusing and a waste of CPU cycles. What I would like is more consistency between the EPF drivers. I guess an if-statement that skips the pci_epc_map_addr() in pci-epf-test if using eDMA would make pci-epf-mhi and pci-epf-test most consistent. However, when reading the DWC databook: -The eDMA and HDMA always goes via the iATU table. If you do not want this, then you need to set the the appropriate bypass bit. For eDMA: "" When you do not want the iATU to translate outbound requests that are generated by the internal DMA module, then you must implement one of the following approaches: - Ensure that the combination of DMA channel programming and iATU control register programming, causes no translation of DMA traffic to be done in the iATU. or - Activate the DMA bypass mode to allow request TLPs which are initiated by the DMA controller to pass through the iATU untranslated. You can activate DMA bypass mode by setting the DMA Bypass field of the iATU Control 2 Register (IATU_REGION_C- TRL_OFF_2_OUTBOUND_0). "" For HDMA: "" When you do not want the iATU to translate outbound requests that are generated by the internal HDMA module, then you must implement one of the following approaches: - Ensure that the combination of HDMA channel programming and iATU control register programming, causes no translation of DMA traffic to be done in the iATU. or - Activate the HDMA bypass mode to allow request TLPs which are initiated by the HDMA controller to pass through the iATU untranslated. You can activate HDMA bypass mode by setting the HDMA Bypass field of the iATU Control 2 Register (IATU_REGION_C- TRL_OFF_2_OUTBOUND_0). "" We also know that, if there is no match in the iATU table: "" The default behavior of the ATU when there is no address match in the outbound direction or no TLP attribute match in the inbound direction, is to pass the transaction through. "" So even if we do not call pci_epc_map_addr(), the eDMA and HDMA will go via the iATU table, it will most likely not find a match, so it will go through untranslated. So I think we need to answer these questions: 1) Do we want to rely on the fact that hopefully none of the iATUs in the DWC controller has configured a mapping that might mess things up for us? I don't see why the PCI/DMA address of the remote buffer, supplied to pci-epf-test via test_reg BAR, might not fall within the physical iATU window on the local EP system. (As long as the PCI EPF driver has mapped any address using pci_epc_map_addr().) This is a big argument that EPF drivers running on a DWC-based EPC should definitely NOT call pci_epc_map_addr() needlessly when using eDMA, as it can be catastrophic. (pci-epf-test needs to be patched.) 2) Can we really assume that both pci-epf-test and pci-epf-mhi does not need to call pci_epc_map_addr() when using a DMA_SLAVE DMA controller? This seems to be designed only with DWC in mind. Other PCIe endpoint controllers might require this. (Yes, for DWC-based controllers, this definitely should be skipped, but EPF drivers are supposed to be independent from a specific EPC.) I'm fine with just avoiding the pci_epc_map_addr() call when using DMA_SLAVE DMA in pci-epf-test for now, as that is the only DMA controller that I'm familiar with. This second question was more a question for how EPF drivers are should be designed now and in the future. Kind regards, Niklas
diff --git a/drivers/pci/endpoint/functions/pci-epf-mhi.c b/drivers/pci/endpoint/functions/pci-epf-mhi.c index da894a9a447e..4e4300efd9d7 100644 --- a/drivers/pci/endpoint/functions/pci-epf-mhi.c +++ b/drivers/pci/endpoint/functions/pci-epf-mhi.c @@ -737,6 +737,14 @@ static int pci_epf_mhi_epc_init(struct pci_epf *epf) if (!epf_mhi->epc_features) return -ENODATA; + if (info->flags & MHI_EPF_USE_DMA) { + ret = pci_epf_mhi_dma_init(epf_mhi); + if (ret) { + dev_err(dev, "Failed to initialize DMA: %d\n", ret); + return ret; + } + } + return 0; } @@ -749,14 +757,6 @@ static int pci_epf_mhi_link_up(struct pci_epf *epf) struct device *dev = &epf->dev; int ret; - if (info->flags & MHI_EPF_USE_DMA) { - ret = pci_epf_mhi_dma_init(epf_mhi); - if (ret) { - dev_err(dev, "Failed to initialize DMA: %d\n", ret); - return ret; - } - } - mhi_cntrl->mmio = epf_mhi->mmio; mhi_cntrl->irq = epf_mhi->irq; mhi_cntrl->mru = info->mru; diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c index 2fac36553633..8f1e0cb08814 100644 --- a/drivers/pci/endpoint/functions/pci-epf-test.c +++ b/drivers/pci/endpoint/functions/pci-epf-test.c @@ -753,6 +753,12 @@ static int pci_epf_test_epc_init(struct pci_epf *epf) bool msi_capable = true; int ret; + epf_test->dma_supported = true; + + ret = pci_epf_test_init_dma_chan(epf_test); + if (ret) + epf_test->dma_supported = false; + epc_features = pci_epc_get_features(epc, epf->func_no, epf->vfunc_no); if (epc_features) { msix_capable = epc_features->msix_capable; @@ -942,12 +948,6 @@ static int pci_epf_test_bind(struct pci_epf *epf) if (ret) return ret; - epf_test->dma_supported = true; - - ret = pci_epf_test_init_dma_chan(epf_test); - if (ret) - epf_test->dma_supported = false; - return 0; }
To maintain uniformity across EPF drivers, let's move the DMA initialization to EPC init callback. This will also allow us to deinit DMA during PERST# assert in the further commits. For EPC drivers without PERST#, DMA deinit will only happen during driver unbind. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> --- drivers/pci/endpoint/functions/pci-epf-mhi.c | 16 ++++++++-------- drivers/pci/endpoint/functions/pci-epf-test.c | 12 ++++++------ 2 files changed, 14 insertions(+), 14 deletions(-)