Message ID | 20241002-u-boot-dwc3-gadget-dcache-fixup-v3-2-5398088ef93c@linaro.org |
---|---|
State | New |
Headers | show |
Series | dwc3: gadget: properly fix cache operations | expand |
On 10/2/24 4:39 PM, Neil Armstrong wrote: > The current flush operation will omit doing a flush/invalidate on > the first and last bytes if the base address and size are not aligned > with DMA_MINALIGN. > > This causes operation failures Qualcomm platforms. > > Take in account the alignment and size of the buffer and also > flush the previous and last cacheline. > > Remove CACHELINE_SIZE which was the same as DMA_MINALIGN. It isn't the same, CACHELINE_SIZE was set to CONFIG_SYS_CACHELINE_SIZE (CPU L1 cache cacheline length) while ARCH_DMA_MINALIGN is DMA engine alignment requirement (from times where there used to be one DMA engine on most devices). You likely want a max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment-requirement) to really correctly align the buffer. > Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com> > Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> > --- > drivers/usb/dwc3/io.h | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/usb/dwc3/io.h b/drivers/usb/dwc3/io.h > index 04791d4c9be..a6c2bb0f47d 100644 > --- a/drivers/usb/dwc3/io.h > +++ b/drivers/usb/dwc3/io.h > @@ -20,7 +20,6 @@ > #include <cpu_func.h> > #include <asm/io.h> > > -#define CACHELINE_SIZE CONFIG_SYS_CACHELINE_SIZE > static inline u32 dwc3_readl(void __iomem *base, u32 offset) > { > unsigned long offs = offset - DWC3_GLOBALS_REGS_START; > @@ -50,6 +49,9 @@ static inline void dwc3_writel(void __iomem *base, u32 offset, u32 value) > > static inline void dwc3_flush_cache(uintptr_t addr, int length) > { > - flush_dcache_range(addr, addr + ROUND(length, CACHELINE_SIZE)); > + uintptr_t start_addr = (uintptr_t)addr & ~(ARCH_DMA_MINALIGN - 1); > + uintptr_t end_addr = ALIGN((uintptr_t)addr + length, ARCH_DMA_MINALIGN); > + > + flush_dcache_range((unsigned long)start_addr, (unsigned long)end_addr); > }
On 02/10/2024 16:55, Marek Vasut wrote: > On 10/2/24 4:39 PM, Neil Armstrong wrote: >> The current flush operation will omit doing a flush/invalidate on >> the first and last bytes if the base address and size are not aligned >> with DMA_MINALIGN. >> >> This causes operation failures Qualcomm platforms. >> >> Take in account the alignment and size of the buffer and also >> flush the previous and last cacheline. >> >> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN. > > It isn't the same, CACHELINE_SIZE was set to CONFIG_SYS_CACHELINE_SIZE (CPU L1 cache cacheline length) while ARCH_DMA_MINALIGN is DMA engine alignment requirement (from times where there used to be one DMA engine on most devices). You likely want a max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment-requirement) to really correctly align the buffer. It is definitely true for platforms declaring dma_alloc_coherent() (arm, riscv, x86) except nios2 but there's 0 chance dwc3 appears on a nios2 platform. Neil > >> Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com> >> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> >> --- >> drivers/usb/dwc3/io.h | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/usb/dwc3/io.h b/drivers/usb/dwc3/io.h >> index 04791d4c9be..a6c2bb0f47d 100644 >> --- a/drivers/usb/dwc3/io.h >> +++ b/drivers/usb/dwc3/io.h >> @@ -20,7 +20,6 @@ >> #include <cpu_func.h> >> #include <asm/io.h> >> -#define CACHELINE_SIZE CONFIG_SYS_CACHELINE_SIZE >> static inline u32 dwc3_readl(void __iomem *base, u32 offset) >> { >> unsigned long offs = offset - DWC3_GLOBALS_REGS_START; >> @@ -50,6 +49,9 @@ static inline void dwc3_writel(void __iomem *base, u32 offset, u32 value) >> static inline void dwc3_flush_cache(uintptr_t addr, int length) >> { >> - flush_dcache_range(addr, addr + ROUND(length, CACHELINE_SIZE)); >> + uintptr_t start_addr = (uintptr_t)addr & ~(ARCH_DMA_MINALIGN - 1); >> + uintptr_t end_addr = ALIGN((uintptr_t)addr + length, ARCH_DMA_MINALIGN); >> + >> + flush_dcache_range((unsigned long)start_addr, (unsigned long)end_addr); >> } >
On 10/3/24 2:49 PM, Neil Armstrong wrote: > On 02/10/2024 16:55, Marek Vasut wrote: >> On 10/2/24 4:39 PM, Neil Armstrong wrote: >>> The current flush operation will omit doing a flush/invalidate on >>> the first and last bytes if the base address and size are not aligned >>> with DMA_MINALIGN. >>> >>> This causes operation failures Qualcomm platforms. >>> >>> Take in account the alignment and size of the buffer and also >>> flush the previous and last cacheline. >>> >>> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN. >> >> It isn't the same, CACHELINE_SIZE was set to CONFIG_SYS_CACHELINE_SIZE >> (CPU L1 cache cacheline length) while ARCH_DMA_MINALIGN is DMA engine >> alignment requirement (from times where there used to be one DMA >> engine on most devices). You likely want a >> max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment-requirement) to >> really correctly align the buffer. > > It is definitely true for platforms declaring dma_alloc_coherent() (arm, > riscv, x86) > except nios2 but there's 0 chance dwc3 appears on a nios2 platform. There is real chance of that, because on modern SoCFPGA platforms (Agilex) you can have the FPGA content access the SoC peripherals, and one of the SoC peripherals is DWC3 controller. If anyone would actually synthesize it is another question ... but it is an FPGA, so that option exists.
On 03/10/2024 15:19, Marek Vasut wrote: > On 10/3/24 2:49 PM, Neil Armstrong wrote: >> On 02/10/2024 16:55, Marek Vasut wrote: >>> On 10/2/24 4:39 PM, Neil Armstrong wrote: >>>> The current flush operation will omit doing a flush/invalidate on >>>> the first and last bytes if the base address and size are not aligned >>>> with DMA_MINALIGN. >>>> >>>> This causes operation failures Qualcomm platforms. >>>> >>>> Take in account the alignment and size of the buffer and also >>>> flush the previous and last cacheline. >>>> >>>> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN. >>> >>> It isn't the same, CACHELINE_SIZE was set to CONFIG_SYS_CACHELINE_SIZE (CPU L1 cache cacheline length) while ARCH_DMA_MINALIGN is DMA engine alignment requirement (from times where there used to be one DMA engine on most devices). You likely want a max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment-requirement) to really correctly align the buffer. >> >> It is definitely true for platforms declaring dma_alloc_coherent() (arm, riscv, x86) >> except nios2 but there's 0 chance dwc3 appears on a nios2 platform. > There is real chance of that, because on modern SoCFPGA platforms (Agilex) you can have the FPGA content access the SoC peripherals, and one of the SoC peripherals is DWC3 controller. If anyone would actually synthesize it is another question ... but it is an FPGA, so that option exists. Guess I'll switch to CACHELINE_SIZE instead of DMA_MINALIGN for nios2. Neil
On 10/4/24 9:16 AM, neil.armstrong@linaro.org wrote: > On 03/10/2024 15:19, Marek Vasut wrote: >> On 10/3/24 2:49 PM, Neil Armstrong wrote: >>> On 02/10/2024 16:55, Marek Vasut wrote: >>>> On 10/2/24 4:39 PM, Neil Armstrong wrote: >>>>> The current flush operation will omit doing a flush/invalidate on >>>>> the first and last bytes if the base address and size are not aligned >>>>> with DMA_MINALIGN. >>>>> >>>>> This causes operation failures Qualcomm platforms. >>>>> >>>>> Take in account the alignment and size of the buffer and also >>>>> flush the previous and last cacheline. >>>>> >>>>> Remove CACHELINE_SIZE which was the same as DMA_MINALIGN. >>>> >>>> It isn't the same, CACHELINE_SIZE was set to >>>> CONFIG_SYS_CACHELINE_SIZE (CPU L1 cache cacheline length) while >>>> ARCH_DMA_MINALIGN is DMA engine alignment requirement (from times >>>> where there used to be one DMA engine on most devices). You likely >>>> want a max(CONFIG_SYS_CACHELINE_SIZE, dwc3-buffer-alignment- >>>> requirement) to really correctly align the buffer. >>> >>> It is definitely true for platforms declaring dma_alloc_coherent() >>> (arm, riscv, x86) >>> except nios2 but there's 0 chance dwc3 appears on a nios2 platform. >> There is real chance of that, because on modern SoCFPGA platforms >> (Agilex) you can have the FPGA content access the SoC peripherals, and >> one of the SoC peripherals is DWC3 controller. If anyone would >> actually synthesize it is another question ... but it is an FPGA, so >> that option exists. > > Guess I'll switch to CACHELINE_SIZE instead of DMA_MINALIGN for nios2. I think max(CONFIG_SYS_CACHELINE_SIZE, ARCH_DMA_MINALIGN) should cover all the cases ?
diff --git a/drivers/usb/dwc3/io.h b/drivers/usb/dwc3/io.h index 04791d4c9be..a6c2bb0f47d 100644 --- a/drivers/usb/dwc3/io.h +++ b/drivers/usb/dwc3/io.h @@ -20,7 +20,6 @@ #include <cpu_func.h> #include <asm/io.h> -#define CACHELINE_SIZE CONFIG_SYS_CACHELINE_SIZE static inline u32 dwc3_readl(void __iomem *base, u32 offset) { unsigned long offs = offset - DWC3_GLOBALS_REGS_START; @@ -50,6 +49,9 @@ static inline void dwc3_writel(void __iomem *base, u32 offset, u32 value) static inline void dwc3_flush_cache(uintptr_t addr, int length) { - flush_dcache_range(addr, addr + ROUND(length, CACHELINE_SIZE)); + uintptr_t start_addr = (uintptr_t)addr & ~(ARCH_DMA_MINALIGN - 1); + uintptr_t end_addr = ALIGN((uintptr_t)addr + length, ARCH_DMA_MINALIGN); + + flush_dcache_range((unsigned long)start_addr, (unsigned long)end_addr); } #endif /* __DRIVERS_USB_DWC3_IO_H */