diff mbox series

[v7,1/6] riscv: mm: dma-noncoherent: Switch using function pointers for cache management

Message ID 20230330204217.47666-2-prabhakar.mahadev-lad.rj@bp.renesas.com
State New
Headers show
Series RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP | expand

Commit Message

Prabhakar March 30, 2023, 8:42 p.m. UTC
From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Currently, selecting which CMOs to use on a given platform is done using
and ALTERNATIVE_X() macro. This was manageable when there were just two
CMO implementations, but now that there are more and more platforms coming
needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.

To avoid such issues this patch switches to use of function pointers
instead of ALTERNATIVE_X() macro for cache management (the only drawback
being performance over the previous approach).

void (*clean_range)(unsigned long addr, unsigned long size);
void (*inv_range)(unsigned long addr, unsigned long size);
void (*flush_range)(unsigned long addr, unsigned long size);

The above function pointers are provided to be overridden for platforms
needing CMO.

Convert ZICBOM and T-HEAD CMO to use function pointers.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
---
v6->v7
* Updated commit description
* Fixed build issues when CONFIG_ERRATA_THEAD_CMO=n
* Used static const struct ptr to register CMO ops
* Dropped riscv_dma_noncoherent_cmo_ops
* Moved ZICBOM CMO setup to setup.c

v5->v6
* New patch
---
 arch/riscv/errata/thead/errata.c         | 76 ++++++++++++++++++++++++
 arch/riscv/include/asm/dma-noncoherent.h | 73 +++++++++++++++++++++++
 arch/riscv/include/asm/errata_list.h     | 53 -----------------
 arch/riscv/kernel/setup.c                | 49 ++++++++++++++-
 arch/riscv/mm/dma-noncoherent.c          | 25 ++++++--
 arch/riscv/mm/pmem.c                     |  6 +-
 6 files changed, 221 insertions(+), 61 deletions(-)
 create mode 100644 arch/riscv/include/asm/dma-noncoherent.h

Comments

Arnd Bergmann March 30, 2023, 9:34 p.m. UTC | #1
On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two
> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
>
> To avoid such issues this patch switches to use of function pointers
> instead of ALTERNATIVE_X() macro for cache management (the only drawback
> being performance over the previous approach).
>
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
>
> The above function pointers are provided to be overridden for platforms
> needing CMO.
>
> Convert ZICBOM and T-HEAD CMO to use function pointers.
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

This is looking pretty good. There are a few things that I
still see sticking out, and I think I've mentioned some of 
them before, but don't remember if there was a reason for
doing it like this:

> +#ifdef CONFIG_ERRATA_THEAD_CMO

I would rename this to not call this an 'ERRATA' but
just make it a driver. Not important though, and there
was probably a reason you did it like this.

> +extern struct riscv_cache_ops noncoherent_cache_ops;
> +
> +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops 
> *ops);
> +
> +static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t 
> size)
> +{
> +	if (noncoherent_cache_ops.clean_range) {
> +		unsigned long addr = (unsigned long)vaddr;
> +
> +		noncoherent_cache_ops.clean_range(addr, size);
> +	}
> +}

The type case should not be necessary here. Instead I would 
make the callbacks use 'void *' as well, not 'unsigned long'.

It's possible that some future cache controller driver requires
passing physical addresses, as is common for last level cache,
but as long as all drivers pass virtual addresses, it's easier
to do the phys_to_virt() in common code.

It also seems wrong to have the fallback be to do nothing
when the pointer is NULL, since that cannot actually work
when a device is not cache coherent.

I would either initialize the function pointer to the
zicbom version and remove the NULL check, or keep the
pointer NULL and have an explicit
'else zicbom_clean_range()' fallback.

> +static void zicbom_register_cmo_ops(void)
> +{
> +	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> +}
> +#else
> +static void zicbom_register_cmo_ops(void) {}
> +#endif

As far as I recall, the #else path here was needed previously
to work around a binutils dependency, but with the current
code, it should be possible to just always enable
CONFIG_RISCV_ISA_ZICBOM when RISCV_DMA_NONCOHERENT is used.

     Arnd
Geert Uytterhoeven March 31, 2023, 7:31 a.m. UTC | #2
Hi Prabhakar,

Thanks for your patch!

On Thu, Mar 30, 2023 at 10:42 PM Prabhakar <prabhakar.csengg@gmail.com> wrote:
> From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> Currently, selecting which CMOs to use on a given platform is done using
> and ALTERNATIVE_X() macro. This was manageable when there were just two

the ALTERNATIVE_X()

> CMO implementations, but now that there are more and more platforms coming
> needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
>
> To avoid such issues this patch switches to use of function pointers

"the use" or "using"

> instead of ALTERNATIVE_X() macro for cache management (the only drawback

the ALTERNATIVE_X()

> being performance over the previous approach).
>
> void (*clean_range)(unsigned long addr, unsigned long size);
> void (*inv_range)(unsigned long addr, unsigned long size);
> void (*flush_range)(unsigned long addr, unsigned long size);
>
> The above function pointers are provided to be overridden for platforms
> needing CMO.
>
> Convert ZICBOM and T-HEAD CMO to use function pointers.
>
> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c

> +#ifdef CONFIG_ERRATA_THEAD_CMO

> +static void thead_register_cmo_ops(void)
> +{
> +       riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
> +}
> +#else
> +static void thead_register_cmo_ops(void) {}
> +#endif

> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c

> @@ -75,3 +83,12 @@ void riscv_noncoherent_supported(void)
>              "Non-coherent DMA support enabled without a block size\n");
>         noncoherent_supported = true;
>  }
> +
> +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
> +{
> +       if (!ops)
> +               return;

This is never true.
I guess originally you wanted to call riscv_noncoherent_register_cache_ops()
unconditionally from common code, instead of the various *register_cmo_ops()?
But that would have required something like

#ifdef CONFIG_ERRATA_THEAD_CMO
#define THEAD_CMO_OPS_PTR   (&thead_cmo_ops)
#else
#define THEAD_CMO_OPS_PTR   NULL
#endif

Or can we come up with some macro like pm_ptr(), but that also takes
care of the "&", so we can do "#define thead_cmo_ops NULL"?

> +
> +       noncoherent_cache_ops = *ops;
> +}
> +EXPORT_SYMBOL_GPL(riscv_noncoherent_register_cache_ops);

Gr{oetje,eeting}s,

                        Geert
Conor Dooley March 31, 2023, 7:54 a.m. UTC | #3
On Thu, Mar 30, 2023 at 11:34:02PM +0200, Arnd Bergmann wrote:
> On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

> > +#ifdef CONFIG_ERRATA_THEAD_CMO
> 
> I would rename this to not call this an 'ERRATA' but
> just make it a driver. Not important though, and there
> was probably a reason you did it like this.

I think what was discussed in a prior iteration was that we'd leave
refactoring the T-HEAD bits into a driver for a subsequent work.
Prabhakar March 31, 2023, 10:37 a.m. UTC | #4
Hi Arnd,

Thank you for the review.

On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Mar 30, 2023, at 22:42, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
> > The above function pointers are provided to be overridden for platforms
> > needing CMO.
> >
> > Convert ZICBOM and T-HEAD CMO to use function pointers.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> This is looking pretty good. There are a few things that I
> still see sticking out, and I think I've mentioned some of
> them before, but don't remember if there was a reason for
> doing it like this:
>
> > +#ifdef CONFIG_ERRATA_THEAD_CMO
>
> I would rename this to not call this an 'ERRATA' but
> just make it a driver. Not important though, and there
> was probably a reason you did it like this.
>
As agreed, we will keep this as is.

> > +extern struct riscv_cache_ops noncoherent_cache_ops;
> > +
> > +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops
> > *ops);
> > +
> > +static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t
> > size)
> > +{
> > +     if (noncoherent_cache_ops.clean_range) {
> > +             unsigned long addr = (unsigned long)vaddr;
> > +
> > +             noncoherent_cache_ops.clean_range(addr, size);
> > +     }
> > +}
>
> The type case should not be necessary here. Instead I would
> make the callbacks use 'void *' as well, not 'unsigned long'.
>
Ok, I'll update this on using void *

> It's possible that some future cache controller driver requires
> passing physical addresses, as is common for last level cache,
> but as long as all drivers pass virtual addresses, it's easier
> to do the phys_to_virt() in common code.
>
Agreed.

> It also seems wrong to have the fallback be to do nothing
> when the pointer is NULL, since that cannot actually work
> when a device is not cache coherent.
>
If the device is non cache coherent and if it doesn't support ZICBOM
ISA extension the device won't work anyway. So non-cache coherent
devices until they have their CMO config enabled won't work anyway. So
I didn't see any benefit in enabling ZICBOM by default. Please let me
know if I am misunderstanding.

> I would either initialize the function pointer to the
> zicbom version and remove the NULL check, or keep the
> pointer NULL and have an explicit
> 'else zicbom_clean_range()' fallback.
>
> > +static void zicbom_register_cmo_ops(void)
> > +{
> > +     riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
> > +}
> > +#else
> > +static void zicbom_register_cmo_ops(void) {}
> > +#endif
>
> As far as I recall, the #else path here was needed previously
> to work around a binutils dependency, but with the current
> code, it should be possible to just always enable
> CONFIG_RISCV_ISA_ZICBOM when RISCV_DMA_NONCOHERENT is used.
>
Are you suggesting something like below?

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 4dadf35ac721..a55dee98ccf8 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -242,6 +242,7 @@ config RISCV_DMA_NONCOHERENT
        select ARCH_HAS_SYNC_DMA_FOR_CPU
        select ARCH_HAS_SYNC_DMA_FOR_DEVICE
        select DMA_DIRECT_REMAP
+       select RISCV_ISA_ZICBOM

 config AS_HAS_INSN
        def_bool $(as-instr,.insn r 51$(comma) 0$(comma) 0$(comma)
t0$(comma) t0$(comma) zero)
@@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
        depends on MMU
        depends on RISCV_ALTERNATIVE
        default y
-       select RISCV_DMA_NONCOHERENT
        help
           Adds support to dynamically detect the presence of the ZICBOM
           extension (Cache Block Management Operations) and enable its

But what if the platform doesn't have the ZICBOM ISA extension?

Cheers,
Prabhakar
Arnd Bergmann March 31, 2023, 10:44 a.m. UTC | #5
On Fri, Mar 31, 2023, at 12:37, Lad, Prabhakar wrote:
> On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
>> It also seems wrong to have the fallback be to do nothing
>> when the pointer is NULL, since that cannot actually work
>> when a device is not cache coherent.
>>
> If the device is non cache coherent and if it doesn't support ZICBOM
> ISA extension the device won't work anyway. So non-cache coherent
> devices until they have their CMO config enabled won't work anyway. So
> I didn't see any benefit in enabling ZICBOM by default. Please let me
> know if I am misunderstanding.

Two things:

- Having a broken machine crash with in invalid instruction
  exception is better than having it run into silent data
  corruption.

- a correctly predicted branch is typically faster than an
  indirect function call, so the fallback to zicbom makes the
  expected (at least in the future) case the fast one.

> @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
>         depends on MMU
>         depends on RISCV_ALTERNATIVE
>         default y
> -       select RISCV_DMA_NONCOHERENT
>         help
>            Adds support to dynamically detect the presence of the ZICBOM
>            extension (Cache Block Management Operations) and enable its
>
> But what if the platform doesn't have the ZICBOM ISA extension?

Then it needs to register its cache operations before the first
DMA, which is something that it should do anyway. With your
current code, it may work by accident depending on the state of
the cache, but with the version I suggested, it will either work
correctly all the time or crash in an obvious way when misconfigured.

     Arnd
Prabhakar March 31, 2023, 10:45 a.m. UTC | #6
Hi Geert,

Thank you for the review.

On Fri, Mar 31, 2023 at 8:31 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Prabhakar,
>
> Thanks for your patch!
>
> On Thu, Mar 30, 2023 at 10:42 PM Prabhakar <prabhakar.csengg@gmail.com> wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
>
> the ALTERNATIVE_X()
>
OK.

> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
>
> "the use" or "using"
>
OK.

> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
>
> the ALTERNATIVE_X()
>
OK.

> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
> > The above function pointers are provided to be overridden for platforms
> > needing CMO.
> >
> > Convert ZICBOM and T-HEAD CMO to use function pointers.
> >
> > Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
>
> > --- a/arch/riscv/errata/thead/errata.c
> > +++ b/arch/riscv/errata/thead/errata.c
>
> > +#ifdef CONFIG_ERRATA_THEAD_CMO
>
> > +static void thead_register_cmo_ops(void)
> > +{
> > +       riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
> > +}
> > +#else
> > +static void thead_register_cmo_ops(void) {}
> > +#endif
>
> > --- a/arch/riscv/mm/dma-noncoherent.c
> > +++ b/arch/riscv/mm/dma-noncoherent.c
>
> > @@ -75,3 +83,12 @@ void riscv_noncoherent_supported(void)
> >              "Non-coherent DMA support enabled without a block size\n");
> >         noncoherent_supported = true;
> >  }
> > +
> > +void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
> > +{
> > +       if (!ops)
> > +               return;
>
> This is never true.
I just wanted to add a sanity check.

> I guess originally you wanted to call riscv_noncoherent_register_cache_ops()
> unconditionally from common code, instead of the various *register_cmo_ops()?
> But that would have required something like
>
> #ifdef CONFIG_ERRATA_THEAD_CMO
> #define THEAD_CMO_OPS_PTR   (&thead_cmo_ops)
> #else
> #define THEAD_CMO_OPS_PTR   NULL
> #endif
>
riscv_noncoherent_register_cache_ops() will only be called if the
ISA/Errata needs to be applied so I'll drop this check.

Cheers,
Prabhakar
Arnd Bergmann March 31, 2023, 11:36 a.m. UTC | #7
On Fri, Mar 31, 2023, at 12:55, Conor Dooley wrote:
> On Fri, Mar 31, 2023 at 11:37:30AM +0100, Lad, Prabhakar wrote:
>>
>
> Does that actually work? I don't think it does.
> If you try to enable RISCV_ISA_ZICBOM then you won't get
> RISC_DMA_NONCOHERENT turned on. Run menuconfig and disable support for
> Renesas, SiFive and T-Head SoCs & you can replicate.

Right, the circular dependency has to be broken in some form.

> I think one of RISCV_ISA_ZICBOM and RISCV_DMA_NONCOHERENT should just be
> dropped, although I don't know which one to pick!
> Making RISCV_DMA_NONCOHERENT user selectable probably makes the most
> sense.

That sounds good to me.

    Arnd
Prabhakar March 31, 2023, 12:11 p.m. UTC | #8
On Fri, Mar 31, 2023 at 11:45 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Mar 31, 2023, at 12:37, Lad, Prabhakar wrote:
> > On Thu, Mar 30, 2023 at 10:34 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> >> It also seems wrong to have the fallback be to do nothing
> >> when the pointer is NULL, since that cannot actually work
> >> when a device is not cache coherent.
> >>
> > If the device is non cache coherent and if it doesn't support ZICBOM
> > ISA extension the device won't work anyway. So non-cache coherent
> > devices until they have their CMO config enabled won't work anyway. So
> > I didn't see any benefit in enabling ZICBOM by default. Please let me
> > know if I am misunderstanding.
>
> Two things:
>
> - Having a broken machine crash with in invalid instruction
>   exception is better than having it run into silent data
>   corruption.
>
> - a correctly predicted branch is typically faster than an
>   indirect function call, so the fallback to zicbom makes the
>   expected (at least in the future) case the fast one.
>
Ok, thank you for the clarification. I'll default to zicbom.

> > @@ -465,7 +466,6 @@ config RISCV_ISA_ZICBOM
> >         depends on MMU
> >         depends on RISCV_ALTERNATIVE
> >         default y
> > -       select RISCV_DMA_NONCOHERENT
> >         help
> >            Adds support to dynamically detect the presence of the ZICBOM
> >            extension (Cache Block Management Operations) and enable its
> >
> > But what if the platform doesn't have the ZICBOM ISA extension?
>
> Then it needs to register its cache operations before the first
> DMA, which is something that it should do anyway. With your
> current code, it may work by accident depending on the state of
> the cache, but with the version I suggested, it will either work
> correctly all the time or crash in an obvious way when misconfigured.
>
Okay, agreed.

Cheers,
Prabhakar
Christoph Hellwig April 4, 2023, 3:42 p.m. UTC | #9
On Tue, Apr 04, 2023 at 06:24:16AM +0000, Biju Das wrote:
> Just a question, how does function pointer makes a performance difference compared to
> ALTERNATIVE_X() macros?
> 
> On both cases, we are pushing function parameters to stack, jumping to the actual routine
> And then on return pop the variables from stack. Am I missing something here?

Indirect calls have always been more expensive, and with the hard- and
software mitigations for spectre-like attacks they are becoming even
more expensive.

But other other point is adding more cache flushing variants should not
be easy.  Everyone should be using the standardize version.  If it's not
implemented in hardware despite having ratified extensions you can fake
it up in SBI.  Yes, that's way more expensive than indirect calls, but
that's what you get for taping out new hardware that ignores the actual
architecture specification and just does their own made up shit.
Prabhakar April 6, 2023, 6:59 p.m. UTC | #10
Hi Christoph,

On Tue, Apr 4, 2023 at 6:29 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Mar 30, 2023 at 09:42:12PM +0100, Prabhakar wrote:
> > From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
> >
> > Currently, selecting which CMOs to use on a given platform is done using
> > and ALTERNATIVE_X() macro. This was manageable when there were just two
> > CMO implementations, but now that there are more and more platforms coming
> > needing custom CMOs, the use of the ALTERNATIVE_X() macro is unmanageable.
> >
> > To avoid such issues this patch switches to use of function pointers
> > instead of ALTERNATIVE_X() macro for cache management (the only drawback
> > being performance over the previous approach).
> >
> > void (*clean_range)(unsigned long addr, unsigned long size);
> > void (*inv_range)(unsigned long addr, unsigned long size);
> > void (*flush_range)(unsigned long addr, unsigned long size);
> >
>
> NAK.  Function pointers for somthing high performance as cache
> maintainance is a complete no-go.
>
Ok, I will take the ALTERNATIVE() macro route.

Cheers,
Prabhakar
Andrea Parri April 7, 2023, 12:03 a.m. UTC | #11
> But other other point is adding more cache flushing variants should not
> be easy.  Everyone should be using the standardize version.  If it's not
> implemented in hardware despite having ratified extensions you can fake
> it up in SBI.  Yes, that's way more expensive than indirect calls, but
> that's what you get for taping out new hardware that ignores the actual
> architecture specification and just does their own made up shit.

FWIW, ALTERNATIVE_X() for "three instructions with (what should be a)
crystal-clear semantics" already smells like "we're doing it wrong" to
me, function pointers would be closer to "we're looking for trouble".

Thanks,
  Andrea
Christoph Hellwig April 7, 2023, 5:33 a.m. UTC | #12
On Fri, Apr 07, 2023 at 02:03:36AM +0200, Andrea Parri wrote:
> > But other other point is adding more cache flushing variants should not
> > be easy.  Everyone should be using the standardize version.  If it's not
> > implemented in hardware despite having ratified extensions you can fake
> > it up in SBI.  Yes, that's way more expensive than indirect calls, but
> > that's what you get for taping out new hardware that ignores the actual
> > architecture specification and just does their own made up shit.
> 
> FWIW, ALTERNATIVE_X() for "three instructions with (what should be a)
> crystal-clear semantics" already smells like "we're doing it wrong" to
> me, function pointers would be closer to "we're looking for trouble".

Thanks for putting my feelings into such concise words.
diff mbox series

Patch

diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
index 7e8d50ebb71a..4bb3f2baee03 100644
--- a/arch/riscv/errata/thead/errata.c
+++ b/arch/riscv/errata/thead/errata.c
@@ -11,10 +11,83 @@ 
 #include <linux/uaccess.h>
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 #include <asm/errata_list.h>
 #include <asm/patch.h>
 #include <asm/vendorid_list.h>
 
+#ifdef CONFIG_ERRATA_THEAD_CMO
+/*
+ * dcache.ipa rs1 (invalidate, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01010      rs1       000      00000  0001011
+ * dache.iva rs1 (invalida, virtual address)
+ *   0000001    00110      rs1       000      00000  0001011
+ *
+ * dcache.cpa rs1 (clean, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01001      rs1       000      00000  0001011
+ * dcache.cva rs1 (clean, virtual address)
+ *   0000001    00100      rs1       000      00000  0001011
+ *
+ * dcache.cipa rs1 (clean then invalidate, physical address)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000001    01011      rs1       000      00000  0001011
+ * dcache.civa rs1 (... virtual address)
+ *   0000001    00111      rs1       000      00000  0001011
+ *
+ * sync.s (make sure all cache operations finished)
+ * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
+ *   0000000    11001     00000      000      00000  0001011
+ */
+#define THEAD_inval_A0	".long 0x0265000b"
+#define THEAD_clean_A0	".long 0x0245000b"
+#define THEAD_flush_A0	".long 0x0275000b"
+#define THEAD_SYNC_S	".long 0x0190000b"
+
+#define THEAD_CMO_OP(_op, _start, _size, _cachesize)				\
+	asm volatile("mv a0, %1\n\t"						\
+		     "j 2f\n\t"							\
+		     "3:\n\t"							\
+		     THEAD_##_op##_A0 "\n\t"					\
+		     "add a0, a0, %0\n\t"					\
+		     "2:\n\t"							\
+		     "bltu a0, %2, 3b\n\t"					\
+		     THEAD_SYNC_S						\
+		     : : "r"(_cachesize),					\
+			 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
+			 "r"((unsigned long)(_start) + (_size))			\
+		     : "a0")
+
+static void thead_cmo_clean_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+}
+
+static void thead_cmo_flush_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(flush, addr, size, riscv_cbom_block_size);
+}
+
+static void thead_cmo_inval_range(unsigned long addr, unsigned long size)
+{
+	THEAD_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+}
+
+static const struct riscv_cache_ops thead_cmo_ops = {
+	.clean_range = &thead_cmo_clean_range,
+	.inv_range = &thead_cmo_inval_range,
+	.flush_range = &thead_cmo_flush_range,
+};
+
+static void thead_register_cmo_ops(void)
+{
+	riscv_noncoherent_register_cache_ops(&thead_cmo_ops);
+}
+#else
+static void thead_register_cmo_ops(void) {}
+#endif
+
 static bool errata_probe_pbmt(unsigned int stage,
 			      unsigned long arch_id, unsigned long impid)
 {
@@ -45,6 +118,9 @@  static bool errata_probe_cmo(unsigned int stage,
 
 	riscv_cbom_block_size = L1_CACHE_BYTES;
 	riscv_noncoherent_supported();
+
+	thead_register_cmo_ops();
+
 	return true;
 }
 
diff --git a/arch/riscv/include/asm/dma-noncoherent.h b/arch/riscv/include/asm/dma-noncoherent.h
new file mode 100644
index 000000000000..fc8d16c89f01
--- /dev/null
+++ b/arch/riscv/include/asm/dma-noncoherent.h
@@ -0,0 +1,73 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 Renesas Electronics Corp.
+ */
+
+#ifndef __ASM_DMA_NONCOHERENT_H
+#define __ASM_DMA_NONCOHERENT_H
+
+#include <linux/dma-direct.h>
+
+#ifdef CONFIG_RISCV_DMA_NONCOHERENT
+
+/*
+ * struct riscv_cache_ops - Structure for CMO function pointers
+ *
+ * @clean_range: Function pointer for clean cache
+ * @inv_range: Function pointer for invalidate cache
+ * @flush_range: Function pointer for flushing the cache
+ */
+struct riscv_cache_ops {
+	void (*clean_range)(unsigned long addr, unsigned long size);
+	void (*inv_range)(unsigned long addr, unsigned long size);
+	void (*flush_range)(unsigned long addr, unsigned long size);
+};
+
+extern struct riscv_cache_ops noncoherent_cache_ops;
+
+void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops);
+
+static inline void riscv_dma_noncoherent_clean(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.clean_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.clean_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_flush(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.flush_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.flush_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_inval(void *vaddr, size_t size)
+{
+	if (noncoherent_cache_ops.inv_range) {
+		unsigned long addr = (unsigned long)vaddr;
+
+		noncoherent_cache_ops.inv_range(addr, size);
+	}
+}
+
+static inline void riscv_dma_noncoherent_pmem_clean(void *vaddr, size_t size)
+{
+	riscv_dma_noncoherent_clean(vaddr, size);
+}
+
+static inline void riscv_dma_noncoherent_pmem_inval(void *vaddr, size_t size)
+{
+	riscv_dma_noncoherent_inval(vaddr, size);
+}
+#else
+
+static inline void riscv_dma_noncoherent_pmem_clean(void *vaddr, size_t size) {}
+static inline void riscv_dma_noncoherent_pmem_inval(void *vaddr, size_t size) {}
+
+#endif /* CONFIG_RISCV_DMA_NONCOHERENT */
+
+#endif	/* __ASM_DMA_NONCOHERENT_H */
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index fb1a810f3d8c..112429910ee6 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -89,59 +89,6 @@  asm volatile(ALTERNATIVE(						\
 #define ALT_THEAD_PMA(_val)
 #endif
 
-/*
- * dcache.ipa rs1 (invalidate, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01010      rs1       000      00000  0001011
- * dache.iva rs1 (invalida, virtual address)
- *   0000001    00110      rs1       000      00000  0001011
- *
- * dcache.cpa rs1 (clean, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01001      rs1       000      00000  0001011
- * dcache.cva rs1 (clean, virtual address)
- *   0000001    00100      rs1       000      00000  0001011
- *
- * dcache.cipa rs1 (clean then invalidate, physical address)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000001    01011      rs1       000      00000  0001011
- * dcache.civa rs1 (... virtual address)
- *   0000001    00111      rs1       000      00000  0001011
- *
- * sync.s (make sure all cache operations finished)
- * | 31 - 25 | 24 - 20 | 19 - 15 | 14 - 12 | 11 - 7 | 6 - 0 |
- *   0000000    11001     00000      000      00000  0001011
- */
-#define THEAD_inval_A0	".long 0x0265000b"
-#define THEAD_clean_A0	".long 0x0245000b"
-#define THEAD_flush_A0	".long 0x0275000b"
-#define THEAD_SYNC_S	".long 0x0190000b"
-
-#define ALT_CMO_OP(_op, _start, _size, _cachesize)			\
-asm volatile(ALTERNATIVE_2(						\
-	__nops(6),							\
-	"mv a0, %1\n\t"							\
-	"j 2f\n\t"							\
-	"3:\n\t"							\
-	CBO_##_op(a0)							\
-	"add a0, a0, %0\n\t"						\
-	"2:\n\t"							\
-	"bltu a0, %2, 3b\n\t"						\
-	"nop", 0, RISCV_ISA_EXT_ZICBOM, CONFIG_RISCV_ISA_ZICBOM,	\
-	"mv a0, %1\n\t"							\
-	"j 2f\n\t"							\
-	"3:\n\t"							\
-	THEAD_##_op##_A0 "\n\t"						\
-	"add a0, a0, %0\n\t"						\
-	"2:\n\t"							\
-	"bltu a0, %2, 3b\n\t"						\
-	THEAD_SYNC_S, THEAD_VENDOR_ID,					\
-			ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO)	\
-	: : "r"(_cachesize),						\
-	    "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
-	    "r"((unsigned long)(_start) + (_size))			\
-	: "a0")
-
 #define THEAD_C9XX_RV_IRQ_PMU			17
 #define THEAD_C9XX_CSR_SCOUNTEROF		0x5c5
 
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 5d3184cbf518..b2b69d1dec23 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -24,6 +24,7 @@ 
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
 #include <asm/cpu_ops.h>
+#include <asm/dma-noncoherent.h>
 #include <asm/early_ioremap.h>
 #include <asm/pgtable.h>
 #include <asm/setup.h>
@@ -262,6 +263,50 @@  static void __init parse_dtb(void)
 #endif
 }
 
+#ifdef CONFIG_RISCV_ISA_ZICBOM
+
+#define ZICBOM_CMO_OP(_op, _start, _size, _cachesize)				\
+	asm volatile("mv a0, %1\n\t"						\
+		     "j 2f\n\t"							\
+		     "3:\n\t"							\
+		     CBO_##_op(a0)						\
+		     "add a0, a0, %0\n\t"					\
+		     "2:\n\t"							\
+		     "bltu a0, %2, 3b\n\t"					\
+		     : : "r"(_cachesize),					\
+			 "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)),	\
+			 "r"((unsigned long)(_start) + (_size))			\
+		     : "a0")
+
+static void zicbom_cmo_clean_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+}
+
+static void zicbom_cmo_flush_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(flush, addr, size, riscv_cbom_block_size);
+}
+
+static void zicbom_cmo_inval_range(unsigned long addr, unsigned long size)
+{
+	ZICBOM_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+}
+
+const struct riscv_cache_ops zicbom_cmo_ops = {
+	.clean_range = &zicbom_cmo_clean_range,
+	.inv_range = &zicbom_cmo_inval_range,
+	.flush_range = &zicbom_cmo_flush_range,
+};
+
+static void zicbom_register_cmo_ops(void)
+{
+	riscv_noncoherent_register_cache_ops(&zicbom_cmo_ops);
+}
+#else
+static void zicbom_register_cmo_ops(void) {}
+#endif
+
 void __init setup_arch(char **cmdline_p)
 {
 	parse_dtb();
@@ -301,8 +346,10 @@  void __init setup_arch(char **cmdline_p)
 	riscv_fill_hwcap();
 	apply_boot_alternatives();
 	if (IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM) &&
-	    riscv_isa_extension_available(NULL, ZICBOM))
+	    riscv_isa_extension_available(NULL, ZICBOM)) {
 		riscv_noncoherent_supported();
+		zicbom_register_cmo_ops();
+	}
 }
 
 static int __init topology_init(void)
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index b9a9f57e02be..ab8f6dc67914 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -9,28 +9,36 @@ 
 #include <linux/dma-map-ops.h>
 #include <linux/mm.h>
 #include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 
 static bool noncoherent_supported;
 
+struct riscv_cache_ops noncoherent_cache_ops = {
+	.clean_range = NULL,
+	.inv_range = NULL,
+	.flush_range = NULL,
+};
+EXPORT_SYMBOL_GPL(noncoherent_cache_ops);
+
 static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_clean(vaddr, size);
 }
 
 static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_inval(vaddr, size);
 }
 
 static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
 {
 	void *vaddr = phys_to_virt(paddr);
 
-	ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_flush(vaddr, size);
 }
 
 static inline bool arch_sync_dma_clean_before_fromdevice(void)
@@ -50,7 +58,7 @@  void arch_dma_prep_coherent(struct page *page, size_t size)
 {
 	void *flush_addr = page_address(page);
 
-	ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_flush(flush_addr, size);
 }
 
 void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
@@ -75,3 +83,12 @@  void riscv_noncoherent_supported(void)
 	     "Non-coherent DMA support enabled without a block size\n");
 	noncoherent_supported = true;
 }
+
+void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
+{
+	if (!ops)
+		return;
+
+	noncoherent_cache_ops = *ops;
+}
+EXPORT_SYMBOL_GPL(riscv_noncoherent_register_cache_ops);
diff --git a/arch/riscv/mm/pmem.c b/arch/riscv/mm/pmem.c
index 089df92ae876..aad7c2209eca 100644
--- a/arch/riscv/mm/pmem.c
+++ b/arch/riscv/mm/pmem.c
@@ -6,16 +6,16 @@ 
 #include <linux/export.h>
 #include <linux/libnvdimm.h>
 
-#include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
 
 void arch_wb_cache_pmem(void *addr, size_t size)
 {
-	ALT_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_pmem_clean(addr, size);
 }
 EXPORT_SYMBOL_GPL(arch_wb_cache_pmem);
 
 void arch_invalidate_pmem(void *addr, size_t size)
 {
-	ALT_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+	riscv_dma_noncoherent_pmem_inval(addr, size);
 }
 EXPORT_SYMBOL_GPL(arch_invalidate_pmem);