diff mbox series

media: mediatek: vcodec: mark vdec_vp9_slice_map_counts_eob_coef noinline

Message ID 20241018151448.3694052-1-arnd@kernel.org
State New

Commit Message

Arnd Bergmann Oct. 18, 2024, 3:14 p.m. UTC
From: Arnd Bergmann <arnd@arndb.de>

With KASAN enabled, clang fails to optimize the inline version of
vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
of temporary values spilled to the stack:

drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]

This seems to affect all versions of clang including the latest (clang-20),
but the degree of stack overhead is different per release.

Marking the function as noinline_for_stack is harmless here and avoids
the problem completely.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
I have not come to a conclusion on how exactly clang fails to do this
right, but can provide the .config and/or preprocessed source files
and command line if we think this should be fixed in clang.
---
 .../mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c         | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Nathan Chancellor Oct. 18, 2024, 10:45 p.m. UTC | #1
On Fri, Oct 18, 2024 at 03:14:42PM +0000, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> With KASAN enabled, clang fails to optimize the inline version of
> vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
> of temporary values spilled to the stack:
> 
> drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]
> 
> This seems to affect all versions of clang including the latest (clang-20),
> but the degree of stack overhead is different per release.
> 
> Marking the function as noinline_for_stack is harmless here and avoids
> the problem completely.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> I have not come to a conclusion on how exactly clang fails to do this
> right, but can provide the .config and/or preprocessed source files
> and command line if we think this should be fixed in clang.

I think this might be related to the issue I reported to upstream LLVM,
as a regression within the past couple of weeks:

https://github.com/llvm/llvm-project/issues/111903

If this is a reasonable workaround, it might be worth doing but I will
probably wait until after the LLVM Developers Meeting next week to ping
the thread to have a better chance of visibility. If we want to work
around this in the kernel, we should Cc stable, as this warning is
present there too.

> ---
>  .../mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c         | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
> index eea709d93820..47c302745c1d 100644
> --- a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
> +++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
> @@ -1188,7 +1188,8 @@ static int vdec_vp9_slice_setup_lat(struct vdec_vp9_slice_instance *instance,
>  	return ret;
>  }
>  
> -static
> +/* clang stack usage explodes if this is inlined */
> +static noinline_for_stack
>  void vdec_vp9_slice_map_counts_eob_coef(unsigned int i, unsigned int j, unsigned int k,
>  					struct vdec_vp9_slice_frame_counts *counts,
>  					struct v4l2_vp9_frame_symbol_counts *counts_helper)
> -- 
> 2.39.5
>
Nathan Chancellor Nov. 18, 2024, 8:06 p.m. UTC | #2
On Fri, Oct 18, 2024 at 03:14:42PM +0000, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> With KASAN enabled, clang fails to optimize the inline version of
> vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
> of temporary values spilled to the stack:
> 
> drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]
> 
> This seems to affect all versions of clang including the latest (clang-20),
> but the degree of stack overhead is different per release.
> 
> Marking the function as noinline_for_stack is harmless here and avoids
> the problem completely.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Unfortunately, I have seen no movement on my upstream report and this
warning is breaking allmodconfig builds because of -Werror. Can this be
applied as a workaround for now (preferably with a Cc: stable on it)?

Reviewed-by: Nathan Chancellor <nathan@kernel.org>

Sebastian Fricke Nov. 19, 2024, 11:02 a.m. UTC | #3
Hey Nathan,

On 18.11.2024 13:06, Nathan Chancellor wrote:
>On Fri, Oct 18, 2024 at 03:14:42PM +0000, Arnd Bergmann wrote:
>> From: Arnd Bergmann <arnd@arndb.de>
>>
>> With KASAN enabled, clang fails to optimize the inline version of
>> vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
>> of temporary values spilled to the stack:
>>
>> drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]
>>
>> This seems to affect all versions of clang including the latest (clang-20),
>> but the degree of stack overhead is different per release.
>>
>> Marking the function as noinline_for_stack is harmless here and avoids
>> the problem completely.
>>
>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>
>Unfortunately, I have seen no movement on my upstream report and this
>warning is breaking allmodconfig builds because of -Werror. Can this be
>applied as a workaround for now (preferably with a Cc: stable on it)?
>
>Reviewed-by: Nathan Chancellor <nathan@kernel.org>

I'll handle it asap, it will be part of 6.13

Regards,
Sebastian

Sebastian Fricke Dec. 18, 2024, 12:45 p.m. UTC | #4
Hey Nathan,

On 17.12.2024 10:46, Nathan Chancellor wrote:
>Hi Sebastian,
>
>On Tue, Nov 19, 2024 at 12:02:26PM +0100, Sebastian Fricke wrote:
>> Hey Nathan,
>>
>> On 18.11.2024 13:06, Nathan Chancellor wrote:
>> > On Fri, Oct 18, 2024 at 03:14:42PM +0000, Arnd Bergmann wrote:
>> > > From: Arnd Bergmann <arnd@arndb.de>
>> > >
>> > > With KASAN enabled, clang fails to optimize the inline version of
>> > > vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
>> > > of temporary values spilled to the stack:
>> > >
>> > > drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]
>> > >
>> > > This seems to affect all versions of clang including the latest (clang-20),
>> > > but the degree of stack overhead is different per release.
>> > >
>> > > Marking the function as noinline_for_stack is harmless here and avoids
>> > > the problem completely.
>> > >
>> > > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>> >
>> > Unfortunately, I have seen no movement on my upstream report and this
>> > warning is breaking allmodconfig builds because of -Werror. Can this be
>> > applied as a workaround for now (preferably with a Cc: stable on it)?
>> >
>> > Reviewed-by: Nathan Chancellor <nathan@kernel.org>
>>
>> I'll handle it asap, it will be part of 6.13
>
>Is there any update here? I don't see this patch applied in -next.

I have sent out the PR and it has been applied to the -fixes branch in
the media tree. Due to the holiday period there might be some delays, but
I expect the change to land in linux-next soon.

>
>Cheers,
>Nathan

Regards,
Sebastian

Sebastian Fricke
Consultant Software Engineer

Collabora Ltd
Platinum Building, St John's Innovation Park, Cambridge CB4 0DS, UK
Registered in England & Wales no 5513718.
Nathan Chancellor Dec. 18, 2024, 6:11 p.m. UTC | #5
On Wed, Dec 18, 2024 at 01:45:05PM +0100, Sebastian Fricke wrote:
> I have sent out the PR and it has been applied to the -fixes branch in
> the media tree. Due to the holiday period there might be some delays, but
> I expect the change to land in linux-next soon.

Ah, thanks a lot! I was looking at the main media tree's fixes branch
but I see it is only in the media-pending tree right now:

https://git.linuxtv.org/media-ci/media-pending.git/commit/?id=8b55f8818900c99dd4f55a59a103f5b29e41eb2c

As long as it is on its way, that's all that matters to me. Appreciate
you getting that process moving!

Cheers,
Nathan

Patch

diff --git a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
index eea709d93820..47c302745c1d 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
@@ -1188,7 +1188,8 @@  static int vdec_vp9_slice_setup_lat(struct vdec_vp9_slice_instance *instance,
 	return ret;
 }
 
-static
+/* clang stack usage explodes if this is inlined */
+static noinline_for_stack
 void vdec_vp9_slice_map_counts_eob_coef(unsigned int i, unsigned int j, unsigned int k,
 					struct vdec_vp9_slice_frame_counts *counts,
 					struct v4l2_vp9_frame_symbol_counts *counts_helper)
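
For readers unfamiliar with the annotation, the pattern used by the patch above can be sketched in a standalone userspace file. This is only an illustration under stated assumptions, not the driver code: `struct demo_counts`, `demo_map_counts()` and `demo_update()` are hypothetical names, and the `noinline_for_stack` macro is redefined locally because the kernel's definition (an alias for `noinline` in `<linux/compiler_types.h>`) is not available outside the kernel tree.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in (assumption): in the kernel, noinline_for_stack is
 * provided by <linux/compiler_types.h> as an alias for noinline.
 */
#ifndef noinline_for_stack
#define noinline_for_stack __attribute__((__noinline__))
#endif

/* Hypothetical counter tables, not the real vcodec structures. */
struct demo_counts {
	unsigned int eob[6][6][2];
	unsigned int coef[6][6][4];
};

/*
 * Hypothetical helper: copies one (i, j) slot. Because it is kept out of
 * line, any temporaries the compiler creates live in this function's own
 * frame, which is released on return, instead of being added to the
 * caller's frame at every inlined call site.
 */
static noinline_for_stack
void demo_map_counts(unsigned int i, unsigned int j,
		     const struct demo_counts *src, struct demo_counts *dst)
{
	memcpy(dst->eob[i][j], src->eob[i][j], sizeof(dst->eob[i][j]));
	memcpy(dst->coef[i][j], src->coef[i][j], sizeof(dst->coef[i][j]));
}

/* Hypothetical caller: invokes the helper from nested loops. */
static void demo_update(const struct demo_counts *src, struct demo_counts *dst)
{
	unsigned int i, j;

	for (i = 0; i < 6; i++)
		for (j = 0; j < 6; j++)
			demo_map_counts(i, j, src, dst);
}
```

Building such a file with `-Wframe-larger-than=2048`, with and without the attribute, is one way to compare the reported frame sizes. Whether the warning actually triggers depends on the compiler version and on instrumentation such as KASAN, so this small example is not expected to reproduce the 2160-byte frame from the report.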