Message ID | 20241018151448.3694052-1-arnd@kernel.org |
---|---|
State | New |
Series | media: mediatek: vcodec: mark vdec_vp9_slice_map_counts_eob_coef noinline |
On Fri, Oct 18, 2024 at 03:14:42PM +0000, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> With KASAN enabled, clang fails to optimize the inline version of
> vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
> of temporary values spilled to the stack:
>
> drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]
>
> This seems to affect all versions of clang including the latest (clang-20),
> but the degree of stack overhead is different per release.
>
> Marking the function as noinline_for_stack is harmless here and avoids
> the problem completely.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> I have not come to a conclusion on how exactly clang fails to do this
> right, but can provide the .config and/or preprocessed source files
> and command line if we think this should be fixed in clang.

I think this might be related to the issue I reported to upstream LLVM,
as a regression within the past couple of weeks:

https://github.com/llvm/llvm-project/issues/111903

If this is a reasonable workaround, it might be worth doing but I will
probably wait until after the LLVM Developers Meeting next week to ping
the thread to have a better chance of visibility.

If we want to work around this in the kernel, we should Cc stable, as
this warning is present there too.

> ---
>  .../mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
> index eea709d93820..47c302745c1d 100644
> --- a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
> +++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
> @@ -1188,7 +1188,8 @@ static int vdec_vp9_slice_setup_lat(struct vdec_vp9_slice_instance *instance,
>  	return ret;
>  }
>
> -static
> +/* clang stack usage explodes if this is inlined */
> +static noinline_for_stack
>  void vdec_vp9_slice_map_counts_eob_coef(unsigned int i, unsigned int j, unsigned int k,
>  					struct vdec_vp9_slice_frame_counts *counts,
>  					struct v4l2_vp9_frame_symbol_counts *counts_helper)
> --
> 2.39.5
>
On Fri, Oct 18, 2024 at 03:14:42PM +0000, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> With KASAN enabled, clang fails to optimize the inline version of
> vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
> of temporary values spilled to the stack:
>
> drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]
>
> This seems to affect all versions of clang including the latest (clang-20),
> but the degree of stack overhead is different per release.
>
> Marking the function as noinline_for_stack is harmless here and avoids
> the problem completely.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Unfortunately, I have seen no movement on my upstream report and this
warning is breaking allmodconfig builds because of -Werror. Can this be
applied as a workaround for now (preferably with a Cc: stable on it)?

Reviewed-by: Nathan Chancellor <nathan@kernel.org>

> ---
> I have not come to a conclusion on how exactly clang fails to do this
> right, but can provide the .config and/or preprocessed source files
> and command line if we think this should be fixed in clang.
Hey Nathan,

On 18.11.2024 13:06, Nathan Chancellor wrote:
>On Fri, Oct 18, 2024 at 03:14:42PM +0000, Arnd Bergmann wrote:
>> From: Arnd Bergmann <arnd@arndb.de>
>>
>> With KASAN enabled, clang fails to optimize the inline version of
>> vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
>> of temporary values spilled to the stack:
>>
>> drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]
>>
>> This seems to affect all versions of clang including the latest (clang-20),
>> but the degree of stack overhead is different per release.
>>
>> Marking the function as noinline_for_stack is harmless here and avoids
>> the problem completely.
>>
>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>
>Unfortunately, I have seen no movement on my upstream report and this
>warning is breaking allmodconfig builds because of -Werror. Can this be
>applied as a workaround for now (preferably with a Cc: stable on it)?
>
>Reviewed-by: Nathan Chancellor <nathan@kernel.org>

I'll handle it asap, it will be part of 6.13

Regards,
Sebastian

>
>> ---
>> I have not come to a conclusion on how exactly clang fails to do this
>> right, but can provide the .config and/or preprocessed source files
>> and command line if we think this should be fixed in clang.
Hey Nathan,

On 17.12.2024 10:46, Nathan Chancellor wrote:
>Hi Sebastian,
>
>On Tue, Nov 19, 2024 at 12:02:26PM +0100, Sebastian Fricke wrote:
>> Hey Nathan,
>>
>> On 18.11.2024 13:06, Nathan Chancellor wrote:
>> > On Fri, Oct 18, 2024 at 03:14:42PM +0000, Arnd Bergmann wrote:
>> > > From: Arnd Bergmann <arnd@arndb.de>
>> > >
>> > > With KASAN enabled, clang fails to optimize the inline version of
>> > > vdec_vp9_slice_map_counts_eob_coef() properly, leading to kilobytes
>> > > of temporary values spilled to the stack:
>> > >
>> > > drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c:1526:12: error: stack frame size (2160) exceeds limit (2048) in 'vdec_vp9_slice_update_prob' [-Werror,-Wframe-larger-than]
>> > >
>> > > This seems to affect all versions of clang including the latest (clang-20),
>> > > but the degree of stack overhead is different per release.
>> > >
>> > > Marking the function as noinline_for_stack is harmless here and avoids
>> > > the problem completely.
>> > >
>> > > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>> >
>> > Unfortunately, I have seen no movement on my upstream report and this
>> > warning is breaking allmodconfig builds because of -Werror. Can this be
>> > applied as a workaround for now (preferably with a Cc: stable on it)?
>> >
>> > Reviewed-by: Nathan Chancellor <nathan@kernel.org>
>>
>> I'll handle it asap, it will be part of 6.13
>
>Is there any update here? I don't see this patch applied in -next.

I have sent out the PR and it has been applied to the -fixes branch in
the media tree, due to the holiday period there might be some delays but
I expect the change to soon land in Linux-next.

>
>Cheers,
>Nathan

Regards,
Sebastian

>
>> > > ---
>> > > I have not come to a conclusion on how exactly clang fails to do this
>> > > right, but can provide the .config and/or preprocessed source files
>> > > and command line if we think this should be fixed in clang.
Sebastian Fricke
Consultant Software Engineer

Collabora Ltd
Platinum Building, St John's Innovation Park, Cambridge CB4 0DS, UK
Registered in England & Wales no 5513718.
On Wed, Dec 18, 2024 at 01:45:05PM +0100, Sebastian Fricke wrote:
> I have sent out the PR and it has been applied to the -fixes branch in
> the media tree, due to the holiday period there might be some delays but
> I expect the change to soon land in Linux-next.

Ah, thanks a lot! I was looking at the main media tree's fixes branch
but I see it is only in the media-pending tree right now:

https://git.linuxtv.org/media-ci/media-pending.git/commit/?id=8b55f8818900c99dd4f55a59a103f5b29e41eb2c

As long as it is on its way, that's all that matters to me. Appreciate
you getting that process moving!

Cheers,
Nathan
diff --git a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
index eea709d93820..47c302745c1d 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_vp9_req_lat_if.c
@@ -1188,7 +1188,8 @@ static int vdec_vp9_slice_setup_lat(struct vdec_vp9_slice_instance *instance,
 	return ret;
 }
 
-static
+/* clang stack usage explodes if this is inlined */
+static noinline_for_stack
 void vdec_vp9_slice_map_counts_eob_coef(unsigned int i, unsigned int j, unsigned int k,
 					struct vdec_vp9_slice_frame_counts *counts,
 					struct v4l2_vp9_frame_symbol_counts *counts_helper)
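
For anyone who wants to try the pattern outside the kernel tree, here is a
rough user-space sketch of what the patch does. It is not the driver code:
the demo_* names and the array sizes are made up for illustration, and
noinline_for_stack is defined by hand as the compiler's noinline attribute,
which as far as I know is what the kernel macro expands to. The point is
only that the helper's temporaries stay in the helper's own stack frame
instead of being merged into the caller's frame once per inlined call.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: user-space stand-in for the kernel's noinline_for_stack. */
#define noinline_for_stack __attribute__((__noinline__))

/* Hypothetical sizes, not taken from the mediatek driver. */
#define DEMO_BANDS 6
#define DEMO_CTX   6

struct demo_counts {
	unsigned int eob[DEMO_BANDS][DEMO_CTX];
};

/*
 * Kept out of line so that tmp[] lives in this function's frame only.
 * If the compiler inlined this into a caller that invokes it from nested
 * loops, the temporaries could end up in the caller's frame instead.
 */
static noinline_for_stack
void demo_map_counts(unsigned int band, const struct demo_counts *src,
		     struct demo_counts *dst)
{
	unsigned int tmp[DEMO_CTX];
	size_t k;

	for (k = 0; k < DEMO_CTX; k++)
		tmp[k] = src->eob[band][k] + 1;
	for (k = 0; k < DEMO_CTX; k++)
		dst->eob[band][k] = tmp[k];
}

/* Caller: walks every band and returns the sum of the mapped counts. */
unsigned int demo_update_prob(const struct demo_counts *src,
			      struct demo_counts *dst)
{
	unsigned int band, total = 0;
	size_t k;

	assert(src != NULL && dst != NULL);

	for (band = 0; band < DEMO_BANDS; band++) {
		demo_map_counts(band, src, dst);
		for (k = 0; k < DEMO_CTX; k++)
			total += dst->eob[band][k];
	}
	return total;
}
```

Building this with clang and a frame-size warning enabled, with and without
the attribute, should show whether the caller's frame changes. For a toy
this small the difference may well be zero; the overflow in the report only
showed up with KASAN instrumentation enabled.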