Message ID | 20170922212930.620249-5-arnd@arndb.de |
---|---|
State | New |
Headers | show |
Series | None | expand |
On Mon, Sep 25, 2017 at 7:41 AM, David Laight <David.Laight@aculab.com> wrote: > From: Arnd Bergmann >> Sent: 22 September 2017 22:29 > ... >> It seems that this is triggered in part by using strlcpy(), which the >> compiler doesn't recognize as copying at most 'len' bytes, since strlcpy >> is not part of the C standard. > > Neither is strncpy(). > > It'll almost certainly be a marker in a header file somewhere, > so it should be possibly to teach it about other functions. I'm currently travelling and haven't investigated in detail, but from taking a closer look here, I found that the hardened 'strlcpy()' in include/linux/string.h triggers it. There is also a hardened (much shorted) 'strncpy()' that doesn't trigger it in the same file, and having only the extern declaration of strncpy also doesn't. Arnd
On Mon, Sep 25, 2017 at 11:32 PM, Arnd Bergmann <arnd@arndb.de> wrote: > On Mon, Sep 25, 2017 at 7:41 AM, David Laight <David.Laight@aculab.com> wrote: >> From: Arnd Bergmann >>> Sent: 22 September 2017 22:29 >> ... >>> It seems that this is triggered in part by using strlcpy(), which the >>> compiler doesn't recognize as copying at most 'len' bytes, since strlcpy >>> is not part of the C standard. >> >> Neither is strncpy(). >> >> It'll almost certainly be a marker in a header file somewhere, >> so it should be possibly to teach it about other functions. > > I'm currently travelling and haven't investigated in detail, but from > taking a closer look here, I found that the hardened 'strlcpy()' > in include/linux/string.h triggers it. There is also a hardened > (much shorted) 'strncpy()' that doesn't trigger it in the same file, > and having only the extern declaration of strncpy also doesn't. And a little more experimenting leads to this simple patch that fixes the problem: --- a/include/linux/string.h +++ b/include/linux/string.h @@ -254,7 +254,7 @@ __FORTIFY_INLINE size_t strlcpy(char *p, const char *q, size_t size) size_t q_size = __builtin_object_size(q, 0); if (p_size == (size_t)-1 && q_size == (size_t)-1) return __real_strlcpy(p, q, size); - ret = strlen(q); + ret = __builtin_strlen(q); if (size) { size_t len = (ret >= size) ? size - 1 : ret; if (__builtin_constant_p(len) && len >= p_size) The problem is apparently that the fortified strlcpy calls the fortified strlen, which in turn calls strnlen and that ends up calling the extern '__real_strnlen' that gcc cannot reduce to a constant expression for a constant input. Not sure if that change is the best fix, but it seems to address the problem in this driver and probably leads to better code in other places as well. Arnd
On 09/26/2017 09:47 AM, Arnd Bergmann wrote: > On Mon, Sep 25, 2017 at 11:32 PM, Arnd Bergmann <arnd@arndb.de> wrote: >> On Mon, Sep 25, 2017 at 7:41 AM, David Laight <David.Laight@aculab.com> wrote: >>> From: Arnd Bergmann >>>> Sent: 22 September 2017 22:29 >>> ... >>>> It seems that this is triggered in part by using strlcpy(), which the >>>> compiler doesn't recognize as copying at most 'len' bytes, since strlcpy >>>> is not part of the C standard. >>> >>> Neither is strncpy(). >>> >>> It'll almost certainly be a marker in a header file somewhere, >>> so it should be possibly to teach it about other functions. >> >> I'm currently travelling and haven't investigated in detail, but from >> taking a closer look here, I found that the hardened 'strlcpy()' >> in include/linux/string.h triggers it. There is also a hardened >> (much shorted) 'strncpy()' that doesn't trigger it in the same file, >> and having only the extern declaration of strncpy also doesn't. > > And a little more experimenting leads to this simple patch that fixes > the problem: > > --- a/include/linux/string.h > +++ b/include/linux/string.h > @@ -254,7 +254,7 @@ __FORTIFY_INLINE size_t strlcpy(char *p, const > char *q, size_t size) > size_t q_size = __builtin_object_size(q, 0); > if (p_size == (size_t)-1 && q_size == (size_t)-1) > return __real_strlcpy(p, q, size); > - ret = strlen(q); > + ret = __builtin_strlen(q); I think this is not correct. Fortified strlen called here on purpose. If sizeof q is known at compile time and 'q' contains not-null fortified strlen() will panic. > if (size) { > size_t len = (ret >= size) ? size - 1 : ret; > if (__builtin_constant_p(len) && len >= p_size) > > The problem is apparently that the fortified strlcpy calls the fortified strlen, > which in turn calls strnlen and that ends up calling the extern '__real_strnlen' > that gcc cannot reduce to a constant expression for a constant input. Per my observation, it's the code like this: if () fortify_panic(__func__); somehow prevent gcc to merge several "struct i2c_board_info info;" into one stack slot. With the hack bellow, stack usage reduced to ~1,6K: --- include/linux/string.h | 4 ---- 1 file changed, 4 deletions(-) diff --git a/include/linux/string.h b/include/linux/string.h index 54d21783e18d..9a96ff3ebf94 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -261,8 +261,6 @@ __FORTIFY_INLINE __kernel_size_t strlen(const char *p) if (p_size == (size_t)-1) return __builtin_strlen(p); ret = strnlen(p, p_size); - if (p_size <= ret) - fortify_panic(__func__); return ret; } @@ -271,8 +269,6 @@ __FORTIFY_INLINE __kernel_size_t strnlen(const char *p, __kernel_size_t maxlen) { size_t p_size = __builtin_object_size(p, 0); __kernel_size_t ret = __real_strnlen(p, maxlen < p_size ? maxlen : p_size); - if (p_size <= ret && maxlen != ret) - fortify_panic(__func__); return ret; } > Not sure if that change is the best fix, but it seems to address the problem in > this driver and probably leads to better code in other places as well. > Probably it would be better to solve this on the strlcpy side, but I haven't found the way to do this right. Alternative solutions: - use memcpy() instead of strlcpy(). All source strings are smaller than I2C_NAME_SIZE, so we could do something like this - memcpy(info.type, "si2168", sizeof("si2168")); Also this should be faster. - Move code under different "case:" in the switch(dev->model) to the separate function should help as well. But it might be harder to backport into stables.
On Tue, Sep 26, 2017 at 9:49 AM, Andrey Ryabinin <aryabinin@virtuozzo.com> wrote: > > > On 09/26/2017 09:47 AM, Arnd Bergmann wrote: >> On Mon, Sep 25, 2017 at 11:32 PM, Arnd Bergmann <arnd@arndb.de> wrote: >> + ret = __builtin_strlen(q); > > > I think this is not correct. Fortified strlen called here on purpose. If sizeof q is known at compile time > and 'q' contains not-null fortified strlen() will panic. Ok, got it. >> if (size) { >> size_t len = (ret >= size) ? size - 1 : ret; >> if (__builtin_constant_p(len) && len >= p_size) >> >> The problem is apparently that the fortified strlcpy calls the fortified strlen, >> which in turn calls strnlen and that ends up calling the extern '__real_strnlen' >> that gcc cannot reduce to a constant expression for a constant input. > > > Per my observation, it's the code like this: > if () > fortify_panic(__func__); > > > somehow prevent gcc to merge several "struct i2c_board_info info;" into one stack slot. > With the hack bellow, stack usage reduced to ~1,6K: 1.6k is also what I see with my patch, or any other approach I tried that changes string.h. With the split up em28xx_dvb_init() function (and without changes to string.h), I got down to a few hundred bytes for the largest handler. > --- > include/linux/string.h | 4 ---- > 1 file changed, 4 deletions(-) > > diff --git a/include/linux/string.h b/include/linux/string.h > index 54d21783e18d..9a96ff3ebf94 100644 > --- a/include/linux/string.h > +++ b/include/linux/string.h > @@ -261,8 +261,6 @@ __FORTIFY_INLINE __kernel_size_t strlen(const char *p) > if (p_size == (size_t)-1) > return __builtin_strlen(p); > ret = strnlen(p, p_size); > - if (p_size <= ret) > - fortify_panic(__func__); > return ret; > } > > @@ -271,8 +269,6 @@ __FORTIFY_INLINE __kernel_size_t strnlen(const char *p, __kernel_size_t maxlen) > { > size_t p_size = __builtin_object_size(p, 0); > __kernel_size_t ret = __real_strnlen(p, maxlen < p_size ? maxlen : p_size); > - if (p_size <= ret && maxlen != ret) > - fortify_panic(__func__); > return ret; I've reduced it further to this change: --- a/include/linux/string.h +++ b/include/linux/string.h @@ -227,7 +227,7 @@ static inline const char *kbasename(const char *path) #define __FORTIFY_INLINE extern __always_inline __attribute__((gnu_inline)) #define __RENAME(x) __asm__(#x) -void fortify_panic(const char *name) __noreturn __cold; +void fortify_panic(const char *name) __cold; void __read_overflow(void) __compiletime_error("detected read beyond size of object passed as 1st parameter"); void __read_overflow2(void) __compiletime_error("detected read beyond size of object passed as 2nd parameter"); void __read_overflow3(void) __compiletime_error("detected read beyond size of object passed as 3rd parameter"); I don't immediately see why the __noreturn changes the behavior here, any idea? >> Not sure if that change is the best fix, but it seems to address the problem in >> this driver and probably leads to better code in other places as well. >> > > Probably it would be better to solve this on the strlcpy side, but I haven't found the way to do this right. > Alternative solutions: > > - use memcpy() instead of strlcpy(). All source strings are smaller than I2C_NAME_SIZE, so we could > do something like this - memcpy(info.type, "si2168", sizeof("si2168")); > Also this should be faster. This would be very similar to the patch I posted at the start of this thread to use strncpy(), right? I was hoping that changing strlcpy() here could also improve other users that might run into the same situation, but stay below the 2048-byte stack frame limit. > - Move code under different "case:" in the switch(dev->model) to the separate function should help as well. > But it might be harder to backport into stables. Agreed, I posted this in earlier versions of the patch series, see https://patchwork.kernel.org/patch/9601025/ The new patch was a result of me trying to come up with a less invasive version to make it easier to backport, since I would like to backport the last patch in the series that depends on all the earlier ones. Arnd
On Thu, Sep 28, 2017 at 4:30 PM, Arnd Bergmann <arnd@arndb.de> wrote: > On Thu, Sep 28, 2017 at 6:09 AM, Andrey Ryabinin > <aryabinin@virtuozzo.com> wrote: >> On 09/27/2017 04:26 PM, Arnd Bergmann wrote: >>> On Tue, Sep 26, 2017 at 9:49 AM, Andrey Ryabinin >>> <aryabinin@virtuozzo.com> wrote: > >>> --- a/include/linux/string.h >>> +++ b/include/linux/string.h >>> @@ -227,7 +227,7 @@ static inline const char *kbasename(const char *path) >>> #define __FORTIFY_INLINE extern __always_inline __attribute__((gnu_inline)) >>> #define __RENAME(x) __asm__(#x) >>> >>> -void fortify_panic(const char *name) __noreturn __cold; >>> +void fortify_panic(const char *name) __cold; >>> void __read_overflow(void) __compiletime_error("detected read beyond >>> size of object passed as 1st parameter"); >>> void __read_overflow2(void) __compiletime_error("detected read beyond >>> size of object passed as 2nd parameter"); >>> void __read_overflow3(void) __compiletime_error("detected read beyond >>> size of object passed as 3rd parameter"); >>> >>> I don't immediately see why the __noreturn changes the behavior here, any idea? >>> >> >> >> At first I thought that this somehow might be related to __asan_handle_no_return(). GCC calls it >> before noreturn function. So I made patch to remove generation of these calls (we don't need them in the kernel anyway) >> but it didn't help. It must be something else than. > > I made a reduced test case yesterday (see http://paste.ubuntu.com/25628030/), > and it shows the same behavior with and without the sanitizer, it uses 128 > bytes without the noreturn attribute and 480 bytes when its added, the sanitizer > adds a factor of 1.5x on top. It's possible that I did something wrong while > reducing, since the original driver file uses very little stack (a few hundred > bytes) without -fsanitize=kernel-address, but finding out what happens in > the reduced case may still help understand the other one. This is now GCC PR82365, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365 I've come up with a workaround, but I'm not sure if that is any better than the alternatives, will send the patch as a follow-up in a bit. Arnd
diff --git a/drivers/media/usb/em28xx/em28xx-dvb.c b/drivers/media/usb/em28xx/em28xx-dvb.c index 4a7db623fe29..06c363dc55ed 100644 --- a/drivers/media/usb/em28xx/em28xx-dvb.c +++ b/drivers/media/usb/em28xx/em28xx-dvb.c @@ -1440,7 +1440,7 @@ static int em28xx_dvb_init(struct em28xx *dev) tda10071_pdata.pll_multiplier = 20, tda10071_pdata.tuner_i2c_addr = 0x14, memset(&board_info, 0, sizeof(board_info)); - strlcpy(board_info.type, "tda10071_cx24118", I2C_NAME_SIZE); + strncpy(board_info.type, "tda10071_cx24118", I2C_NAME_SIZE - 1); board_info.addr = 0x55; board_info.platform_data = &tda10071_pdata; request_module("tda10071"); @@ -1460,7 +1460,7 @@ static int em28xx_dvb_init(struct em28xx *dev) /* attach SEC */ a8293_pdata.dvb_frontend = dvb->fe[0]; memset(&board_info, 0, sizeof(board_info)); - strlcpy(board_info.type, "a8293", I2C_NAME_SIZE); + strncpy(board_info.type, "a8293", I2C_NAME_SIZE - 1); board_info.addr = 0x08; board_info.platform_data = &a8293_pdata; request_module("a8293"); @@ -1643,7 +1643,7 @@ static int em28xx_dvb_init(struct em28xx *dev) m88ds3103_pdata.ts_clk_pol = 1; m88ds3103_pdata.agc = 0x99; memset(&board_info, 0, sizeof(board_info)); - strlcpy(board_info.type, "m88ds3103", I2C_NAME_SIZE); + strncpy(board_info.type, "m88ds3103", I2C_NAME_SIZE - 1); board_info.addr = 0x68; board_info.platform_data = &m88ds3103_pdata; request_module("m88ds3103"); @@ -1664,7 +1664,7 @@ static int em28xx_dvb_init(struct em28xx *dev) /* attach tuner */ ts2020_config.fe = dvb->fe[0]; memset(&board_info, 0, sizeof(board_info)); - strlcpy(board_info.type, "ts2022", I2C_NAME_SIZE); + strncpy(board_info.type, "ts2022", I2C_NAME_SIZE - 1); board_info.addr = 0x60; board_info.platform_data = &ts2020_config; request_module("ts2020"); @@ -1690,7 +1690,7 @@ static int em28xx_dvb_init(struct em28xx *dev) /* attach SEC */ a8293_pdata.dvb_frontend = dvb->fe[0]; memset(&board_info, 0, sizeof(board_info)); - strlcpy(board_info.type, "a8293", I2C_NAME_SIZE); + strncpy(board_info.type, "a8293", I2C_NAME_SIZE - 1); board_info.addr = 0x08; board_info.platform_data = &a8293_pdata; request_module("a8293"); @@ -1729,7 +1729,7 @@ static int em28xx_dvb_init(struct em28xx *dev) si2168_config.fe = &dvb->fe[0]; si2168_config.ts_mode = SI2168_TS_PARALLEL; memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "si2168", I2C_NAME_SIZE); + strncpy(info.type, "si2168", I2C_NAME_SIZE - 1); info.addr = 0x64; info.platform_data = &si2168_config; request_module(info.type); @@ -1755,7 +1755,7 @@ static int em28xx_dvb_init(struct em28xx *dev) si2157_config.mdev = dev->media_dev; #endif memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "si2157", I2C_NAME_SIZE); + strncpy(info.type, "si2157", I2C_NAME_SIZE - 1); info.addr = 0x60; info.platform_data = &si2157_config; request_module(info.type); @@ -1793,7 +1793,7 @@ static int em28xx_dvb_init(struct em28xx *dev) si2168_config.fe = &dvb->fe[0]; si2168_config.ts_mode = SI2168_TS_PARALLEL; memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "si2168", I2C_NAME_SIZE); + strncpy(info.type, "si2168", I2C_NAME_SIZE - 1); info.addr = 0x64; info.platform_data = &si2168_config; request_module(info.type); @@ -1819,7 +1819,7 @@ static int em28xx_dvb_init(struct em28xx *dev) si2157_config.mdev = dev->media_dev; #endif memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "si2146", I2C_NAME_SIZE); + strncpy(info.type, "si2146", I2C_NAME_SIZE - 1); info.addr = 0x60; info.platform_data = &si2157_config; request_module("si2157"); @@ -1853,7 +1853,7 @@ static int em28xx_dvb_init(struct em28xx *dev) /* attach demod */ memset(&tc90522_config, 0, sizeof(tc90522_config)); memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "tc90522sat", I2C_NAME_SIZE); + strncpy(info.type, "tc90522sat", I2C_NAME_SIZE - 1); info.addr = 0x15; info.platform_data = &tc90522_config; request_module("tc90522"); @@ -1875,7 +1875,7 @@ static int em28xx_dvb_init(struct em28xx *dev) qm1d1c0042_config.fe = tc90522_config.fe; qm1d1c0042_config.lpf = 1; memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "qm1d1c0042", I2C_NAME_SIZE); + strncpy(info.type, "qm1d1c0042", I2C_NAME_SIZE - 1); info.addr = 0x61; info.platform_data = &qm1d1c0042_config; request_module(info.type); @@ -1913,7 +1913,7 @@ static int em28xx_dvb_init(struct em28xx *dev) si2168_config.fe = &dvb->fe[0]; si2168_config.ts_mode = SI2168_TS_SERIAL; memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "si2168", I2C_NAME_SIZE); + strncpy(info.type, "si2168", I2C_NAME_SIZE - 1); info.addr = 0x64; info.platform_data = &si2168_config; request_module(info.type); @@ -1939,7 +1939,7 @@ static int em28xx_dvb_init(struct em28xx *dev) si2157_config.mdev = dev->media_dev; #endif memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "si2157", I2C_NAME_SIZE); + strncpy(info.type, "si2157", I2C_NAME_SIZE - 1); info.addr = 0x60; info.platform_data = &si2157_config; request_module(info.type); @@ -1975,7 +1975,7 @@ static int em28xx_dvb_init(struct em28xx *dev) lgdt3306a_config = hauppauge_01595_lgdt3306a_config; lgdt3306a_config.fe = &dvb->fe[0]; lgdt3306a_config.i2c_adapter = &adapter; - strlcpy(info.type, "lgdt3306a", sizeof(info.type)); + strncpy(info.type, "lgdt3306a", sizeof(info.type) - 1); info.addr = 0x59; info.platform_data = &lgdt3306a_config; request_module(info.type); @@ -2002,7 +2002,7 @@ static int em28xx_dvb_init(struct em28xx *dev) si2157_config.mdev = dev->media_dev; #endif memset(&info, 0, sizeof(struct i2c_board_info)); - strlcpy(info.type, "si2157", sizeof(info.type)); + strncpy(info.type, "si2157", sizeof(info.type) - 1); info.addr = 0x60; info.platform_data = &si2157_config; request_module(info.type);
With CONFIG_KASAN, the init function uses a large amount of kernel stack: drivers/media/usb/em28xx/em28xx-dvb.c: In function 'em28xx_dvb_init.part.4': drivers/media/usb/em28xx/em28xx-dvb.c:2061:1: error: the frame size of 3232 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] It seems that this is triggered in part by using strlcpy(), which the compiler doesn't recognize as copying at most 'len' bytes, since strlcpy is not part of the C standard. It does however recognize the standard strncpy() and optimizes away the extra checks for that, using only 1688 bytes in the end. I have another larger patch that we could use in addition to this one, in order to shrink the stack for -fsanitize-address-use-after-scope (with gcc-7.1.1) as well, but that would not be appropriate for stable backports, so let's focus on this one first. Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- drivers/media/usb/em28xx/em28xx-dvb.c | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) -- 2.9.0