Message ID | 20200607053645.1408462-1-seanga2@gmail.com |
---|---|
State | Accepted |
Commit | 1fae74125b644bb7f21e4afef3b9084c58e7863a |
Headers | show |
Series | Revert "lib: Improve _parse_integer_fixup_radix base 16 detection" | expand |
On 6/7/20 7:36 AM, Sean Anderson wrote: > This reverts commit 0486497e2b5f4d36fa968a1a60fea358cbf70b65. > > The strtoul has well-defined semantics. It is defined by the C standard and > POSIX. To quote the relevant section of the man pages, > >> If base is zero or 16, the string may then include a "0x" prefix, and the >> number will be read in base 16; otherwise, a zero base is taken as 10 >> (decimal) unless the next character is '0', in which case it is taken as >> 8 (octal). > > Keeping these semantics is important for several reasons. First, it is very > surprising for standard library functions to behave differently than usual. > Every other implementation of strtoul has different semantics than the > implementation in U-Boot at the moment. Second, it can result in very > surprising results from small changes. For example, changing the string > "1f" to "20" causes the parsed value to *decrease*. Forcing use of the "0x" > prefix to specify hexidecimal numbers is a feature, not a bug. Lastly, this > is slightly less performant, since the entire number is parsed twice. > > This fixes the str_simple_strtoul test failing with > > test/str_ut.c:29, run_strtoul(): expect_val == val: Expected 0x44b (1099), got 0x1099ab (1087915) > test/str_ut.c:46, str_simple_strtoul(): 0 == run_strtoul(uts, str2, 0, 1099, 4): Expected 0x0 (0), got 0x1 (1) > I suggest to add a Fixes line here: Fixes: 0486497e2b5f ("lib: Improve _parse_integer_fixup_radix base 16 detection") Cf. https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes > Signed-off-by: Sean Anderson <seanga2 at gmail.com> > CC: Michal Simek <michal.simek at xilinx.com> > CC: Shiril Tichkule <shirilt at xilinx.com> > --- > > lib/strto.c | 18 +----------------- > 1 file changed, 1 insertion(+), 17 deletions(-) > > diff --git a/lib/strto.c b/lib/strto.c > index 3d77115d4d..c00bb5895d 100644 > --- a/lib/strto.c > +++ b/lib/strto.c > @@ -22,25 +22,9 @@ static const char *_parse_integer_fixup_radix(const char *s, unsigned int *base) > *base = 16; > else > *base = 8; > - } else { > - int i = 0; > - char var; > - > + } else scripts/checkpatch.pl reports a problem: CHECK: Unbalanced braces around else statement #50: FILE: lib/strto.c:25: + } else Best regards Heinrich > *base = 10; > - > - do { > - var = tolower(s[i++]); > - if (var >= 'a' && var <= 'f') { > - *base = 16; > - break; > - } > - > - if (!(var >= '0' && var <= '9')) > - break; > - } while (var); > - } > } > - > if (*base == 16 && s[0] == '0' && tolower(s[1]) == 'x') > s += 2; > return s; >
On 6/7/20 2:40 AM, Heinrich Schuchardt wrote: > On 6/7/20 7:36 AM, Sean Anderson wrote: >> This reverts commit 0486497e2b5f4d36fa968a1a60fea358cbf70b65. >> >> The strtoul has well-defined semantics. It is defined by the C standard and >> POSIX. To quote the relevant section of the man pages, >> >>> If base is zero or 16, the string may then include a "0x" prefix, and the >>> number will be read in base 16; otherwise, a zero base is taken as 10 >>> (decimal) unless the next character is '0', in which case it is taken as >>> 8 (octal). >> >> Keeping these semantics is important for several reasons. First, it is very >> surprising for standard library functions to behave differently than usual. >> Every other implementation of strtoul has different semantics than the >> implementation in U-Boot at the moment. Second, it can result in very >> surprising results from small changes. For example, changing the string >> "1f" to "20" causes the parsed value to *decrease*. Forcing use of the "0x" >> prefix to specify hexidecimal numbers is a feature, not a bug. Lastly, this >> is slightly less performant, since the entire number is parsed twice. >> >> This fixes the str_simple_strtoul test failing with >> >> test/str_ut.c:29, run_strtoul(): expect_val == val: Expected 0x44b (1099), got 0x1099ab (1087915) >> test/str_ut.c:46, str_simple_strtoul(): 0 == run_strtoul(uts, str2, 0, 1099, 4): Expected 0x0 (0), got 0x1 (1) >> > > I suggest to add a Fixes line here: > > Fixes: 0486497e2b5f ("lib: Improve _parse_integer_fixup_radix base 16 > detection") Thanks, I will add that. > > Cf. > https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes > >> Signed-off-by: Sean Anderson <seanga2 at gmail.com> >> CC: Michal Simek <michal.simek at xilinx.com> >> CC: Shiril Tichkule <shirilt at xilinx.com> >> --- >> >> lib/strto.c | 18 +----------------- >> 1 file changed, 1 insertion(+), 17 deletions(-) >> >> diff --git a/lib/strto.c b/lib/strto.c >> index 3d77115d4d..c00bb5895d 100644 >> --- a/lib/strto.c >> +++ b/lib/strto.c >> @@ -22,25 +22,9 @@ static const char *_parse_integer_fixup_radix(const char *s, unsigned int *base) >> *base = 16; >> else >> *base = 8; >> - } else { >> - int i = 0; >> - char var; >> - >> + } else > > scripts/checkpatch.pl reports a problem: > > CHECK: Unbalanced braces around else statement > #50: FILE: lib/strto.c:25: > + } else I chose this form since that is what is used in Linux, and that is how it formatted before the commit this reverts. I do think that it is better form to have braces around this else block, though. --Sean
On 07. 06. 20 7:36, Sean Anderson wrote: > This reverts commit 0486497e2b5f4d36fa968a1a60fea358cbf70b65. > > The strtoul has well-defined semantics. It is defined by the C standard and > POSIX. To quote the relevant section of the man pages, > >> If base is zero or 16, the string may then include a "0x" prefix, and the >> number will be read in base 16; otherwise, a zero base is taken as 10 >> (decimal) unless the next character is '0', in which case it is taken as >> 8 (octal). > > Keeping these semantics is important for several reasons. First, it is very > surprising for standard library functions to behave differently than usual. > Every other implementation of strtoul has different semantics than the > implementation in U-Boot at the moment. Second, it can result in very > surprising results from small changes. For example, changing the string > "1f" to "20" causes the parsed value to *decrease*. Forcing use of the "0x" > prefix to specify hexidecimal numbers is a feature, not a bug. Lastly, this > is slightly less performant, since the entire number is parsed twice. > > This fixes the str_simple_strtoul test failing with > > test/str_ut.c:29, run_strtoul(): expect_val == val: Expected 0x44b (1099), got 0x1099ab (1087915) > test/str_ut.c:46, str_simple_strtoul(): 0 == run_strtoul(uts, str2, 0, 1099, 4): Expected 0x0 (0), got 0x1 (1) > > Signed-off-by: Sean Anderson <seanga2 at gmail.com> > CC: Michal Simek <michal.simek at xilinx.com> > CC: Shiril Tichkule <shirilt at xilinx.com> > --- > > lib/strto.c | 18 +----------------- > 1 file changed, 1 insertion(+), 17 deletions(-) > > diff --git a/lib/strto.c b/lib/strto.c > index 3d77115d4d..c00bb5895d 100644 > --- a/lib/strto.c > +++ b/lib/strto.c > @@ -22,25 +22,9 @@ static const char *_parse_integer_fixup_radix(const char *s, unsigned int *base) > *base = 16; > else > *base = 8; > - } else { > - int i = 0; > - char var; > - > + } else > *base = 10; > - > - do { > - var = tolower(s[i++]); > - if (var >= 'a' && var <= 'f') { > - *base = 16; > - break; > - } > - > - if (!(var >= '0' && var <= '9')) > - break; > - } while (var); > - } > } > - > if (*base == 16 && s[0] == '0' && tolower(s[1]) == 'x') > s += 2; > return s; > It is in u-boot mainline from February. Then we had to fix it in April. In the middle of this I have seen IIC one patchset which improves hex handling which is likely better way then this. I am fine with reverting this patch but I would also like to see more comments in the code to say that we shouldn't really change this. Thanks, Michal
On 6/8/20 2:24 AM, Michal Simek wrote: > It is in u-boot mainline from February. Then we had to fix it in April. Do you have a link/commit hash for said fix? > In the middle of this I have seen IIC one patchset which improves hex > handling which is likely better way then this. Same for this. > I am fine with reverting this patch but I would also like to see more > comments in the code to say that we shouldn't really change this. This seems like a larger issue. There is a lot of code in U-Boot implementing the C standard library, especially the string library. Should we have a comment in each file which says something like "The functions here implement the C standard, think carefully before making semantic changes"? --Sean
Hi, On Mon, 8 Jun 2020 at 00:24, Michal Simek <michal.simek at xilinx.com> wrote: > > On 07. 06. 20 7:36, Sean Anderson wrote: > > This reverts commit 0486497e2b5f4d36fa968a1a60fea358cbf70b65. > > > > The strtoul has well-defined semantics. It is defined by the C standard and > > POSIX. To quote the relevant section of the man pages, > > > >> If base is zero or 16, the string may then include a "0x" prefix, and the > >> number will be read in base 16; otherwise, a zero base is taken as 10 > >> (decimal) unless the next character is '0', in which case it is taken as > >> 8 (octal). > > > > Keeping these semantics is important for several reasons. First, it is very > > surprising for standard library functions to behave differently than usual. > > Every other implementation of strtoul has different semantics than the > > implementation in U-Boot at the moment. Second, it can result in very > > surprising results from small changes. For example, changing the string > > "1f" to "20" causes the parsed value to *decrease*. Forcing use of the "0x" > > prefix to specify hexidecimal numbers is a feature, not a bug. Lastly, this > > is slightly less performant, since the entire number is parsed twice. > > > > This fixes the str_simple_strtoul test failing with > > > > test/str_ut.c:29, run_strtoul(): expect_val == val: Expected 0x44b (1099), got 0x1099ab (1087915) > > test/str_ut.c:46, str_simple_strtoul(): 0 == run_strtoul(uts, str2, 0, 1099, 4): Expected 0x0 (0), got 0x1 (1) > > > > Signed-off-by: Sean Anderson <seanga2 at gmail.com> > > CC: Michal Simek <michal.simek at xilinx.com> > > CC: Shiril Tichkule <shirilt at xilinx.com> > > --- > > > > lib/strto.c | 18 +----------------- > > 1 file changed, 1 insertion(+), 17 deletions(-) > > > > diff --git a/lib/strto.c b/lib/strto.c > > index 3d77115d4d..c00bb5895d 100644 > > --- a/lib/strto.c > > +++ b/lib/strto.c > > @@ -22,25 +22,9 @@ static const char *_parse_integer_fixup_radix(const char *s, unsigned int *base) > > *base = 16; > > else > > *base = 8; > > - } else { > > - int i = 0; > > - char var; > > - > > + } else > > *base = 10; > > - > > - do { > > - var = tolower(s[i++]); > > - if (var >= 'a' && var <= 'f') { > > - *base = 16; > > - break; > > - } > > - > > - if (!(var >= '0' && var <= '9')) > > - break; > > - } while (var); > > - } > > } > > - > > if (*base == 16 && s[0] == '0' && tolower(s[1]) == 'x') > > s += 2; > > return s; > > > > > It is in u-boot mainline from February. Then we had to fix it in April. > In the middle of this I have seen IIC one patchset which improves hex > handling which is likely better way then this. > I am fine with reverting this patch but I would also like to see more > comments in the code to say that we shouldn't really change this. Once this revert is in I think we should refrain from changing it until there are tests to cover its current behaviour, e.g. in str_ut.c Reviewed-by: Simon Glass <sjg at chromium.org> Regards, Simon
On Mon, Jun 08, 2020 at 01:05:21PM -0400, Sean Anderson wrote: > On 6/8/20 2:24 AM, Michal Simek wrote: > > It is in u-boot mainline from February. Then we had to fix it in April. > > Do you have a link/commit hash for said fix? > > > In the middle of this I have seen IIC one patchset which improves hex > > handling which is likely better way then this. > > Same for this. > > > I am fine with reverting this patch but I would also like to see more > > comments in the code to say that we shouldn't really change this. > > This seems like a larger issue. There is a lot of code in U-Boot > implementing the C standard library, especially the string library. > Should we have a comment in each file which says something like "The > functions here implement the C standard, think carefully before making > semantic changes"? Honestly? Yes, it would be good to note which functions do their best to implement / conform to C standard and which _may_ deviate from it. For example, we have an alternative printf that doesn't support all format characters. Making that clearer (and then what ones our normal printf doesn't handle) would be good.
On Sun, Jun 07, 2020 at 01:36:45AM -0400, Sean Anderson wrote: > This reverts commit 0486497e2b5f4d36fa968a1a60fea358cbf70b65. > > The strtoul has well-defined semantics. It is defined by the C standard and > POSIX. To quote the relevant section of the man pages, > > > If base is zero or 16, the string may then include a "0x" prefix, and the > > number will be read in base 16; otherwise, a zero base is taken as 10 > > (decimal) unless the next character is '0', in which case it is taken as > > 8 (octal). > > Keeping these semantics is important for several reasons. First, it is very > surprising for standard library functions to behave differently than usual. > Every other implementation of strtoul has different semantics than the > implementation in U-Boot at the moment. Second, it can result in very > surprising results from small changes. For example, changing the string > "1f" to "20" causes the parsed value to *decrease*. Forcing use of the "0x" > prefix to specify hexidecimal numbers is a feature, not a bug. Lastly, this > is slightly less performant, since the entire number is parsed twice. > > This fixes the str_simple_strtoul test failing with > > test/str_ut.c:29, run_strtoul(): expect_val == val: Expected 0x44b (1099), got 0x1099ab (1087915) > test/str_ut.c:46, str_simple_strtoul(): 0 == run_strtoul(uts, str2, 0, 1099, 4): Expected 0x0 (0), got 0x1 (1) > > Signed-off-by: Sean Anderson <seanga2 at gmail.com> > CC: Michal Simek <michal.simek at xilinx.com> > CC: Shiril Tichkule <shirilt at xilinx.com> > Reviewed-by: Simon Glass <sjg at chromium.org> Applied to u-boot/master, thanks!
diff --git a/lib/strto.c b/lib/strto.c index 3d77115d4d..c00bb5895d 100644 --- a/lib/strto.c +++ b/lib/strto.c @@ -22,25 +22,9 @@ static const char *_parse_integer_fixup_radix(const char *s, unsigned int *base) *base = 16; else *base = 8; - } else { - int i = 0; - char var; - + } else *base = 10; - - do { - var = tolower(s[i++]); - if (var >= 'a' && var <= 'f') { - *base = 16; - break; - } - - if (!(var >= '0' && var <= '9')) - break; - } while (var); - } } - if (*base == 16 && s[0] == '0' && tolower(s[1]) == 'x') s += 2; return s;