Message ID | 20220902203940.2385967-16-adhemerval.zanella@linaro.org |
---|---|
State | Superseded |
Series | [01/17] Parameterize op_t from memcopy.h |
On 02/09/2022 21:39, Adhemerval Zanella via Libc-alpha wrote:
> From: Richard Henderson <rth@twiddle.net>
>
> While arm has the more important string functions in assembly,
> there are still a few generic routines used.
>
> Use the UQSUB8 insn for testing of zeros.

UQSUB8 requires ARMv6 or above. While that's pretty likely these days,
you might want to consider a fall-back for Armv5 or earlier if you still
want to support those.

R.

>
> Checked on armv7-linux-gnueabihf
> ---
>  sysdeps/arm/armv6t2/string-fza.h | 70 ++++++++++++++++++++++++++++++++
>  1 file changed, 70 insertions(+)
>  create mode 100644 sysdeps/arm/armv6t2/string-fza.h
On 05/09/2022 16:40, Richard Earnshaw via Libc-alpha wrote:
>
>
> On 02/09/2022 21:39, Adhemerval Zanella via Libc-alpha wrote:
>> From: Richard Henderson <rth@twiddle.net>
>>
>> While arm has the more important string functions in assembly,
>> there are still a few generic routines used.
>>
>> Use the UQSUB8 insn for testing of zeros.
>
> UQSUB8 requires ARMv6 or above. While that's pretty likely these days,
> you might want to consider a fall-back for Armv5 or earlier if you still
> want to support those.
>

Hmm, nevermind, I've just noticed this is in the armv6t2 directory, so
ARMv6 is a given. Sorry for the noise.

R.

> R.
>
>>
>> Checked on armv7-linux-gnueabihf
>> ---
>>  sysdeps/arm/armv6t2/string-fza.h | 70 ++++++++++++++++++++++++++++++++
>>  1 file changed, 70 insertions(+)
>>  create mode 100644 sysdeps/arm/armv6t2/string-fza.h
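For readers wondering what a fall-back of the kind raised in this exchange could look like, the following is a minimal sketch of a well-known portable bit trick that needs no UQSUB8. It is not part of this patch. The name `find_zero_all_portable` is hypothetical, and `op_t` is assumed here to be a 32-bit word.

```c
#include <stdint.h>

/* Assumption: op_t is a 32-bit word, as on 32-bit ARM.  */
typedef uint32_t op_t;

/* Hypothetical portable variant: returns 0x80 in every byte of X that
   is zero and 0x00 in every other byte, using only integer ops.  */
static inline op_t
find_zero_all_portable (op_t x)
{
  op_t m = 0x7f7f7f7fu;
  /* (x & m) + m sets bit 7 of a byte iff its low seven bits are
     nonzero, and never carries into the next byte.  OR-ing in X also
     covers bytes whose only set bit is bit 7.  After OR-ing in M, a
     byte is 0xff if it was nonzero and 0x7f if it was zero, so the
     complement leaves 0x80 exactly in the zero bytes.  */
  return ~(((x & m) + m) | x | m);
}
```

Note that this sketch marks a zero byte with 0x80 rather than the 0x01 that the UQSUB8 version produces. Both satisfy the "at least one bit set within every byte of X that is zero" contract, but code that XORs the result with a `ones` mask, as `find_zero_ne_all` does in the patch, would need a matching mask.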
diff --git a/sysdeps/arm/armv6t2/string-fza.h b/sysdeps/arm/armv6t2/string-fza.h
new file mode 100644
index 0000000000..4fe2e8383f
--- /dev/null
+++ b/sysdeps/arm/armv6t2/string-fza.h
@@ -0,0 +1,70 @@
+/* Zero byte detection; basics. ARM version.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+
+#ifndef _STRING_FZA_H
+#define _STRING_FZA_H 1
+
+#include <string-optype.h>
+#include <string-maskoff.h>
+
+/* This function returns at least one bit set within every byte
+   of X that is zero. */
+
+static inline op_t
+find_zero_all (op_t x)
+{
+  /* Use unsigned saturated subtraction from 1 in each byte.
+     That leaves 1 for every byte that was zero. */
+  op_t ret, ones = repeat_bytes (0x01);
+  asm ("uqsub8 %0,%1,%2" : "=r"(ret) : "r"(ones), "r"(x));
+  return ret;
+}
+
+/* Identify bytes that are equal between X1 and X2. */
+
+static inline op_t
+find_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1 ^ x2);
+}
+
+/* Identify zero bytes in X1 or equality between X1 and X2. */
+
+static inline op_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
+}
+
+/* Identify zero bytes in X1 or inequality between X1 and X2. */
+
+static inline op_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  /* Make use of the fact that we'll already have ONES in a register. */
+  op_t ones = repeat_bytes (0x01);
+  return find_zero_all (x1) | (find_zero_all (x1 ^ x2) ^ ones);
+}
+
+/* Define the "inexact" versions in terms of the exact versions. */
+#define find_zero_low find_zero_all
+#define find_eq_low find_eq_all
+#define find_zero_eq_low find_zero_eq_all
+#define find_zero_ne_low find_zero_ne_all
+
+#endif /* _STRING_FZA_H */
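To make the per-byte arithmetic in the diff easier to follow off-target, here is a rough software model of what the UQSUB8-based routines compute. It is only an illustrative sketch: the `*_model` names are hypothetical, `op_t` is assumed to be 32 bits wide, and `repeat_bytes_model` is a stand-in for glibc's `repeat_bytes`, not its actual definition.

```c
#include <stdint.h>

/* Assumption: op_t is a 32-bit word, as on 32-bit ARM.  */
typedef uint32_t op_t;

/* Hypothetical stand-in for repeat_bytes: copy C into every byte.  */
static inline op_t
repeat_bytes_model (unsigned char c)
{
  return (op_t) c * 0x01010101u;
}

/* Software model of UQSUB8: per-byte unsigned A - B, clamped at 0.  */
static inline op_t
uqsub8_model (op_t a, op_t b)
{
  op_t ret = 0;
  for (int i = 0; i < 4; i++)
    {
      unsigned int ab = (a >> (8 * i)) & 0xff;
      unsigned int bb = (b >> (8 * i)) & 0xff;
      unsigned int d = ab > bb ? ab - bb : 0;
      ret |= (op_t) d << (8 * i);
    }
  return ret;
}

/* 1 - x per byte, saturating: 0x01 where the byte of X is zero,
   0x00 everywhere else.  */
static inline op_t
find_zero_all_model (op_t x)
{
  return uqsub8_model (repeat_bytes_model (0x01), x);
}

/* 0x01 in each byte where X1 is zero or X1 and X2 differ.  The XOR
   with ONES flips the "bytes are equal" markers into "bytes differ".  */
static inline op_t
find_zero_ne_all_model (op_t x1, op_t x2)
{
  op_t ones = repeat_bytes_model (0x01);
  return find_zero_all_model (x1) | (find_zero_all_model (x1 ^ x2) ^ ones);
}
```

Because the result for each byte is exactly 0x00 or 0x01, no carry or borrow can leak between bytes, which is consistent with the patch defining the "inexact" `_low` variants as plain aliases of the exact `_all` versions.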
From: Richard Henderson <rth@twiddle.net>

While arm has the more important string functions in assembly,
there are still a few generic routines used.

Use the UQSUB8 insn for testing of zeros.

Checked on armv7-linux-gnueabihf
---
 sysdeps/arm/armv6t2/string-fza.h | 70 ++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 sysdeps/arm/armv6t2/string-fza.h