Message ID | 1496956411-25594-8-git-send-email-adhemerval.zanella@linaro.org |
---|---|
State | New |
Headers | show |
Series | posix: glob fixes and refactor | expand |
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: > The provided function are not extensive and meant mainly to be use in > subsequent glob implementation cleanup. For internal object > consistency only the function provided by char_array.c should be used, > including internal object manipulation. It's surprising that glob needs these many functions. I'm not sure if the automatic NUL termination makes sense if we ever want to generalize this interface. How ingrained is that to the glob use case? > +++ b/malloc/char_array.c Should this be malloc/char_array-skeleton.c, to indicate that this file is parameterized and intended for inclusion into another .c file? > +#define DYNARRAY_ELEMENT_INIT(__e) (*__e = '\0') Why is this needed? I don't think the code cares for the contents of freshly added bytes. For the trailing NUL byte, it seems better to add that separately from the initialization operations. > +static bool __attribute_used__ > +char_array_replace_str_pos (struct char_array *array, size_t pos, > + const char *str, size_t len) > +{ > + if (pos > array->dynarray_header.used) > + return false; > + > + size_t newsize; > + if (check_add_wrapv_size_t (pos, len, &newsize) > + || check_add_wrapv_size_t (newsize, 1, &newsize) > + || !char_array_resize (array, newsize)) > + return false; I don't think it is a good idea to mix the reporting of usage errors (index out of bounds) with environmental conditions (memory allocation failure). Have you considered calling __libc_fatal if pos is out of range, similar to the existing *_at function? > +++ b/malloc/malloc-internal.h > +/* Set *R = A + B. Return true if the answer is mathematically incorrect due > + to overflow; in this case, *R is the low order bits of the correct > + answer. */ > +static inline bool > +check_add_wrapv_size_t (size_t a, size_t b, size_t *r) Why not use check_add_overflow_size_t, for consistency with check_mul_overflow_size_t?
On 13/06/2017 06:46, Florian Weimer wrote: > Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: > >> The provided function are not extensive and meant mainly to be use in >> subsequent glob implementation cleanup. For internal object >> consistency only the function provided by char_array.c should be used, >> including internal object manipulation. > > It's surprising that glob needs these many functions. > > I'm not sure if the automatic NUL termination makes sense if we ever > want to generalize this interface. How ingrained is that to the glob > use case? glob internally prepends the search directory by replacing the tilde with a external home directory (GLOB_TILDE, either by environment variable or through getXXX functions) or append new data by calling new patterns recursively (GLOB_BRACE). It is not an specific usage to glob, but rather a generic one which with more higher level languages all the memory buffer bikeshedding will be avoided by a string object. The idea is to keep a consistent C string buffer stack-allocated (for general use) or heap based (to not bound due the internal expansion) which can be used as constant argument for other function call (fnmatch or recursively in case of GLOB_BRACE). > >> +++ b/malloc/char_array.c > > Should this be malloc/char_array-skeleton.c, to indicate that this file > is parameterized and intended for inclusion into another .c file? I am not sure, the idea of char_array is to be self-contained, there is no need to define anything in the file to include it. Also, if we see other usage of dynamic C string inside glibc it would be a good idea to add internal symbols for common symbols. > >> +#define DYNARRAY_ELEMENT_INIT(__e) (*__e = '\0') > > Why is this needed? I don't think the code cares for the contents of > freshly added bytes. For the trailing NUL byte, it seems better to add > that separately from the initialization operations. It does not, it is an artefact I added and it should be removed. > >> +static bool __attribute_used__ >> +char_array_replace_str_pos (struct char_array *array, size_t pos, >> + const char *str, size_t len) >> +{ >> + if (pos > array->dynarray_header.used) >> + return false; >> + >> + size_t newsize; >> + if (check_add_wrapv_size_t (pos, len, &newsize) >> + || check_add_wrapv_size_t (newsize, 1, &newsize) >> + || !char_array_resize (array, newsize)) >> + return false; > > I don't think it is a good idea to mix the reporting of usage errors > (index out of bounds) with environmental conditions (memory allocation > failure). Have you considered calling __libc_fatal if pos is out of > range, similar to the existing *_at function? Indeed, I will change to call __libc_fatal in such cases. > >> +++ b/malloc/malloc-internal.h > >> +/* Set *R = A + B. Return true if the answer is mathematically incorrect due >> + to overflow; in this case, *R is the low order bits of the correct >> + answer. */ >> +static inline bool >> +check_add_wrapv_size_t (size_t a, size_t b, size_t *r) > > Why not use check_add_overflow_size_t, for consistency with > check_mul_overflow_size_t? > I will change it.
On 13/06/2017 09:17, Adhemerval Zanella wrote: > > > On 13/06/2017 06:46, Florian Weimer wrote: >> Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: >> >>> The provided function are not extensive and meant mainly to be use in >>> subsequent glob implementation cleanup. For internal object >>> consistency only the function provided by char_array.c should be used, >>> including internal object manipulation. >> >> It's surprising that glob needs these many functions. >> >> I'm not sure if the automatic NUL termination makes sense if we ever >> want to generalize this interface. How ingrained is that to the glob >> use case? > > glob internally prepends the search directory by replacing the tilde with > a external home directory (GLOB_TILDE, either by environment variable or > through getXXX functions) or append new data by calling new patterns > recursively (GLOB_BRACE). > > It is not an specific usage to glob, but rather a generic one which with > more higher level languages all the memory buffer bikeshedding will be > avoided by a string object. The idea is to keep a consistent C string buffer > stack-allocated (for general use) or heap based (to not bound due the > internal expansion) which can be used as constant argument for other > function call (fnmatch or recursively in case of GLOB_BRACE). > >> >>> +++ b/malloc/char_array.c >> >> Should this be malloc/char_array-skeleton.c, to indicate that this file >> is parameterized and intended for inclusion into another .c file? > > I am not sure, the idea of char_array is to be self-contained, there is > no need to define anything in the file to include it. Also, if we see > other usage of dynamic C string inside glibc it would be a good idea to > add internal symbols for common symbols. > >> >>> +#define DYNARRAY_ELEMENT_INIT(__e) (*__e = '\0') >> >> Why is this needed? I don't think the code cares for the contents of >> freshly added bytes. For the trailing NUL byte, it seems better to add >> that separately from the initialization operations. > > It does not, it is an artefact I added and it should be removed. > >> >>> +static bool __attribute_used__ >>> +char_array_replace_str_pos (struct char_array *array, size_t pos, >>> + const char *str, size_t len) >>> +{ >>> + if (pos > array->dynarray_header.used) >>> + return false; >>> + >>> + size_t newsize; >>> + if (check_add_wrapv_size_t (pos, len, &newsize) >>> + || check_add_wrapv_size_t (newsize, 1, &newsize) >>> + || !char_array_resize (array, newsize)) >>> + return false; >> >> I don't think it is a good idea to mix the reporting of usage errors >> (index out of bounds) with environmental conditions (memory allocation >> failure). Have you considered calling __libc_fatal if pos is out of >> range, similar to the existing *_at function? > > Indeed, I will change to call __libc_fatal in such cases. > >> >>> +++ b/malloc/malloc-internal.h >> >>> +/* Set *R = A + B. Return true if the answer is mathematically incorrect due >>> + to overflow; in this case, *R is the low order bits of the correct >>> + answer. */ >>> +static inline bool >>> +check_add_wrapv_size_t (size_t a, size_t b, size_t *r) >> >> Why not use check_add_overflow_size_t, for consistency with >> check_mul_overflow_size_t? >> > > I will change it. > Here is an updated patch for the specialized dynarray. I have incorporated all your suggestion but the skeleton name change (which I am not sure if it should follow the idea since it should be a complete 'api'). I also added some nonull attribute as for dynarray implementation. ----diff --git a/malloc/Makefile b/malloc/Makefile index 14c13f1..323b3fb 100644 --- a/malloc/Makefile +++ b/malloc/Makefile @@ -46,6 +46,7 @@ tests-internal += \ tst-dynarray \ tst-dynarray-fail \ tst-dynarray-at-fail \ + tst-char_array ifneq (no,$(have-tunables)) tests += tst-malloc-usable-tunables @@ -58,7 +59,7 @@ test-srcs = tst-mtrace routines = malloc morecore mcheck mtrace obstack reallocarray \ scratch_buffer_grow scratch_buffer_grow_preserve \ scratch_buffer_set_array_size \ - dynarray_at_failure \ + dynarray_at_failure dynarray_overflow_failure \ dynarray_emplace_enlarge \ dynarray_finalize \ dynarray_resize \ diff --git a/malloc/char_array.c b/malloc/char_array.c new file mode 100644 index 0000000..4d73203 --- /dev/null +++ b/malloc/char_array.c @@ -0,0 +1,281 @@ +/* Specialized dynarray for C strings. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* This file provides a dynamic C string with an initial stack allocated + buffer. Since it is based on dynarray, it provided dynamic size + expansion and heap usage for large strings. + + The following parameters are optional: + + CHAR_ARRAY_INITIAL_SIZE + The size of the statically allocated array (default is 256). It will + be used to define DYNARRAY_INITIAL_SIZE. + + The following functions are provided: + + bool char_array_init_empty (struct char_array *); + bool char_array_init_str (struct char_array *, const char *); + bool char_array_init_str_size (struct char_array *, const char *, size_t); + bool char_array_is_empty (struct char_array *); + const char *char_array_str (struct char_array *); + char char_array_pos (struct char_array *, size_t); + size_t char_array_length (struct char_array *); + bool char_array_set_str (struct char_array *, const char *); + bool char_array_set_str_size (struct char_array *, const char *, size_t); + void char_array_erase (struct char_array *, size_t); + bool char_array_resize_str (struct char_array *, size_t); + bool char_array_prepend_str (struct char_array *, const char *); + bool char_array_append_str (struct char_array *, const char *); + bool char_array_replace_str_pos (struct char_array *, size_t, const char *, + size_t); + + For instance: + + struct char_array str; + // str == "testing"; + char_array_init_str (&str, "testing"); + // c == 's' + char c = char_array_pos (&str, 2); + // str = "testing2"; + char_array_set_str (&str, "testing2"); + // str = "testi"; + char_array_erase (&str, 5); + // str = "123testi"; + char_array_prepend_str (&str, "123"); + // len = 8; + size_t len = char_array_length (&str); + // str = "123testi456"; + char_array_append_str (&str, "456"); + // str = "123testi789"; + char_array_replace_str_pos (&str, 7, "789", 3); + */ + +#define DYNARRAY_STRUCT char_array +#define DYNARRAY_ELEMENT char +#define DYNARRAY_PREFIX char_array_ +#ifndef CHAR_ARRAY_INITIAL_SIZE +# define CHAR_ARRAY_INITIAL_SIZE 256 +#endif +#define DYNARRAY_INITIAL_SIZE CHAR_ARRAY_INITIAL_SIZE +#include <malloc/dynarray-skeleton.c> +#include <malloc/malloc-internal.h> + +/* Return a const char for the internal C string handled by 'array'. */ +__attribute__ ((nonnull (1))) +static const char * +char_array_str (struct char_array *array) +{ + return char_array_at (array, 0); +} + +/* Return the character at position 'pos' from the char_array 'array'. */ +__attribute__ ((nonnull (1))) +static char +char_array_pos (struct char_array *array, size_t pos) +{ + return *char_array_at (array, pos); +} + +/* Calculate the length of the string, excluding the terminating null. */ +__attribute__ ((unused, nonnull (1))) +static size_t +char_array_length (struct char_array *array) +{ + /* Exclude the final '\0'. */ + return array->dynarray_header.used - 1; +} + +/* Copy the contents of string 'str' to char_array 'array', including the + final '\0'. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_set_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str) + 1; + if (!char_array_resize (array, size)) + return false; + memcpy (array->dynarray_header.array, str, size); + array->dynarray_header.used = size; + return true; +} + +/* Copy up 'size' bytes from string 'str' to char_array 'array'. A final + '\0' is appended in the char_array. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_set_str_size (struct char_array *array, const char *str, + size_t size) +{ + size_t newsize; + if (check_add_wrapv_size_t (size, 1, &newsize)) + __libc_dynarray_overflow_failure (size, 1); + + if (!char_array_resize (array, newsize)) + return false; + + *((char *) mempcpy (array->dynarray_header.array, str, size)) = '\0'; + array->dynarray_header.used = newsize; + return true; +} + +/* Initialize the char_array 'array' and sets it to an empty string (""). */ +__attribute__ ((nonnull (1))) +static bool +char_array_init_empty (struct char_array *array) +{ + char_array_init (array); + return char_array_set_str (array, ""); +} + +/* Initialize the char_array 'array' and copy the content of string 'str'. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_init_str (struct char_array *array, const char *str) +{ + char_array_init (array); + return char_array_set_str (array, str); +} + +/* Initialize the char_array 'array' and copy the content of string 'str' + up to 'size' characteres. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_init_str_size (struct char_array *array, const char *str, + size_t size) +{ + char_array_init (array); + return char_array_set_str_size (array, str, size); +} + +/* Return if the char_array contain any characteres. */ +__attribute__ ((nonnull (1))) +static bool +char_array_is_empty (struct char_array *array) +{ + return *char_array_at (array, 0) == '\0'; +} + +/* Remove the byte at position 'pos' from char_array 'array'. The contents + are moved internally if the position is not at the end of the internal + buffer. */ +__attribute__ ((nonnull (1))) +static bool +char_array_erase (struct char_array *array, size_t pos) +{ + if (pos >= array->dynarray_header.used - 1) + return false; + + char *ppos = char_array_at (array, pos); + char *lpos = array->dynarray_header.array + array->dynarray_header.used; + ptrdiff_t size = lpos - ppos; + memmove (ppos, ppos + 1, size); + array->dynarray_header.used--; + return true; +} + +/* Resize the char_array 'array' to size 'count' maintaining the ending + '\0' byte. */ +__attribute__ ((nonnull (1))) +static bool +char_array_crop (struct char_array *array, size_t size) +{ + if (size >= (array->dynarray_header.used - 1) + || !char_array_resize (array, size + 1)) + return false; + + array->dynarray_header.array[array->dynarray_header.used] = '\0'; + return true; +} + +/* Prepend the contents of string 'str' to char_array 'array', including the + final '\0' byte. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_prepend_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str); + /* Resizing the array might change its used elements and we need below + to correct copy the elements. */ + size_t used = array->dynarray_header.used; + + size_t newsize; + if (check_add_wrapv_size_t (used, size, &newsize)) + __libc_dynarray_overflow_failure (used, size); + + if (!char_array_resize (array, newsize)) + return false; + + /* Make room for the string and copy it. */ + memmove (array->dynarray_header.array + size, array->dynarray_header.array, + used); + memcpy (array->dynarray_header.array, str, size); + array->dynarray_header.used = newsize; + return true; +} + +/* Append the contents of string 'str' to char_array 'array, including the + final '\0' byte. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_append_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str); + /* Resizing the array might change its used elements and it used it below + to correct copy the elements. */ + size_t used = array->dynarray_header.used - 1; + + /* 'used' does account for final '\0', so there is no need to add + an extra element to calculate the final required size. */ + size_t newsize; + if (check_add_wrapv_size_t (used + 1, size, &newsize)) + __libc_dynarray_overflow_failure (used + 1, size); + + if (!char_array_resize (array, newsize)) + return false; + + /* Start to append at '\0' up to string length and add a final '\0'. */ + *(char*) mempcpy (array->dynarray_header.array + used, str, size) = '\0'; + array->dynarray_header.used = newsize; + return true; +} + +/* Replace the contents starting of position 'pos' of char_array 'array' + with the contents of string 'str' up to 'len' bytes. A final '\0' + is appended in the string. */ +__attribute__ ((nonnull (1, 3))) +static bool +char_array_replace_str_pos (struct char_array *array, size_t pos, + const char *str, size_t len) +{ + if (pos > array->dynarray_header.used) + return false; + + size_t newsize; + if (check_add_wrapv_size_t (pos, len, &newsize)) + __libc_dynarray_overflow_failure (pos, len); + if (check_add_wrapv_size_t (newsize, 1, &newsize)) + __libc_dynarray_overflow_failure (newsize, 1); + + if (!char_array_resize (array, newsize)) + return false; + + char *start = char_array_at (array, pos); + *(char *) mempcpy (start, str, len) = '\0'; + array->dynarray_header.used = newsize; + return true; +} diff --git a/malloc/dynarray.h b/malloc/dynarray.h index c73e08b..5a491b4 100644 --- a/malloc/dynarray.h +++ b/malloc/dynarray.h @@ -173,4 +173,12 @@ void __libc_dynarray_at_failure (size_t size, size_t index) __attribute__ ((noreturn)); libc_hidden_proto (__libc_dynarray_at_failure) +/* Internal function. TErminate the process after an overflow in + new size allocation. SIZE is the current number of elements in + dynamic array and INCR is the new elements to add on current + size. */ +void __libc_dynarray_overflow_failure (size_t size, size_t incr) + __attribute__ ((noreturn)); +libc_hidden_proto (__libc_dynarray_overflow_failure) + #endif /* _DYNARRAY_H */ diff --git a/malloc/dynarray_overflow_failure.c b/malloc/dynarray_overflow_failure.c new file mode 100644 index 0000000..14936b0 --- /dev/null +++ b/malloc/dynarray_overflow_failure.c @@ -0,0 +1,31 @@ +/* Report an dynamic array size overflow condition. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <dynarray.h> +#include <stdio.h> + +void +__libc_dynarray_overflow_failure (size_t size, size_t incr) +{ + char buf[200]; + __snprintf (buf, sizeof (buf), "Fatal glibc error: " + "new size overflows (old %zu and increment %zu)\n", + size, incr); + __libc_fatal (buf); +} +libc_hidden_def (__libc_dynarray_overflow_failure) diff --git a/malloc/malloc-internal.h b/malloc/malloc-internal.h index dbd801a..3066cd3 100644 --- a/malloc/malloc-internal.h +++ b/malloc/malloc-internal.h @@ -101,4 +101,18 @@ check_mul_overflow_size_t (size_t left, size_t right, size_t *result) #endif } +/* Set *R = A + B. Return true if the answer is mathematically incorrect due + to overflow; in this case, *R is the low order bits of the correct + answer. */ +static inline bool +check_add_wrapv_size_t (size_t a, size_t b, size_t *r) +{ +#if 5 <= __GNUC__ + return __builtin_add_overflow (a, b, r); +#else + *r = a + b; + return *r < a; +#endif +} + #endif /* _MALLOC_INTERNAL_H */ diff --git a/malloc/tst-char_array.c b/malloc/tst-char_array.c new file mode 100644 index 0000000..53f9482 --- /dev/null +++ b/malloc/tst-char_array.c @@ -0,0 +1,107 @@ +/* Test for char_array. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <string.h> + +#include <malloc/char_array.c> + +#include <malloc.h> +#include <mcheck.h> +#include <stdint.h> +#include <support/check.h> +#include <support/support.h> + +static int +do_test (void) +{ + mtrace (); + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_empty (&str) == true); + TEST_VERIFY_EXIT (char_array_length (&str) == 0); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == true); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing")); + TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's'); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == false); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testing") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str_size (&str, "testing", 4)); + TEST_VERIFY_EXIT (char_array_length (&str) == 4); + TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's'); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == false); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "test") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_set_str (&str, "abcdef")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcdef") == 0); + TEST_VERIFY_EXIT (char_array_set_str_size (&str, "abcdef", 4)); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcd") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_erase (&str, 4) == true); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testng") == 0); + TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str)) + == false); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1); + TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str) - 1) + == true); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 2); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testn") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "test")); + TEST_VERIFY_EXIT (char_array_prepend_str (&str, "123")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test")); + TEST_VERIFY_EXIT (char_array_append_str (&str, "456")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test456") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test456")); + TEST_VERIFY_EXIT (char_array_replace_str_pos (&str, 7, "789", 3)); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test789") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test789")); + char_array_free (&str); + } + + return 0; +} + +#include <support/test-driver.c>
On 13/06/2017 11:24, Adhemerval Zanella wrote: > Here is an updated patch for the specialized dynarray. I have incorporated all > your suggestion but the skeleton name change (which I am not sure if it should > follow the idea since it should be a complete 'api'). I also added some nonull > attribute as for dynarray implementation. I rebase against the new begin/end additions and fixed some issues regarding the make check (which I forgot to actually run before patch submission...). ---diff --git a/malloc/Makefile b/malloc/Makefile index 14c13f1..323b3fb 100644 --- a/malloc/Makefile +++ b/malloc/Makefile @@ -46,6 +46,7 @@ tests-internal += \ tst-dynarray \ tst-dynarray-fail \ tst-dynarray-at-fail \ + tst-char_array ifneq (no,$(have-tunables)) tests += tst-malloc-usable-tunables @@ -58,7 +59,7 @@ test-srcs = tst-mtrace routines = malloc morecore mcheck mtrace obstack reallocarray \ scratch_buffer_grow scratch_buffer_grow_preserve \ scratch_buffer_set_array_size \ - dynarray_at_failure \ + dynarray_at_failure dynarray_overflow_failure \ dynarray_emplace_enlarge \ dynarray_finalize \ dynarray_resize \ diff --git a/malloc/Versions b/malloc/Versions index 5b54306..86b52f7 100644 --- a/malloc/Versions +++ b/malloc/Versions @@ -82,6 +82,7 @@ libc { # dynarray support __libc_dynarray_at_failure; + __libc_dynarray_overflow_failure; __libc_dynarray_emplace_enlarge; __libc_dynarray_finalize; __libc_dynarray_resize; diff --git a/malloc/char_array.c b/malloc/char_array.c new file mode 100644 index 0000000..e37df81 --- /dev/null +++ b/malloc/char_array.c @@ -0,0 +1,281 @@ +/* Specialized dynarray for C strings. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* This file provides a dynamic C string with an initial stack allocated + buffer. Since it is based on dynarray, it provided dynamic size + expansion and heap usage for large strings. + + The following parameters are optional: + + CHAR_ARRAY_INITIAL_SIZE + The size of the statically allocated array (default is 256). It will + be used to define DYNARRAY_INITIAL_SIZE. + + The following functions are provided: + + bool char_array_init_empty (struct char_array *); + bool char_array_init_str (struct char_array *, const char *); + bool char_array_init_str_size (struct char_array *, const char *, size_t); + bool char_array_is_empty (struct char_array *); + const char *char_array_str (struct char_array *); + char char_array_pos (struct char_array *, size_t); + size_t char_array_length (struct char_array *); + bool char_array_set_str (struct char_array *, const char *); + bool char_array_set_str_size (struct char_array *, const char *, size_t); + void char_array_erase (struct char_array *, size_t); + bool char_array_crop (struct char_array *, size_t); + bool char_array_prepend_str (struct char_array *, const char *); + bool char_array_append_str (struct char_array *, const char *); + bool char_array_replace_str_pos (struct char_array *, size_t, const char *, + size_t); + + For instance: + + struct char_array str; + // str == "testing"; + char_array_init_str (&str, "testing"); + // c == 's' + char c = char_array_pos (&str, 2); + // str = "testing2"; + char_array_set_str (&str, "testing2"); + // str = "testi"; + char_array_erase (&str, 5); + // str = "123testi"; + char_array_prepend_str (&str, "123"); + // len = 8; + size_t len = char_array_length (&str); + // str = "123testi456"; + char_array_append_str (&str, "456"); + // str = "123testi789"; + char_array_replace_str_pos (&str, 7, "789", 3); + */ + +#define DYNARRAY_STRUCT char_array +#define DYNARRAY_ELEMENT char +#define DYNARRAY_PREFIX char_array_ +#ifndef CHAR_ARRAY_INITIAL_SIZE +# define CHAR_ARRAY_INITIAL_SIZE 256 +#endif +#define DYNARRAY_INITIAL_SIZE CHAR_ARRAY_INITIAL_SIZE +#include <malloc/dynarray-skeleton.c> +#include <malloc/malloc-internal.h> + +/* Return a const char for the internal C string handled by 'array'. */ +__attribute__ ((nonnull (1))) +static const char * +char_array_str (struct char_array *array) +{ + return char_array_at (array, 0); +} + +/* Return the character at position 'pos' from the char_array 'array'. */ +__attribute__ ((nonnull (1))) +static char +char_array_pos (struct char_array *array, size_t pos) +{ + return *char_array_at (array, pos); +} + +/* Calculate the length of the string, excluding the terminating null. */ +__attribute__ ((unused, nonnull (1))) +static size_t +char_array_length (struct char_array *array) +{ + /* Exclude the final '\0'. */ + return array->dynarray_header.used - 1; +} + +/* Copy the contents of string 'str' to char_array 'array', including the + final '\0'. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_set_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str) + 1; + if (!char_array_resize (array, size)) + return false; + memcpy (array->dynarray_header.array, str, size); + array->dynarray_header.used = size; + return true; +} + +/* Copy up 'size' bytes from string 'str' to char_array 'array'. A final + '\0' is appended in the char_array. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_set_str_size (struct char_array *array, const char *str, + size_t size) +{ + size_t newsize; + if (check_add_wrapv_size_t (size, 1, &newsize)) + __libc_dynarray_overflow_failure (size, 1); + + if (!char_array_resize (array, newsize)) + return false; + + *((char *) mempcpy (array->dynarray_header.array, str, size)) = '\0'; + array->dynarray_header.used = newsize; + return true; +} + +/* Initialize the char_array 'array' and sets it to an empty string (""). */ +__attribute__ ((nonnull (1))) +static bool +char_array_init_empty (struct char_array *array) +{ + char_array_init (array); + return char_array_set_str (array, ""); +} + +/* Initialize the char_array 'array' and copy the content of string 'str'. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_init_str (struct char_array *array, const char *str) +{ + char_array_init (array); + return char_array_set_str (array, str); +} + +/* Initialize the char_array 'array' and copy the content of string 'str' + up to 'size' characteres. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_init_str_size (struct char_array *array, const char *str, + size_t size) +{ + char_array_init (array); + return char_array_set_str_size (array, str, size); +} + +/* Return if the char_array contain any characteres. */ +__attribute__ ((nonnull (1))) +static bool +char_array_is_empty (struct char_array *array) +{ + return *char_array_at (array, 0) == '\0'; +} + +/* Remove the byte at position 'pos' from char_array 'array'. The contents + are moved internally if the position is not at the end of the internal + buffer. */ +__attribute__ ((nonnull (1))) +static bool +char_array_erase (struct char_array *array, size_t pos) +{ + if (pos >= array->dynarray_header.used - 1) + return false; + + char *ppos = char_array_at (array, pos); + char *lpos = array->dynarray_header.array + array->dynarray_header.used; + ptrdiff_t size = lpos - ppos; + memmove (ppos, ppos + 1, size); + array->dynarray_header.used--; + return true; +} + +/* Resize the char_array 'array' to size 'count' maintaining the ending + '\0' byte. */ +__attribute__ ((nonnull (1))) +static bool +char_array_crop (struct char_array *array, size_t size) +{ + if (size >= (array->dynarray_header.used - 1) + || !char_array_resize (array, size + 1)) + return false; + + array->dynarray_header.array[size] = '\0'; + return true; +} + +/* Prepend the contents of string 'str' to char_array 'array', including the + final '\0' byte. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_prepend_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str); + /* Resizing the array might change its used elements and we need below + to correct copy the elements. */ + size_t used = array->dynarray_header.used; + + size_t newsize; + if (check_add_wrapv_size_t (used, size, &newsize)) + __libc_dynarray_overflow_failure (used, size); + + if (!char_array_resize (array, newsize)) + return false; + + /* Make room for the string and copy it. */ + memmove (array->dynarray_header.array + size, array->dynarray_header.array, + used); + memcpy (array->dynarray_header.array, str, size); + array->dynarray_header.used = newsize; + return true; +} + +/* Append the contents of string 'str' to char_array 'array, including the + final '\0' byte. */ +__attribute__ ((nonnull (1, 2))) +static bool +char_array_append_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str); + /* Resizing the array might change its used elements and it used it below + to correct copy the elements. */ + size_t used = array->dynarray_header.used - 1; + + /* 'used' does account for final '\0', so there is no need to add + an extra element to calculate the final required size. */ + size_t newsize; + if (check_add_wrapv_size_t (used + 1, size, &newsize)) + __libc_dynarray_overflow_failure (used + 1, size); + + if (!char_array_resize (array, newsize)) + return false; + + /* Start to append at '\0' up to string length and add a final '\0'. */ + *(char*) mempcpy (array->dynarray_header.array + used, str, size) = '\0'; + array->dynarray_header.used = newsize; + return true; +} + +/* Replace the contents starting of position 'pos' of char_array 'array' + with the contents of string 'str' up to 'len' bytes. A final '\0' + is appended in the string. */ +__attribute__ ((nonnull (1, 3))) +static bool +char_array_replace_str_pos (struct char_array *array, size_t pos, + const char *str, size_t len) +{ + if (pos > array->dynarray_header.used) + return false; + + size_t newsize; + if (check_add_wrapv_size_t (pos, len, &newsize)) + __libc_dynarray_overflow_failure (pos, len); + if (check_add_wrapv_size_t (newsize, 1, &newsize)) + __libc_dynarray_overflow_failure (newsize, 1); + + if (!char_array_resize (array, newsize)) + return false; + + char *start = char_array_at (array, pos); + *(char *) mempcpy (start, str, len) = '\0'; + array->dynarray_header.used = newsize; + return true; +} diff --git a/malloc/dynarray.h b/malloc/dynarray.h index c73e08b..5a491b4 100644 --- a/malloc/dynarray.h +++ b/malloc/dynarray.h @@ -173,4 +173,12 @@ void __libc_dynarray_at_failure (size_t size, size_t index) __attribute__ ((noreturn)); libc_hidden_proto (__libc_dynarray_at_failure) +/* Internal function. TErminate the process after an overflow in + new size allocation. SIZE is the current number of elements in + dynamic array and INCR is the new elements to add on current + size. */ +void __libc_dynarray_overflow_failure (size_t size, size_t incr) + __attribute__ ((noreturn)); +libc_hidden_proto (__libc_dynarray_overflow_failure) + #endif /* _DYNARRAY_H */ diff --git a/malloc/dynarray_overflow_failure.c b/malloc/dynarray_overflow_failure.c new file mode 100644 index 0000000..14936b0 --- /dev/null +++ b/malloc/dynarray_overflow_failure.c @@ -0,0 +1,31 @@ +/* Report an dynamic array size overflow condition. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <dynarray.h> +#include <stdio.h> + +void +__libc_dynarray_overflow_failure (size_t size, size_t incr) +{ + char buf[200]; + __snprintf (buf, sizeof (buf), "Fatal glibc error: " + "new size overflows (old %zu and increment %zu)\n", + size, incr); + __libc_fatal (buf); +} +libc_hidden_def (__libc_dynarray_overflow_failure) diff --git a/malloc/malloc-internal.h b/malloc/malloc-internal.h index dbd801a..3066cd3 100644 --- a/malloc/malloc-internal.h +++ b/malloc/malloc-internal.h @@ -101,4 +101,18 @@ check_mul_overflow_size_t (size_t left, size_t right, size_t *result) #endif } +/* Set *R = A + B. Return true if the answer is mathematically incorrect due + to overflow; in this case, *R is the low order bits of the correct + answer. */ +static inline bool +check_add_wrapv_size_t (size_t a, size_t b, size_t *r) +{ +#if 5 <= __GNUC__ + return __builtin_add_overflow (a, b, r); +#else + *r = a + b; + return *r < a; +#endif +} + #endif /* _MALLOC_INTERNAL_H */ diff --git a/malloc/tst-char_array.c b/malloc/tst-char_array.c new file mode 100644 index 0000000..cb06b9c --- /dev/null +++ b/malloc/tst-char_array.c @@ -0,0 +1,110 @@ +/* Test for char_array. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <string.h> + +#include <malloc/char_array.c> + +#include <malloc.h> +#include <mcheck.h> +#include <stdint.h> +#include <support/check.h> +#include <support/support.h> + +static int +do_test (void) +{ + mtrace (); + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_empty (&str) == true); + TEST_VERIFY_EXIT (char_array_length (&str) == 0); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == true); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing")); + TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's'); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == false); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testing") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str_size (&str, "testing", 4)); + TEST_VERIFY_EXIT (char_array_length (&str) == 4); + TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's'); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == false); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "test") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_set_str (&str, "abcdef")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcdef") == 0); + TEST_VERIFY_EXIT (char_array_set_str_size (&str, "abcdef", 4)); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcd") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_erase (&str, 4) == true); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testng") == 0); + TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str)) + == false); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1); + TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str) - 1) + == true); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 2); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testn") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "test")); + TEST_VERIFY_EXIT (char_array_prepend_str (&str, "123")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test")); + TEST_VERIFY_EXIT (char_array_append_str (&str, "456")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test456") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test456")); + TEST_VERIFY_EXIT (char_array_replace_str_pos (&str, 7, "789", 3)); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test789") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test789")); + TEST_VERIFY_EXIT (char_array_crop (&str, 7)); + TEST_VERIFY_EXIT (char_array_length (&str) == 7); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test") == 0); + char_array_free (&str); + } + + return 0; +} + +#include <support/test-driver.c>
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: >>> +++ b/malloc/char_array.c >> >> Should this be malloc/char_array-skeleton.c, to indicate that this file >> is parameterized and intended for inclusion into another .c file? > > I am not sure, the idea of char_array is to be self-contained, there is > no need to define anything in the file to include it. Also, if we see > other usage of dynamic C string inside glibc it would be a good idea to > add internal symbols for common symbols. But it's still necessary to #include the file because it is what the C++ people call a header-only library. And there is a hook for customizing the size of the stack allocation. This is why I asked. I don't have a strong opinion on file naming, though. Florian
On 14/06/2017 10:35, Florian Weimer wrote: > Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: > >>>> +++ b/malloc/char_array.c >>> >>> Should this be malloc/char_array-skeleton.c, to indicate that this file >>> is parameterized and intended for inclusion into another .c file? >> >> I am not sure, the idea of char_array is to be self-contained, there is >> no need to define anything in the file to include it. Also, if we see >> other usage of dynamic C string inside glibc it would be a good idea to >> add internal symbols for common symbols. > > But it's still necessary to #include the file because it is what the C++ > people call a header-only library. And there is a hook for customizing > the size of the stack allocation. This is why I asked. I don't have a > strong opinion on file naming, though. Fair enough then, although if this kind of usage pattern for dynarray become common it would be interesting to move to internal symbol. I will change to char_array-skeleton.c.
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: > On 13/06/2017 11:24, Adhemerval Zanella wrote: > >> Here is an updated patch for the specialized dynarray. I have incorporated all >> your suggestion but the skeleton name change (which I am not sure if it should >> follow the idea since it should be a complete 'api'). I also added some nonull >> attribute as for dynarray implementation. > > I rebase against the new begin/end additions and fixed some issues > regarding the make check (which I forgot to actually run before patch > submission...). I don't see any begin/end references in the attached patch. > +/* Replace the contents starting of position 'pos' of char_array 'array' > + with the contents of string 'str' up to 'len' bytes. A final '\0' > + is appended in the string. */ > +__attribute__ ((nonnull (1, 3))) > +static bool > +char_array_replace_str_pos (struct char_array *array, size_t pos, > + const char *str, size_t len) > +{ > + if (pos > array->dynarray_header.used) > + return false; > + > + size_t newsize; > + if (check_add_wrapv_size_t (pos, len, &newsize)) > + __libc_dynarray_overflow_failure (pos, len); > + if (check_add_wrapv_size_t (newsize, 1, &newsize)) > + __libc_dynarray_overflow_failure (newsize, 1); This is the opposite of what I expect: pos > array->dynarray_header.used appears to be a usage error, so this could result in __libc_fatal. Integer overflow while computing sizes for memory allocation is usually treated as a memory allocation failure, so it would expect a false return (and no __libc_fatal) for that. If you want to prevent access to the underlying char_array_* functions generated by dynarray, you could use #pragma GCC poison. Florian
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: > On 14/06/2017 10:35, Florian Weimer wrote: >> Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: >> >>>>> +++ b/malloc/char_array.c >>>> >>>> Should this be malloc/char_array-skeleton.c, to indicate that this file >>>> is parameterized and intended for inclusion into another .c file? >>> >>> I am not sure, the idea of char_array is to be self-contained, there is >>> no need to define anything in the file to include it. Also, if we see >>> other usage of dynamic C string inside glibc it would be a good idea to >>> add internal symbols for common symbols. >> >> But it's still necessary to #include the file because it is what the C++ >> people call a header-only library. And there is a hook for customizing >> the size of the stack allocation. This is why I asked. I don't have a >> strong opinion on file naming, though. > > Fair enough then, although if this kind of usage pattern for dynarray > become common it would be interesting to move to internal symbol. I will > change to char_array-skeleton.c. Agreed. In this case, we'd switch away from static functions, too. Thanks, Florian
On 14/06/2017 11:05, Florian Weimer wrote: > Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: > >> On 13/06/2017 11:24, Adhemerval Zanella wrote: >> >>> Here is an updated patch for the specialized dynarray. I have incorporated all >>> your suggestion but the skeleton name change (which I am not sure if it should >>> follow the idea since it should be a complete 'api'). I also added some nonull >>> attribute as for dynarray implementation. >> >> I rebase against the new begin/end additions and fixed some issues >> regarding the make check (which I forgot to actually run before patch >> submission...). > > I don't see any begin/end references in the attached patch. My mistake, I did not commit the two smalls changes that replace char_array_at (0) with char_array_begin. > >> +/* Replace the contents starting of position 'pos' of char_array 'array' >> + with the contents of string 'str' up to 'len' bytes. A final '\0' >> + is appended in the string. */ >> +__attribute__ ((nonnull (1, 3))) >> +static bool >> +char_array_replace_str_pos (struct char_array *array, size_t pos, >> + const char *str, size_t len) >> +{ >> + if (pos > array->dynarray_header.used) >> + return false; >> + >> + size_t newsize; >> + if (check_add_wrapv_size_t (pos, len, &newsize)) >> + __libc_dynarray_overflow_failure (pos, len); >> + if (check_add_wrapv_size_t (newsize, 1, &newsize)) >> + __libc_dynarray_overflow_failure (newsize, 1); > > This is the opposite of what I expect: pos > array->dynarray_header.used > appears to be a usage error, so this could result in __libc_fatal. > Integer overflow while computing sizes for memory allocation is usually > treated as a memory allocation failure, so it would expect a false > return (and no __libc_fatal) for that. So I think a better approach would just to use: if (pos > array->dynarray_header.used) __libc_dynarray_at_failure (char_array_size (array), pos); if (check_add_wrapv_size_t (pos, len, &newsize)) ||check_add_wrapv_size_t (newsize, 1, &newsize)) return false; > > If you want to prevent access to the underlying char_array_* functions > generated by dynarray, you could use #pragma GCC poison. It could be a nice idea, although there is no direct usage on the glob patchset. I will check this out for a future enhancement.
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: >> This is the opposite of what I expect: pos > array->dynarray_header.used >> appears to be a usage error, so this could result in __libc_fatal. >> Integer overflow while computing sizes for memory allocation is usually >> treated as a memory allocation failure, so it would expect a false >> return (and no __libc_fatal) for that. > > So I think a better approach would just to use: > > if (pos > array->dynarray_header.used) > __libc_dynarray_at_failure (char_array_size (array), pos); > > if (check_add_wrapv_size_t (pos, len, &newsize)) > ||check_add_wrapv_size_t (newsize, 1, &newsize)) > return false; Right, this is what I would expect from such an interface. This assumes that glob doesn't expect to feed bad existing indexes to this function, of course. >> If you want to prevent access to the underlying char_array_* functions >> generated by dynarray, you could use #pragma GCC poison. > It could be a nice idea, although there is no direct usage on the glob > patchset. I will check this out for a future enhancement. Understood. Thanks, Florian
diff --git a/malloc/Makefile b/malloc/Makefile index af025cb..098e3c6 100644 --- a/malloc/Makefile +++ b/malloc/Makefile @@ -46,6 +46,7 @@ tests-internal += \ tst-dynarray \ tst-dynarray-fail \ tst-dynarray-at-fail \ + tst-char_array ifneq (no,$(have-tunables)) tests += tst-malloc-usable-tunables diff --git a/malloc/char_array.c b/malloc/char_array.c new file mode 100644 index 0000000..cce9360 --- /dev/null +++ b/malloc/char_array.c @@ -0,0 +1,256 @@ +/* Specialized dynarray for C strings. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* This file provides a dynamic C string with an initial stack allocated + buffer. Since it is based on dynarray, it provided dynamic size + expansion and heap usage for large strings. + + The following parameters are optional: + + CHAR_ARRAY_INITIAL_SIZE + The size of the statically allocated array (default is 256). It will + be used to define DYNARRAY_INITIAL_SIZE. + + The following functions are provided: + + bool char_array_init_empty (struct char_array *); + bool char_array_init_str (struct char_array *, const char *); + bool char_array_init_str_size (struct char_array *, const char *, size_t); + bool char_array_is_empty (struct char_array *); + const char *char_array_str (struct char_array *); + char char_array_pos (struct char_array *, size_t); + size_t char_array_length (struct char_array *); + bool char_array_set_str (struct char_array *, const char *); + bool char_array_set_str_size (struct char_array *, const char *, size_t); + void char_array_erase (struct char_array *, size_t); + bool char_array_resize_str (struct char_array *, size_t); + bool char_array_prepend_str (struct char_array *, const char *); + bool char_array_append_str (struct char_array *, const char *); + bool char_array_replace_str_pos (struct char_array *, size_t, const char *, + size_t); + + For instance: + + struct char_array str; + // str == "testing"; + char_array_init_str (&str, "testing"); + // c == 's' + char c = char_array_pos (&str, 2); + // str = "testing2"; + char_array_set_str (&str, "testing2"); + // str = "testi"; + char_array_erase (&str, 5); + // str = "123testi"; + char_array_prepend_str (&str, "123"); + // len = 8; + size_t len = char_array_length (&str); + // str = "123testi456"; + char_array_append_str (&str, "456"); + // str = "123testi789"; + char_array_replace_str_pos (&str, 7, "789", 3); + */ + +#define DYNARRAY_STRUCT char_array +#define DYNARRAY_ELEMENT char +#define DYNARRAY_PREFIX char_array_ +#define DYNARRAY_ELEMENT_INIT(__e) (*__e = '\0') +#ifndef CHAR_ARRAY_INITIAL_SIZE +# define CHAR_ARRAY_INITIAL_SIZE 256 +#endif +#define DYNARRAY_INITIAL_SIZE CHAR_ARRAY_INITIAL_SIZE +#include <malloc/dynarray-skeleton.c> +#include <malloc/malloc-internal.h> + +/* Return a const char for the internal C string handled by 'array'. */ +static const char * __attribute_used__ +char_array_str (struct char_array *array) +{ + return char_array_at (array, 0); +} + +/* Return the character at position 'pos' from the char_array 'array'. */ +static char __attribute_used__ +char_array_pos (struct char_array *array, size_t pos) +{ + return *char_array_at (array, pos); +} + +static size_t __attribute_used__ +char_array_length (struct char_array *array) +{ + /* Exclude the final '\0'. */ + return array->dynarray_header.used - 1; +} + +/* Copy the contents of string 'str' to char_array 'array', including the + final '\0'. */ +static bool __attribute_used__ +char_array_set_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str) + 1; + if (!char_array_resize (array, size)) + return false; + memcpy (array->dynarray_header.array, str, size); + array->dynarray_header.used = size; + return true; +} + +/* Copy up 'size' bytes from string 'str' to char_array 'array'. A final + '\0' is appended in the char_array. */ +static bool __attribute_used__ +char_array_set_str_size (struct char_array *array, const char *str, + size_t size) +{ + size_t newsize; + if (check_add_wrapv_size_t (size, 1, &newsize) + || !char_array_resize (array, newsize)) + return false; + *((char *) mempcpy (array->dynarray_header.array, str, size)) = '\0'; + array->dynarray_header.used = newsize; + return true; +} + +/* Initialize the char_array 'array' and sets it to an empty string (""). */ +static bool __attribute_used__ +char_array_init_empty (struct char_array *array) +{ + char_array_init (array); + return char_array_set_str (array, ""); +} + +/* Initialize the char_array 'array' and copy the content of string 'str'. */ +static bool __attribute_used__ +char_array_init_str (struct char_array *array, const char *str) +{ + char_array_init (array); + return char_array_set_str (array, str); +} + +/* Initialize the char_array 'array' and copy the content of string 'str' + up to 'size' characteres. */ +static bool __attribute_used__ +char_array_init_str_size (struct char_array *array, const char *str, + size_t size) +{ + char_array_init (array); + return char_array_set_str_size (array, str, size); +} + +static bool __attribute_used__ +char_array_is_empty (struct char_array *array) +{ + return *char_array_at (array, 0) == '\0'; +} + +/* Remove the byte at position 'pos' from char_array 'array'. The contents + are moved internally if the position is not at the end of the internal + buffer. */ +static bool __attribute_used__ +char_array_erase (struct char_array *array, size_t pos) +{ + if (pos >= array->dynarray_header.used - 1) + return false; + + char *ppos = char_array_at (array, pos); + char *lpos = array->dynarray_header.array + array->dynarray_header.used; + ptrdiff_t size = lpos - ppos; + memmove (ppos, ppos + 1, size); + array->dynarray_header.used--; + return true; +} + +/* Resize the char_array 'array' to size 'count' maintaining the ending + '\0' byte. */ +static bool __attribute_used__ +char_array_crop (struct char_array *array, size_t size) +{ + if (size >= (array->dynarray_header.used - 1) + || !char_array_resize (array, size + 1)) + return false; + + array->dynarray_header.array[array->dynarray_header.used] = '\0'; + return true; +} + +/* Prepend the contents of string 'str' to char_array 'array', including the + final '\0' byte. */ +static bool __attribute_used__ +char_array_prepend_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str); + /* Resizing the array might change its used elements and we need below + to correct copy the elements. */ + size_t used = array->dynarray_header.used; + + size_t newsize; + if (check_add_wrapv_size_t (used, size, &newsize) + || !char_array_resize (array, newsize)) + return false; + + /* Make room for the string and copy it. */ + memmove (array->dynarray_header.array + size, array->dynarray_header.array, + used); + memcpy (array->dynarray_header.array, str, size); + array->dynarray_header.used = newsize; + return true; +} + +/* Append the contents of string 'str' to char_array 'array, including the + final '\0' byte. */ +static bool __attribute_used__ +char_array_append_str (struct char_array *array, const char *str) +{ + size_t size = strlen (str); + /* Resizing the array might change its used elements and it used it below + to correct copy the elements. */ + size_t used = array->dynarray_header.used - 1; + + /* array 'used' does account for final '\0', so there is no need to add + an extra element to calculate the final required size. */ + size_t newsize; + if (check_add_wrapv_size_t (used + 1, size, &newsize) + || !char_array_resize (array, newsize)) + return false; + + /* Start to append at '\0' up to string length and add a final '\0'. */ + *(char*) mempcpy (array->dynarray_header.array + used, str, size) = '\0'; + array->dynarray_header.used = newsize; + return true; +} + +/* Replace the contents starting of position 'pos' of char_array 'array' + with the contents of string 'str' up to 'len' bytes. A final '\0' + is appended in the string. */ +static bool __attribute_used__ +char_array_replace_str_pos (struct char_array *array, size_t pos, + const char *str, size_t len) +{ + if (pos > array->dynarray_header.used) + return false; + + size_t newsize; + if (check_add_wrapv_size_t (pos, len, &newsize) + || check_add_wrapv_size_t (newsize, 1, &newsize) + || !char_array_resize (array, newsize)) + return false; + + char *start = char_array_at (array, pos); + *(char *) mempcpy (start, str, len) = '\0'; + array->dynarray_header.used = newsize; + return true; +} diff --git a/malloc/malloc-internal.h b/malloc/malloc-internal.h index dbd801a..3066cd3 100644 --- a/malloc/malloc-internal.h +++ b/malloc/malloc-internal.h @@ -101,4 +101,18 @@ check_mul_overflow_size_t (size_t left, size_t right, size_t *result) #endif } +/* Set *R = A + B. Return true if the answer is mathematically incorrect due + to overflow; in this case, *R is the low order bits of the correct + answer. */ +static inline bool +check_add_wrapv_size_t (size_t a, size_t b, size_t *r) +{ +#if 5 <= __GNUC__ + return __builtin_add_overflow (a, b, r); +#else + *r = a + b; + return *r < a; +#endif +} + #endif /* _MALLOC_INTERNAL_H */ diff --git a/malloc/tst-char_array.c b/malloc/tst-char_array.c new file mode 100644 index 0000000..53f9482 --- /dev/null +++ b/malloc/tst-char_array.c @@ -0,0 +1,107 @@ +/* Test for char_array. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <string.h> + +#include <malloc/char_array.c> + +#include <malloc.h> +#include <mcheck.h> +#include <stdint.h> +#include <support/check.h> +#include <support/support.h> + +static int +do_test (void) +{ + mtrace (); + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_empty (&str) == true); + TEST_VERIFY_EXIT (char_array_length (&str) == 0); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == true); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing")); + TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's'); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == false); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testing") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str_size (&str, "testing", 4)); + TEST_VERIFY_EXIT (char_array_length (&str) == 4); + TEST_VERIFY_EXIT (char_array_pos (&str, 2) == 's'); + TEST_VERIFY_EXIT (char_array_is_empty (&str) == false); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "test") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_set_str (&str, "abcdef")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcdef") == 0); + TEST_VERIFY_EXIT (char_array_set_str_size (&str, "abcdef", 4)); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "abcd") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "testing")); + TEST_VERIFY_EXIT (char_array_erase (&str, 4) == true); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testng") == 0); + TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str)) + == false); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 1); + TEST_VERIFY_EXIT (char_array_erase (&str, char_array_length (&str) - 1) + == true); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("testing") - 2); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "testn") == 0); + char_array_free (&str); + } + + { + struct char_array str; + TEST_VERIFY_EXIT (char_array_init_str (&str, "test")); + TEST_VERIFY_EXIT (char_array_prepend_str (&str, "123")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test")); + TEST_VERIFY_EXIT (char_array_append_str (&str, "456")); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test456") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test456")); + TEST_VERIFY_EXIT (char_array_replace_str_pos (&str, 7, "789", 3)); + TEST_VERIFY_EXIT (strcmp (char_array_str (&str), "123test789") == 0); + TEST_VERIFY_EXIT (char_array_length (&str) == strlen ("123test789")); + char_array_free (&str); + } + + return 0; +} + +#include <support/test-driver.c>