Message ID | 1506107512-5013-1-git-send-email-adhemerval.zanella@linaro.org |
---|---|
State | New |
Headers | show |
Series | [v2] posix: Fix glob with GLOB_NOCHECK returning modified patterns (BZ#10246) | expand |
Adhemerval Zanella wrote: > + return (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0) Use __glibc_unlikely instead of __builtin_expect with 0, here and elsewhere. > + ? (*pglob->gl_lstat) (fname, &ust->st) No need for the "*" or for the initial pair of parens, both here and elsewhere. > + : __lstat64 (fname, &ust->st64)); The ? and : should be indented one more column than the (. > +} > + > +static void * > +glob_opendir (glob_t *pglob, int flags, const char *directory) > +{ > + return (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0) > + ? (*pglob->gl_opendir) (directory) > + : opendir (directory)); > } The indenting is not right there: ? and : should have the same indentation. > @@ -1332,6 +1334,17 @@ glob_in_dir (const char *pattern, const char *directory, int flags, > > if (fnmatch (pattern, d.name, fnm_flags) == 0) > { > + /* We need to certify that the link is a directory. */ > + if (flags & GLOB_ONLYDIR && readdir_result_type (d) == DT_LNK > + { > + void *ss = glob_opendir (pglob, flags, d.name); I'm afraid there are several problems here. GLOB_ONLYDIR does not mean, "return only directories". It means, "the caller is interested only in directories, so if you can easily determine that the result is not a directory, don't bother to return it. However, if it is expensive to check then it is OK to return the result anyway, as it is the caller's responsibility to check that returned strings are directories." So the comment is wrong here, and the code looks wrong too: calling glob_opendir is expensive, so it should not be done here. Also, why use glob_opendir to test whether d.name is a directory, when the rest of the code uses is_dir to do that? Also, if readdir_result_type (d) is DT_UNKNOWN, d.name might be a directory, so the code should check in that case too. Also, d.name is just a file name component: it should be absolute, or relative to the working directory. Also, the code mistakenly treats leading "//" as if it were "/"; this is not correct, and does not work on some platforms. POSIX allows implementations where "//" is distinct from "/". Also, it's not clear to me that the code treats strings of one or more "/"s correctly, in the pattern.
On 23/09/2017 03:46, Paul Eggert wrote: > Adhemerval Zanella wrote: > >> + return (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0) > > Use __glibc_unlikely instead of __builtin_expect with 0, here and elsewhere. Ack. > >> + ? (*pglob->gl_lstat) (fname, &ust->st) > > No need for the "*" or for the initial pair of parens, both here and elsewhere. Ack. > >> + : __lstat64 (fname, &ust->st64)); > > The ? and : should be indented one more column than the (. Ack. > >> +} >> + >> +static void * >> +glob_opendir (glob_t *pglob, int flags, const char *directory) >> +{ >> + return (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0) >> + ? (*pglob->gl_opendir) (directory) >> + : opendir (directory)); >> } > > The indenting is not right there: ? and : should have the same indentation. Ack. > >> @@ -1332,6 +1334,17 @@ glob_in_dir (const char *pattern, const char *directory, int flags, >> if (fnmatch (pattern, d.name, fnm_flags) == 0) >> { >> + /* We need to certify that the link is a directory. */ >> + if (flags & GLOB_ONLYDIR && readdir_result_type (d) == DT_LNK > + { >> + void *ss = glob_opendir (pglob, flags, d.name); > > I'm afraid there are several problems here. GLOB_ONLYDIR does not mean, "return only directories". It means, "the caller is interested only in directories, so if you can easily determine that the result is not a directory, don't bother to return it. However, if it is expensive to check then it is OK to return the result anyway, as it is the caller's responsibility to check that returned strings are directories." So the comment is wrong here, and the code looks wrong too: calling glob_opendir is expensive, so it should not be done here. I tried to defer the call that check if the link is actually pointing to a directory as late as possible. Without actually following the possible directory the obtained matched name the example you provided: ln -s /no-such-file globlink1 ln -s . globlink2 Will be matched only for globlink1. > > Also, why use glob_opendir to test whether d.name is a directory, when the rest of the code uses is_dir to do that? Good point, it can be simplified to use is_dir indeed. > > Also, if readdir_result_type (d) is DT_UNKNOWN, d.name might be a directory, so the code should check in that case too. I excluded DT_UNKNOWN mainly because posix/tst-gnuglob.c uses a for the synthetic directory to tests some entries as DT_UNKNOWN and it does not expect them to be matches. If we change to also match DT_UNKNOWN for possible directories entries it would meant change slight current glob semantic. > > Also, d.name is just a file name component: it should be absolute, or relative to the working directory. Not really sure what this may lead to issue in current patch, care to elaborate? > > Also, the code mistakenly treats leading "//" as if it were "/"; this is not correct, and does not work on some platforms. POSIX allows implementations where "//" is distinct from "/". Indeed, although for *glibc* supported platforms this is not an issue. I don't consider this a block for glibc inclusion, but I am open to suggestion on how to handle correctly on platforms with handle '//' distinct from '/'. > > Also, it's not clear to me that the code treats strings of one or more "/"s correctly, in the pattern. Same as previous remark, using the same testcase: glob_t g; int res = glob ("globlink[12]/", 0, NULL, &g); assert (res == 0 && g.gl_pathc == 1); assert (strcmp (g.gl_pathv[0], "globlink2/") == 0); Using either 'globlink[12]//' or 'globlink[12]///' will handle the same results as 'globlink[12]/'. This still might be not the correct solution for platform that handle '//' distinct from '/', but again I do not think it is block for *glibc*.
Adhemerval Zanella wrote: > Without actually following the possible directory the obtained > matched name the example you provided: > > ln -s /no-such-file globlink1 > ln -s . globlink2 > > Will be matched only for globlink1. You're correct that there's a bug here, that needs to be fixed. But the proposed patch changes the semantics of GLOB_ONLYDIR contrary to its intent. GLOB_ONLYDIR is supposed to be an efficiency flag, to let callers run faster. By making GLOB_ONLYDIR run slower, the patch is contrary to this expectation. The bug needs to be fixed, but not this way. It's the caller's responsibility to check that the returned value is indeed a directory, and that's how the fix should be written: by fixing the caller to check that the result is indeed a directory, not by slowing down the callee. >> Also, if readdir_result_type (d) is DT_UNKNOWN, d.name might be a directory, so the code should check in that case too. > > I excluded DT_UNKNOWN mainly because posix/tst-gnuglob.c uses a for the synthetic > directory to tests some entries as DT_UNKNOWN and it does not expect them to > be matches. That sounds like a bug in the test case, then. DT_UNKNOWN is returned for file systems that do not have the file type in the inode. A DT_UNKNOWN entry might be a directory, or a symlink, or anything else. > If we change to also match DT_UNKNOWN for possible directories > entries it would meant change slight current glob semantic. How so? DT_UNKNOWN merely means that glob can't do some optimizations. This should not change the semantics, only performance. >> Also, d.name is just a file name component: it should be absolute, or relative to the working directory. > > Not really sure what this may lead to issue in current patch, care to > elaborate? As I recall, d.name is relative to the parent directory: it has no slashes. The argument to is_dir is supposed to be the full filename, either absolute or relative to the current working directory. >> Also, the code mistakenly treats leading "//" as if it were "/"; this is not correct, and does not work on some platforms. POSIX allows implementations where "//" is distinct from "/". > > Indeed, although for *glibc* supported platforms this is not an issue. Fair enough, we can fix that separately. > Using either 'globlink[12]//' or 'globlink[12]///' will handle the same > results as 'globlink[12]/'. Every result returned by globlink[12]/// should end in three slashes. The number of slashes in the result should equal the number of slashes in the pattern. This general issue differs from the issue about // vs / at the start of the file name, as the trailing slashes are portable to all filesystems.
diff --git a/posix/glob.c b/posix/glob.c index c699177..391aa9c 100644 --- a/posix/glob.c +++ b/posix/glob.c @@ -128,36 +128,58 @@ readdir_result_type (struct readdir_result d) # define GL_READDIR(pglob, stream) ((pglob)->gl_readdir (stream)) #endif -/* Extract name and type from directory entry. No copy of the name is - made. If SOURCE is NULL, result name is NULL. Keep in sync with - convert_dirent64 below. */ -static struct readdir_result -convert_dirent (const struct dirent *source) + +union glob_stat { - if (source == NULL) - { - struct readdir_result result = { NULL, }; - return result; - } - struct readdir_result result = READDIR_RESULT_INITIALIZER (source); - return result; + struct stat st; + struct_stat64 st64; +}; + +static int +glob_lstat (glob_t *pglob, int flags, const char *fname, + union glob_stat *ust) +{ + return (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0) + ? (*pglob->gl_lstat) (fname, &ust->st) + : __lstat64 (fname, &ust->st64)); +} + +static void * +glob_opendir (glob_t *pglob, int flags, const char *directory) +{ + return (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0) + ? (*pglob->gl_opendir) (directory) + : opendir (directory)); } -#ifndef COMPILE_GLOB64 -/* Like convert_dirent, but works on struct dirent64 instead. Keep in - sync with convert_dirent above. */ static struct readdir_result -convert_dirent64 (const struct dirent64 *source) +glob_readdir (glob_t *pglob, int flags, void *stream) { - if (source == NULL) + if (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0)) { - struct readdir_result result = { NULL, }; - return result; + const struct dirent *src = GL_READDIR (pglob, stream); + if (src == NULL) + return (struct readdir_result) { NULL, }; + return (struct readdir_result) READDIR_RESULT_INITIALIZER (src); } - struct readdir_result result = READDIR_RESULT_INITIALIZER (source); - return result; -} +#ifdef COMPILE_GLOB64 + const struct dirent *source = __readdir (stream); +#else + const struct dirent64 *source = __readdir64 (stream); #endif + if (source == NULL) + return (struct readdir_result) { NULL, }; + return (struct readdir_result) READDIR_RESULT_INITIALIZER (source); +} + +static void +glob_closedir (glob_t *pglob, int flags, void *stream) +{ + if (__glibc_unlikely (flags & GLOB_ALTDIRFUNC)) + (*pglob->gl_closedir) (stream); + else + closedir (stream); +} #ifndef _LIBC /* The results of opendir() in this file are not used with dirfd and fchdir, @@ -271,6 +293,8 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int), size_t oldcount; int meta; int dirname_modified; + /* Indicate if the directory should be prepended on return values. */ + bool dirname_prefix = true; int malloc_dirname = 0; glob_t dirs; int retval = 0; @@ -494,6 +518,10 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int), dirname = (char *) "/"; dirlen = 1; ++filename; + /* prefix_array adds a separator for each result and DIRNAME is + already '/'. So we indicate later that we should not prepend + anything for this specific case. */ + dirname_prefix = false; } else { @@ -1085,7 +1113,7 @@ glob (const char *pattern, int flags, int (*errfunc) (const char *, int), if (dirlen > 0) { /* Stick the directory on the front of each name. */ - if (prefix_array (dirname, + if (prefix_array (dirname_prefix ? dirname : "", &pglob->gl_pathv[old_pathc + pglob->gl_offs], pglob->gl_pathc - old_pathc)) { @@ -1166,11 +1194,6 @@ prefix_array (const char *dirname, char **array, size_t n) size_t dirlen = strlen (dirname); char dirsep_char = '/'; - if (dirlen == 1 && dirname[0] == '/') - /* DIRNAME is just "/", so normal prepending would get us "//foo". - We want "/foo" instead, so don't prepend any chars from DIRNAME. */ - dirlen = 0; - #if defined __MSDOS__ || defined WINDOWS32 if (dirlen > 1) { @@ -1250,11 +1273,7 @@ glob_in_dir (const char *pattern, const char *directory, int flags, } else if (meta == GLOBPAT_NONE) { - union - { - struct stat st; - struct_stat64 st64; - } ust; + union glob_stat ust; size_t patlen = strlen (pattern); size_t fullsize; bool alloca_fullname @@ -1273,10 +1292,7 @@ glob_in_dir (const char *pattern, const char *directory, int flags, mempcpy (mempcpy (mempcpy (fullname, directory, dirlen), "/", 1), pattern, patlen + 1); - if (((__builtin_expect (flags & GLOB_ALTDIRFUNC, 0) - ? (*pglob->gl_lstat) (fullname, &ust.st) - : __lstat64 (fullname, &ust.st64)) - == 0) + if (glob_lstat (pglob, flags, fullname, &ust) == 0 || errno == EOVERFLOW) /* We found this file to be existing. Now tell the rest of the function to copy this name into the result. */ @@ -1287,9 +1303,7 @@ glob_in_dir (const char *pattern, const char *directory, int flags, } else { - stream = (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0) - ? (*pglob->gl_opendir) (directory) - : opendir (directory)); + stream = glob_opendir (pglob, flags, directory); if (stream == NULL) { if (errno != ENOTDIR @@ -1305,19 +1319,7 @@ glob_in_dir (const char *pattern, const char *directory, int flags, while (1) { - struct readdir_result d; - { - if (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0)) - d = convert_dirent (GL_READDIR (pglob, stream)); - else - { -#ifdef COMPILE_GLOB64 - d = convert_dirent (__readdir (stream)); -#else - d = convert_dirent64 (__readdir64 (stream)); -#endif - } - } + struct readdir_result d = glob_readdir (pglob, flags, stream); if (d.name == NULL) break; @@ -1332,6 +1334,17 @@ glob_in_dir (const char *pattern, const char *directory, int flags, if (fnmatch (pattern, d.name, fnm_flags) == 0) { + /* We need to certify that the link is a directory. */ + if (flags & GLOB_ONLYDIR && readdir_result_type (d) == DT_LNK) + { + void *ss = glob_opendir (pglob, flags, d.name); + if (ss == NULL) + { + glob_closedir (pglob, flags, ss); + continue; + } + } + if (cur == names->count) { struct globnames *newnames; @@ -1452,10 +1465,7 @@ glob_in_dir (const char *pattern, const char *directory, int flags, if (stream != NULL) { save = errno; - if (__glibc_unlikely (flags & GLOB_ALTDIRFUNC)) - (*pglob->gl_closedir) (stream); - else - closedir (stream); + glob_closedir (pglob, flags, stream); __set_errno (save); } diff --git a/posix/globtest.sh b/posix/globtest.sh index 73f7ae3..92a8e37 100755 --- a/posix/globtest.sh +++ b/posix/globtest.sh @@ -242,6 +242,43 @@ if test $failed -ne 0; then result=1 fi +# Test NOCHECK for specific cases where the pattern used starts +# with '/' (BZ#10246). +failed=0 +${test_program_prefix} \ +${common_objpfx}posix/globtest -c "$testdir" "/%" | +sort > $testout +cat <<"EOF" | $CMP - $testout >> $logfile || failed=1 +`/%' +EOF +if test $failed -ne 0; then + echo "No check test failed" >> $logfile + result=1 +fi + +${test_program_prefix} \ +${common_objpfx}posix/globtest -c "$testdir" "//%" | +sort > $testout +cat <<"EOF" | $CMP - $testout >> $logfile || failed=1 +`//%' +EOF +if test $failed -ne 0; then + echo "No check test failed" >> $logfile + result=1 +fi + +${test_program_prefix} \ +${common_objpfx}posix/globtest -c "$testdir" "///%" | +sort > $testout +cat <<"EOF" | $CMP - $testout >> $logfile || failed=1 +`///%' +EOF +if test $failed -ne 0; then + echo "No check test failed" >> $logfile + result=1 +fi + + # Test NOMAGIC without magic characters failed=0 ${test_program_prefix} \ diff --git a/posix/tst-glob_symlinks.c b/posix/tst-glob_symlinks.c index 5c4b4ec..c953204 100644 --- a/posix/tst-glob_symlinks.c +++ b/posix/tst-glob_symlinks.c @@ -131,5 +131,11 @@ do_test (void) TEST_VERIFY_EXIT (strcmp (gl.gl_pathv[0], dangling_link) == 0); globfree (&gl); + /* glob should not try to match link the dangling directories. */ + snprintf (buf, sizeof buf, "/tmp/dangling-symlink-dir-tst-glob*/"); + fprintf (stderr, "%s\n", buf); + TEST_VERIFY_EXIT (glob (buf, 0, NULL, &gl) == GLOB_NOMATCH); + globfree (&gl); + return 0; }