Message ID | CAKdteOZ0M9rxfPkhi49YHQr=0fcLWwFp-wA7spkJBOt7Z_pDSQ@mail.gmail.com |
---|---|
State | Superseded |
Headers | show |
ping ? https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html Christophe On 18 July 2017 at 14:50, Christophe Lyon <christophe.lyon@linaro.org> wrote: > Hello, > > I've received a complaint that GCC for AArch64 would generate > vectorized code relying on unaligned memory accesses even when using > -mstrict-align. This is a problem for code where such accesses lead to > memory faults. > > A previous patch (r243333) introduced > aarch64_builtin_support_vector_misalignment, which rejects such > accesses when the element size is 64 bits, and accept them otherwise, > which I think it shouldn't. The testcase added at that time only used > 64 bits elements, and therefore didn't fully test the patch. > > The report I received is about vectorized accesses to an array of > unsigned chars, whose start address is not aligned on a 128 bits > boundary. > > The attached patch fixes the problem by making > aarch64_builtin_support_vector_misalignment always return false when > the misalignment is not known at compile time. > > I've also added a testcase, which tries to check if the array start > address alignment is checked (using %16, and-ing with #15), so that > loop peeling is performed *before* using vectorized accesses. Without > the patch, vectorized accesses are used at the beginning of the array, > and byte accesses are used for the remainder at the end, and there is > not such 'and wX,wX,15'. > > BTW, I'm not sure about the same hook for arm... it seems to me it has > a similar problem. > > OK? > > Thanks, > > Christophe
ping^2 ? On 21 August 2017 at 15:04, Christophe Lyon <christophe.lyon@linaro.org> wrote: > ping ? > https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html > > Christophe > > > On 18 July 2017 at 14:50, Christophe Lyon <christophe.lyon@linaro.org> wrote: >> Hello, >> >> I've received a complaint that GCC for AArch64 would generate >> vectorized code relying on unaligned memory accesses even when using >> -mstrict-align. This is a problem for code where such accesses lead to >> memory faults. >> >> A previous patch (r243333) introduced >> aarch64_builtin_support_vector_misalignment, which rejects such >> accesses when the element size is 64 bits, and accept them otherwise, >> which I think it shouldn't. The testcase added at that time only used >> 64 bits elements, and therefore didn't fully test the patch. >> >> The report I received is about vectorized accesses to an array of >> unsigned chars, whose start address is not aligned on a 128 bits >> boundary. >> >> The attached patch fixes the problem by making >> aarch64_builtin_support_vector_misalignment always return false when >> the misalignment is not known at compile time. >> >> I've also added a testcase, which tries to check if the array start >> address alignment is checked (using %16, and-ing with #15), so that >> loop peeling is performed *before* using vectorized accesses. Without >> the patch, vectorized accesses are used at the beginning of the array, >> and byte accesses are used for the remainder at the end, and there is >> not such 'and wX,wX,15'. >> >> BTW, I'm not sure about the same hook for arm... it seems to me it has >> a similar problem. >> >> OK? >> >> Thanks, >> >> Christophe
ping^3 ? https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html On 31 August 2017 at 11:22, Christophe Lyon <christophe.lyon@linaro.org> wrote: > ping^2 ? > > > On 21 August 2017 at 15:04, Christophe Lyon <christophe.lyon@linaro.org> wrote: >> ping ? >> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01063.html >> >> Christophe >> >> >> On 18 July 2017 at 14:50, Christophe Lyon <christophe.lyon@linaro.org> wrote: >>> Hello, >>> >>> I've received a complaint that GCC for AArch64 would generate >>> vectorized code relying on unaligned memory accesses even when using >>> -mstrict-align. This is a problem for code where such accesses lead to >>> memory faults. >>> >>> A previous patch (r243333) introduced >>> aarch64_builtin_support_vector_misalignment, which rejects such >>> accesses when the element size is 64 bits, and accept them otherwise, >>> which I think it shouldn't. The testcase added at that time only used >>> 64 bits elements, and therefore didn't fully test the patch. >>> >>> The report I received is about vectorized accesses to an array of >>> unsigned chars, whose start address is not aligned on a 128 bits >>> boundary. >>> >>> The attached patch fixes the problem by making >>> aarch64_builtin_support_vector_misalignment always return false when >>> the misalignment is not known at compile time. >>> >>> I've also added a testcase, which tries to check if the array start >>> address alignment is checked (using %16, and-ing with #15), so that >>> loop peeling is performed *before* using vectorized accesses. Without >>> the patch, vectorized accesses are used at the beginning of the array, >>> and byte accesses are used for the remainder at the end, and there is >>> not such 'and wX,wX,15'. >>> >>> BTW, I'm not sure about the same hook for arm... it seems to me it has >>> a similar problem. >>> >>> OK? >>> >>> Thanks, >>> >>> Christophe
On Tue, Jul 18, 2017 at 5:50 AM, Christophe Lyon <christophe.lyon@linaro.org> wrote: > Hello, > > I've received a complaint that GCC for AArch64 would generate > vectorized code relying on unaligned memory accesses even when using > -mstrict-align. This is a problem for code where such accesses lead to > memory faults. > > A previous patch (r243333) introduced > aarch64_builtin_support_vector_misalignment, which rejects such > accesses when the element size is 64 bits, and accept them otherwise, > which I think it shouldn't. The testcase added at that time only used > 64 bits elements, and therefore didn't fully test the patch. > > The report I received is about vectorized accesses to an array of > unsigned chars, whose start address is not aligned on a 128 bits > boundary. > > The attached patch fixes the problem by making > aarch64_builtin_support_vector_misalignment always return false when > the misalignment is not known at compile time. > > I've also added a testcase, which tries to check if the array start > address alignment is checked (using %16, and-ing with #15), so that > loop peeling is performed *before* using vectorized accesses. Without > the patch, vectorized accesses are used at the beginning of the array, > and byte accesses are used for the remainder at the end, and there is > not such 'and wX,wX,15'. > > BTW, I'm not sure about the same hook for arm... it seems to me it has > a similar problem. > > OK? I would keep part of the comment: - /* Misalignment factor is unknown at compile time but we know - it's word aligned. */ Something like: /* Misalignment factor is unknown at compile time. */ Otherwise it is not obvious what -1 means. Other than I think this patch is good (I cannot approve though). Thanks, Andrew > > Thanks, > > Christophe
Hi, On 11 September 2017 at 10:45, Andrew Pinski <pinskia@gmail.com> wrote: > On Tue, Jul 18, 2017 at 5:50 AM, Christophe Lyon > <christophe.lyon@linaro.org> wrote: >> Hello, >> >> I've received a complaint that GCC for AArch64 would generate >> vectorized code relying on unaligned memory accesses even when using >> -mstrict-align. This is a problem for code where such accesses lead to >> memory faults. >> >> A previous patch (r243333) introduced >> aarch64_builtin_support_vector_misalignment, which rejects such >> accesses when the element size is 64 bits, and accept them otherwise, >> which I think it shouldn't. The testcase added at that time only used >> 64 bits elements, and therefore didn't fully test the patch. >> >> The report I received is about vectorized accesses to an array of >> unsigned chars, whose start address is not aligned on a 128 bits >> boundary. >> >> The attached patch fixes the problem by making >> aarch64_builtin_support_vector_misalignment always return false when >> the misalignment is not known at compile time. >> >> I've also added a testcase, which tries to check if the array start >> address alignment is checked (using %16, and-ing with #15), so that >> loop peeling is performed *before* using vectorized accesses. Without >> the patch, vectorized accesses are used at the beginning of the array, >> and byte accesses are used for the remainder at the end, and there is >> not such 'and wX,wX,15'. >> >> BTW, I'm not sure about the same hook for arm... it seems to me it has >> a similar problem. >> >> OK? > > I would keep part of the comment: > - /* Misalignment factor is unknown at compile time but we know > - it's word aligned. */ > > Something like: > /* Misalignment factor is unknown at compile time. */ > > Otherwise it is not obvious what -1 means. > > Other than I think this patch is good (I cannot approve though). > Here is an updated patch, with the comment added as Andrew suggested. OK? Thanks, Christophe > Thanks, > Andrew > >> >> Thanks, >> >> Christophe 2017-09-20 Christophe Lyon <christophe.lyon@linaro.org> PR target/71727 gcc/ * config/aarch64/aarch64.c (aarch64_builtin_support_vector_misalignment): Always return false when misalignment is unknown. gcc/testsuite/ * gcc.target/aarch64/pr71727-2.c: New test. diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 799989a..7cc67ec 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -11757,19 +11757,9 @@ aarch64_builtin_support_vector_misalignment (machine_mode mode, if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing) return false; + /* Misalignment factor is unknown at compile time. */ if (misalignment == -1) - { - /* Misalignment factor is unknown at compile time but we know - it's word aligned. */ - if (aarch64_simd_vector_alignment_reachable (type, is_packed)) - { - int element_size = TREE_INT_CST_LOW (TYPE_SIZE (type)); - - if (element_size != 64) - return true; - } - return false; - } + return false; } return default_builtin_support_vector_misalignment (mode, type, misalignment, is_packed); diff --git a/gcc/testsuite/gcc.target/aarch64/pr71727-2.c b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c new file mode 100644 index 0000000..2bc803a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mstrict-align -O3" } */ + +unsigned char foo(const unsigned char *buffer, unsigned int length) +{ + unsigned char sum; + unsigned int count; + + for (sum = 0, count = 0; count < length; count++) { + sum = (unsigned char) (sum + *(buffer + count)); + } + + return sum; +} + +/* { dg-final { scan-assembler-times "and\tw\[0-9\]+, w\[0-9\]+, 15" 1 } } */
ping? On 20 September 2017 at 15:17, Christophe Lyon <christophe.lyon@linaro.org> wrote: > Hi, > > On 11 September 2017 at 10:45, Andrew Pinski <pinskia@gmail.com> wrote: >> On Tue, Jul 18, 2017 at 5:50 AM, Christophe Lyon >> <christophe.lyon@linaro.org> wrote: >>> Hello, >>> >>> I've received a complaint that GCC for AArch64 would generate >>> vectorized code relying on unaligned memory accesses even when using >>> -mstrict-align. This is a problem for code where such accesses lead to >>> memory faults. >>> >>> A previous patch (r243333) introduced >>> aarch64_builtin_support_vector_misalignment, which rejects such >>> accesses when the element size is 64 bits, and accept them otherwise, >>> which I think it shouldn't. The testcase added at that time only used >>> 64 bits elements, and therefore didn't fully test the patch. >>> >>> The report I received is about vectorized accesses to an array of >>> unsigned chars, whose start address is not aligned on a 128 bits >>> boundary. >>> >>> The attached patch fixes the problem by making >>> aarch64_builtin_support_vector_misalignment always return false when >>> the misalignment is not known at compile time. >>> >>> I've also added a testcase, which tries to check if the array start >>> address alignment is checked (using %16, and-ing with #15), so that >>> loop peeling is performed *before* using vectorized accesses. Without >>> the patch, vectorized accesses are used at the beginning of the array, >>> and byte accesses are used for the remainder at the end, and there is >>> not such 'and wX,wX,15'. >>> >>> BTW, I'm not sure about the same hook for arm... it seems to me it has >>> a similar problem. >>> >>> OK? >> >> I would keep part of the comment: >> - /* Misalignment factor is unknown at compile time but we know >> - it's word aligned. */ >> >> Something like: >> /* Misalignment factor is unknown at compile time. */ >> >> Otherwise it is not obvious what -1 means. >> >> Other than I think this patch is good (I cannot approve though). >> > > Here is an updated patch, with the comment added as Andrew suggested. > > OK? > > Thanks, > > Christophe > >> Thanks, >> Andrew >> >>> >>> Thanks, >>> >>> Christophe
On Wed, Sep 27, 2017 at 06:25:56PM +0100, Christophe Lyon wrote: > ping? OK, thanks. Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com> James > > On 20 September 2017 at 15:17, Christophe Lyon > <christophe.lyon@linaro.org> wrote: > > Hi, > > > > On 11 September 2017 at 10:45, Andrew Pinski <pinskia@gmail.com> wrote: > >> On Tue, Jul 18, 2017 at 5:50 AM, Christophe Lyon > >> <christophe.lyon@linaro.org> wrote: > >>> Hello, > >>> > >>> I've received a complaint that GCC for AArch64 would generate > >>> vectorized code relying on unaligned memory accesses even when using > >>> -mstrict-align. This is a problem for code where such accesses lead to > >>> memory faults. > >>> > >>> A previous patch (r243333) introduced > >>> aarch64_builtin_support_vector_misalignment, which rejects such > >>> accesses when the element size is 64 bits, and accept them otherwise, > >>> which I think it shouldn't. The testcase added at that time only used > >>> 64 bits elements, and therefore didn't fully test the patch. > >>> > >>> The report I received is about vectorized accesses to an array of > >>> unsigned chars, whose start address is not aligned on a 128 bits > >>> boundary. > >>> > >>> The attached patch fixes the problem by making > >>> aarch64_builtin_support_vector_misalignment always return false when > >>> the misalignment is not known at compile time. > >>> > >>> I've also added a testcase, which tries to check if the array start > >>> address alignment is checked (using %16, and-ing with #15), so that > >>> loop peeling is performed *before* using vectorized accesses. Without > >>> the patch, vectorized accesses are used at the beginning of the array, > >>> and byte accesses are used for the remainder at the end, and there is > >>> not such 'and wX,wX,15'. > >>> > >>> BTW, I'm not sure about the same hook for arm... it seems to me it has > >>> a similar problem. > >>> > >>> OK? > >> > >> I would keep part of the comment: > >> - /* Misalignment factor is unknown at compile time but we know > >> - it's word aligned. */ > >> > >> Something like: > >> /* Misalignment factor is unknown at compile time. */ > >> > >> Otherwise it is not obvious what -1 means. > >> > >> Other than I think this patch is good (I cannot approve though). > >> > > > > Here is an updated patch, with the comment added as Andrew suggested. > > > > OK? > > > > Thanks, > > > > Christophe > > > >> Thanks, > >> Andrew > >> > >>> > >>> Thanks, > >>> > >>> Christophe
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 799989a..12a9fbe 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -11758,18 +11758,7 @@ aarch64_builtin_support_vector_misalignment (machine_mode mode, return false; if (misalignment == -1) - { - /* Misalignment factor is unknown at compile time but we know - it's word aligned. */ - if (aarch64_simd_vector_alignment_reachable (type, is_packed)) - { - int element_size = TREE_INT_CST_LOW (TYPE_SIZE (type)); - - if (element_size != 64) - return true; - } - return false; - } + return false; } return default_builtin_support_vector_misalignment (mode, type, misalignment, is_packed); diff --git a/gcc/testsuite/gcc.target/aarch64/pr71727-2.c b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c new file mode 100644 index 0000000..8935a72 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-mstrict-align -O3" } */ + +unsigned char foo(const unsigned char *buffer, unsigned int length) +{ + unsigned char sum; + unsigned int count; + + for (sum = 0, count = 0; count < length; count++) { + sum = (unsigned char) (sum + *(buffer + count)); + } + + return sum; +} + +/* { dg-final { scan-assembler-times "and\tw\[0-9\]+, w\[0-9\]+, 15" 1 } } */