Message ID | 4EDE5810.1070204@codesourcery.com |
---|---|
State | New |
Headers | show |
On Tue, 06 Dec 2011 17:59:44 +0000 Andrew Stubbs <ams@codesourcery.com> wrote: > This patch adds a one's complement pattern for doing DImode 'not' in > NEON registers. > > There are already patterns for doing one's complement of vectors, and > even though it boils down to the same instruction, the DImode case > was missing. > > The patch needs to be a little more complicated than using a mode > iterator that includes DI because it needs to coexist with the > non-neon one_cmpldi2 (renamed by this patch to "one_cmpldi2_core"). > > OK for when stage 1 opens again? > > Andrew +(define_insn "*one_cmpldi2_neon" + [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w") + (not:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))] + "TARGET_NEON" + "@ + vmvn\t%P0, %P1 + # + # + vmvn\t%P0, %P1" + [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1") + (set_attr "length" "*,8,8,*") + (set_attr "arch" "nota8,*,*,onlya8")] +) Don't you need to specify an element type on those instructions? (".s64" maybe)? Julian
On Tue 06 Dec 2011 18:43:05 GMT, Julian Brown wrote: > On Tue, 06 Dec 2011 17:59:44 +0000 > Andrew Stubbs<ams@codesourcery.com> wrote: > >> This patch adds a one's complement pattern for doing DImode 'not' in >> NEON registers. >> >> There are already patterns for doing one's complement of vectors, and >> even though it boils down to the same instruction, the DImode case >> was missing. >> >> The patch needs to be a little more complicated than using a mode >> iterator that includes DI because it needs to coexist with the >> non-neon one_cmpldi2 (renamed by this patch to "one_cmpldi2_core"). >> >> OK for when stage 1 opens again? >> >> Andrew > > +(define_insn "*one_cmpldi2_neon" > + [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w") > + (not:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))] > + "TARGET_NEON" > + "@ > + vmvn\t%P0, %P1 > + # > + # > + vmvn\t%P0, %P1" > + [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1") > + (set_attr "length" "*,8,8,*") > + (set_attr "arch" "nota8,*,*,onlya8")] > +) > > Don't you need to specify an element type on those instructions? (".s64" > maybe)? No; I believe you can, but it's ignored, given that it makes zero difference. Same for vand, vorr, veor etc. In any case this was copied from the existing "one_cmpl<mode>2" pattern, and gas accepts it. Andrew
On 12/06/2011 09:59 AM, Andrew Stubbs wrote: > +(define_insn "*one_cmpldi2_neon" > + [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w") > + (not:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))] alternative 0 == alternative 3? r~
On Tue 06 Dec 2011 21:05:30 GMT, Richard Henderson wrote: > On 12/06/2011 09:59 AM, Andrew Stubbs wrote: >> +(define_insn "*one_cmpldi2_neon" >> + [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w") >> + (not:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))] > > alternative 0 == alternative 3? Yes and no. This is an idiom used in several places in neon.md. They are the same, but only one or other is enabled at the same time (see the "arch" attribute) so while both 'w' and 'r' options are always available, the order of preference is different. If you have a better way of achieving this goal, then I'm eager to hear it. FYI, the reason is that Cortex-A8 supports NEON, but the cost of transferring values between the register files is prohibitively high so 64-bit NEON operations are discouraged unless the values are already in NEON, or core register pressure makes it more worthwhile. Andrew
On 12/06/2011 01:42 PM, Andrew Stubbs wrote: > On Tue 06 Dec 2011 21:05:30 GMT, Richard Henderson wrote: >> On 12/06/2011 09:59 AM, Andrew Stubbs wrote: >>> +(define_insn "*one_cmpldi2_neon" >>> + [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w") >>> + (not:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))] >> >> alternative 0 == alternative 3? > > Yes and no. This is an idiom used in several places in neon.md. They > are the same, but only one or other is enabled at the same time (see > the "arch" attribute) so while both 'w' and 'r' options are always > available, the order of preference is different. Except that without *, I don't think this is what is actually achieved. Perhaps I'm mistaken about how register preferencing is handled in the current state of the register allocator... r~
On 06/12/11 23:07, Richard Henderson wrote: > On 12/06/2011 01:42 PM, Andrew Stubbs wrote: >> On Tue 06 Dec 2011 21:05:30 GMT, Richard Henderson wrote: >>> On 12/06/2011 09:59 AM, Andrew Stubbs wrote: >>>> +(define_insn "*one_cmpldi2_neon" >>>> + [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w") >>>> + (not:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))] >>> >>> alternative 0 == alternative 3? >> >> Yes and no. This is an idiom used in several places in neon.md. They >> are the same, but only one or other is enabled at the same time (see >> the "arch" attribute) so while both 'w' and 'r' options are always >> available, the order of preference is different. > > Except that without *, I don't think this is what is actually achieved. > Perhaps I'm mistaken about how register preferencing is handled in the > current state of the register allocator... Well .... I can demonstrate that it does work in a simple testcase, and it has a visible affect in benchmark figures. Of course, that doesn't mean it doing it the right way. As far as I understand it the current implementation relies on the left-to-right priority of the constraints. On A9, and other A-series parts, we want it to prefer 'w'-constraint registers, but still use 'r'-constraint registers if those are more convenient. On A8, we want it to prefer 'r' registers, all else being equal, but still have the freedom to use 'w' registers if the values are already there. From reading the internals manual, it's not clear to me what the '*' constraint modifier really means, or how it would work in this case? Could you enlighten me? Andrew
On 06/12/11 17:59, Andrew Stubbs wrote: > This patch adds a one's complement pattern for doing DImode 'not' in > NEON registers. > > There are already patterns for doing one's complement of vectors, and > even though it boils down to the same instruction, the DImode case was > missing. > > The patch needs to be a little more complicated than using a mode > iterator that includes DI because it needs to coexist with the non-neon > one_cmpldi2 (renamed by this patch to "one_cmpldi2_core"). > > OK for when stage 1 opens again? > > Andrew > > > neon-not.patch > > > 2011-12-06 Andrew Stubbs <ams@codesourcery.com> > > gcc/ > * config/arm/arm.md (one_cmpldi2): Rename to ... > (one_cmpldi2_core): ... this, and modify it to prevent it being > used for NEON. > (one_cmpldi2): New define_expand. > * config/arm/neon.md (one_cmpldi2_neon): New define_insn. > +(define_insn_and_split "*one_cmpldi2_core" > + [(set (match_operand:DI 0 "arm_general_register_operand" "=&r,&r") > + (not:DI (match_operand:DI 1 "arm_general_register_operand" "0,r")))] Thinking about it, for an operation with one input and one output, there's no need for the earlyclobber marker when the input is tied to the output (there's no other operand that can be clobbered). Otherwise OK. R.
2011-12-06 Andrew Stubbs <ams@codesourcery.com> gcc/ * config/arm/arm.md (one_cmpldi2): Rename to ... (one_cmpldi2_core): ... this, and modify it to prevent it being used for NEON. (one_cmpldi2): New define_expand. * config/arm/neon.md (one_cmpldi2_neon): New define_insn. --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4199,10 +4199,16 @@ "TARGET_32BIT && TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP_DOUBLE)" "") -(define_insn_and_split "one_cmpldi2" - [(set (match_operand:DI 0 "s_register_operand" "=&r,&r") - (not:DI (match_operand:DI 1 "s_register_operand" "0,r")))] +(define_expand "one_cmpldi2" + [(set (match_operand:DI 0 "s_register_operand" "") + (not:DI (match_operand:DI 1 "s_register_operand" "")))] "TARGET_32BIT" + "") + +(define_insn_and_split "*one_cmpldi2_core" + [(set (match_operand:DI 0 "arm_general_register_operand" "=&r,&r") + (not:DI (match_operand:DI 1 "arm_general_register_operand" "0,r")))] + "TARGET_32BIT && !TARGET_NEON" "#" "TARGET_32BIT && reload_completed" [(set (match_dup 0) (not:SI (match_dup 1))) --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -896,6 +896,20 @@ [(set_attr "neon_type" "neon_int_1")] ) +(define_insn "*one_cmpldi2_neon" + [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r,?w") + (not:DI (match_operand:DI 1 "s_register_operand" " w, 0, r, w")))] + "TARGET_NEON" + "@ + vmvn\t%P0, %P1 + # + # + vmvn\t%P0, %P1" + [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1") + (set_attr "length" "*,8,8,*") + (set_attr "arch" "nota8,*,*,onlya8")] +) + (define_insn "abs<mode>2" [(set (match_operand:VDQW 0 "s_register_operand" "=w") (abs:VDQW (match_operand:VDQW 1 "s_register_operand" "w")))]