Message ID | CAAgBjMmxC+Lk1Fk7o4iGhy0G=svOB3ZoYfAdf835-PUvc5rMZw@mail.gmail.com |
---|---|
State | New |
Headers | show |
Series | [aarch64] PR111702 - ICE in insert_regs after interleave+zip1 vector initialization patch | expand |
On Thu, 23 Nov 2023 at 17:06, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote: > > Hi Richard, > For the test-case mentioned in PR111702, compiling with -O2 > -frounding-math -fstack-protector-all results in following ICE during > cse2 pass: > > test.c: In function 'foo': > test.c:119:1: internal compiler error: in insert_regs, at cse.cc:1120 > 119 | } > | ^ > 0xb7ebb0 insert_regs > ../../gcc/gcc/cse.cc:1120 > 0x1f95134 merge_equiv_classes > ../../gcc/gcc/cse.cc:1764 > 0x1f9b9ab cse_insn > ../../gcc/gcc/cse.cc:4793 > 0x1f9fe30 cse_extended_basic_block > ../../gcc/gcc/cse.cc:6577 > 0x1f9fe30 cse_main > ../../gcc/gcc/cse.cc:6722 > 0x1fa0984 rest_of_handle_cse2 > ../../gcc/gcc/cse.cc:7620 > 0x1fa0984 execute > ../../gcc/gcc/cse.cc:7675 > > This happens only with interleave+zip1 vector initialization with > -frounding-math -fstack-protector-all, while it compiles OK without > -fstack-protector-all. Also, it compiles OK with fallback sequence > code-gen (with or without -fstack-protector-all). Unfortunately, I > haven't been able to reduce the test-case further :/ > > From the test-case, it seems only the vector initializer for type J > uses interleave+zip1 approach, while rest of the vector initializers > use fallback sequence. > > J is defined as: > typedef _Float16 __attribute__((__vector_size__ (16))) J; > > and the initializer is: > (J) { 11654, 4801, 5535, 9743, 61680} > > interleave+zip1 sequence for above initializer J: > mode = V8HF > > vals: (parallel:V8HF [ > (reg:HF 642) > (reg:HF 645) > (reg:HF 648) > (reg:HF 651) > (reg:HF 654) > (const_double:HF 0.0 [0x0.0p+0]) repeated x3 > ]) > > target: (reg:V8HF 641) > seq: > (insn 1058 0 1059 (set (reg:V4HF 657) > (const_vector:V4HF [ > (const_double:HF 0.0 [0x0.0p+0]) repeated x4 > ])) "test.c":81:8 -1 > (nil)) > (insn 1059 1058 1060 (set (reg:V4HF 657) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 642)) > (reg:V4HF 657) > (const_int 1 [0x1]))) "test.c":81:8 -1 > (nil)) > (insn 1060 1059 1061 (set (reg:V4HF 657) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 648)) > (reg:V4HF 657) > (const_int 2 [0x2]))) "test.c":81:8 -1 > (nil)) > (insn 1061 1060 1062 (set (reg:V4HF 657) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 654)) > (reg:V4HF 657) > (const_int 4 [0x4]))) "test.c":81:8 -1 > (nil)) > (insn 1062 1061 1063 (set (reg:V4HF 658) > (const_vector:V4HF [ > (const_double:HF 0.0 [0x0.0p+0]) repeated x4 > ])) "test.c":81:8 -1 > (nil)) > (insn 1063 1062 1064 (set (reg:V4HF 658) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 645)) > (reg:V4HF 658) > (const_int 1 [0x1]))) "test.c":81:8 -1 > (nil)) > (insn 1064 1063 1065 (set (reg:V4HF 658) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 651)) > (reg:V4HF 658) > (const_int 2 [0x2]))) "test.c":81:8 -1 > (nil)) > (insn 1065 1064 0 (set (reg:V8HF 641) > (unspec:V8HF [ > (subreg:V8HF (reg:V4HF 657) 0) > (subreg:V8HF (reg:V4HF 658) 0) > ] UNSPEC_ZIP1)) "test.c":81:8 -1 > (nil)) > > It seems to me that the above sequence correctly initializes the > vector into r641 ? > insns 1058-1061 construct r657 = { r642, r648, r654, 0 } > insns 1062-1064 construct r658 = { r645, r651, 0, 0 } > and zip1 will create r641 = { r642, r645, r648, r651, r654, 0, 0, 0 } > > For the above test, it seems that with interleave+zip1 approach and > -fstack-protector-all, > in cse pass, there are two separate equivalence classes created for > (const_int 1), that need > to be merged in cse_insn: > > if (elt->first_same_value != src_eqv_elt->first_same_value) > { > /* The REG_EQUAL is indicating that two formerly distinct > classes are now equivalent. So merge them. */ > merge_equiv_classes (elt, src_eqv_elt); > > elt equivalence chain: > Equivalence chain for (subreg:QI (reg:V16QI 671) 0): > (subreg:QI (reg:V16QI 671) 0) > (const_int 1 [0x1]) > > src_eqv_elt equivalence chain: > Equivalence chain for (const_int 1 [0x1]): > (reg:QI 34 v2) > (reg:QI 32 v0) > (reg:QI 34 v2) > (const_int 1 [0x1]) > (vec_select:QI (reg:V16QI 671) > (parallel [ > (const_int 1 [0x1]) > ])) > (vec_select:QI (reg:V16QI 32 v0) > (parallel [ > (const_int 1 [0x1]) > ])) > (vec_select:QI (reg:V16QI 33 v1) > (parallel [ > (const_int 2 [0x2]) > ])) > (vec_select:QI (reg:V16QI 33 v1) > (parallel [ > (const_int 1 [0x1]) > ])) > > The issue is that merge_equiv_classes doesn't seem to deal correctly with > multiple occurences of same register in class2 (src_eqv_elt), which > has two occurrences of > (reg:QI 34 v2) > > In merge_equiv_classes, on first iteration, it will remove (reg:QI 34) > from reg_equiv_table > by calling delete_equiv_reg(34), and in insert_regs it will create an > entry for (reg:QI 34) in qty_table with new quantity number, and > create new equivalence in reg_eqv_table. > > When we again come across (reg:QI 34) in class2, it will > unconditionally remove the register from reg_eqv_table, thus making > REG_QTY(34) = -35, even tho (reg:QI 34) is now present in class1 > chain. > > Then in insert_regs, we have: > x: (reg:QI 34 v2) > > classp: > (subreg:QI (reg:V16QI 671) 0) > (reg:QI 34 v2) > (const_int 1 [0x1]) > > And while iterating over elements in classp, we end up with regno == > c_regno == 34. > However, as mentioned above, merge_equiv_classes has deleted entry for > (reg:QI 34) from reg_eqv_table, so it's no longer valid, and thus end > up hitting the following assert: > gcc_assert (REGNO_QTY_VALID_P (c_regno)); > > I am not sure tho why this is triggered only with interleave+zip1 approach with > -fstack-protector-all. > > The attached (untested) patch is a workaround for the above issue -- > In merge_equiv_classes, > while iterating over elements in class2, it simply checks if element > is a reg, and already inserted in class1 with equivalent mode, and > avoids deleting it from reg_eqv_table in that case. > > This avoids hitting the assert, and following is the result of merging > above two classes: > Equivalence chain for (subreg:QI (reg:V16QI 671) 0): > (subreg:QI (reg:V16QI 671) 0) > (reg:QI 34 v2) > (reg:QI 32 v0) > (reg:QI 34 v2) > (const_int 1 [0x1]) > (const_int 1 [0x1]) > (vec_select:QI (reg:V16QI 671) > (parallel [ > (const_int 1 [0x1]) > ])) > (vec_select:QI (reg:V16QI 33 v1) > (parallel [ > (const_int 1 [0x1]) > ])) > (vec_select:QI (reg:V16QI 33 v1) > (parallel [ > (const_int 2 [0x2]) > ])) > (vec_select:QI (reg:V16QI 32 v0) > (parallel [ > (const_int 1 [0x1]) > ])) > > Which seems to be OK (?), but am not sure if this patch is in the > right direction, > and is also not efficient. > Could you please suggest how to proceed ? Hi, ping: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637913.html Thanks, Prathamesh > > Thanks, > Prathamesh
On Mon, 4 Dec 2023 at 14:44, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote: > > On Thu, 23 Nov 2023 at 17:06, Prathamesh Kulkarni > <prathamesh.kulkarni@linaro.org> wrote: > > > > Hi Richard, > > For the test-case mentioned in PR111702, compiling with -O2 > > -frounding-math -fstack-protector-all results in following ICE during > > cse2 pass: > > > > test.c: In function 'foo': > > test.c:119:1: internal compiler error: in insert_regs, at cse.cc:1120 > > 119 | } > > | ^ > > 0xb7ebb0 insert_regs > > ../../gcc/gcc/cse.cc:1120 > > 0x1f95134 merge_equiv_classes > > ../../gcc/gcc/cse.cc:1764 > > 0x1f9b9ab cse_insn > > ../../gcc/gcc/cse.cc:4793 > > 0x1f9fe30 cse_extended_basic_block > > ../../gcc/gcc/cse.cc:6577 > > 0x1f9fe30 cse_main > > ../../gcc/gcc/cse.cc:6722 > > 0x1fa0984 rest_of_handle_cse2 > > ../../gcc/gcc/cse.cc:7620 > > 0x1fa0984 execute > > ../../gcc/gcc/cse.cc:7675 > > > > This happens only with interleave+zip1 vector initialization with > > -frounding-math -fstack-protector-all, while it compiles OK without > > -fstack-protector-all. Also, it compiles OK with fallback sequence > > code-gen (with or without -fstack-protector-all). Unfortunately, I > > haven't been able to reduce the test-case further :/ > > > > From the test-case, it seems only the vector initializer for type J > > uses interleave+zip1 approach, while rest of the vector initializers > > use fallback sequence. > > > > J is defined as: > > typedef _Float16 __attribute__((__vector_size__ (16))) J; > > > > and the initializer is: > > (J) { 11654, 4801, 5535, 9743, 61680} > > > > interleave+zip1 sequence for above initializer J: > > mode = V8HF > > > > vals: (parallel:V8HF [ > > (reg:HF 642) > > (reg:HF 645) > > (reg:HF 648) > > (reg:HF 651) > > (reg:HF 654) > > (const_double:HF 0.0 [0x0.0p+0]) repeated x3 > > ]) > > > > target: (reg:V8HF 641) > > seq: > > (insn 1058 0 1059 (set (reg:V4HF 657) > > (const_vector:V4HF [ > > (const_double:HF 0.0 [0x0.0p+0]) repeated x4 > > ])) "test.c":81:8 -1 > > (nil)) > > (insn 1059 1058 1060 (set (reg:V4HF 657) > > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 642)) > > (reg:V4HF 657) > > (const_int 1 [0x1]))) "test.c":81:8 -1 > > (nil)) > > (insn 1060 1059 1061 (set (reg:V4HF 657) > > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 648)) > > (reg:V4HF 657) > > (const_int 2 [0x2]))) "test.c":81:8 -1 > > (nil)) > > (insn 1061 1060 1062 (set (reg:V4HF 657) > > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 654)) > > (reg:V4HF 657) > > (const_int 4 [0x4]))) "test.c":81:8 -1 > > (nil)) > > (insn 1062 1061 1063 (set (reg:V4HF 658) > > (const_vector:V4HF [ > > (const_double:HF 0.0 [0x0.0p+0]) repeated x4 > > ])) "test.c":81:8 -1 > > (nil)) > > (insn 1063 1062 1064 (set (reg:V4HF 658) > > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 645)) > > (reg:V4HF 658) > > (const_int 1 [0x1]))) "test.c":81:8 -1 > > (nil)) > > (insn 1064 1063 1065 (set (reg:V4HF 658) > > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 651)) > > (reg:V4HF 658) > > (const_int 2 [0x2]))) "test.c":81:8 -1 > > (nil)) > > (insn 1065 1064 0 (set (reg:V8HF 641) > > (unspec:V8HF [ > > (subreg:V8HF (reg:V4HF 657) 0) > > (subreg:V8HF (reg:V4HF 658) 0) > > ] UNSPEC_ZIP1)) "test.c":81:8 -1 > > (nil)) > > > > It seems to me that the above sequence correctly initializes the > > vector into r641 ? > > insns 1058-1061 construct r657 = { r642, r648, r654, 0 } > > insns 1062-1064 construct r658 = { r645, r651, 0, 0 } > > and zip1 will create r641 = { r642, r645, r648, r651, r654, 0, 0, 0 } > > > > For the above test, it seems that with interleave+zip1 approach and > > -fstack-protector-all, > > in cse pass, there are two separate equivalence classes created for > > (const_int 1), that need > > to be merged in cse_insn: > > > > if (elt->first_same_value != src_eqv_elt->first_same_value) > > { > > /* The REG_EQUAL is indicating that two formerly distinct > > classes are now equivalent. So merge them. */ > > merge_equiv_classes (elt, src_eqv_elt); > > > > elt equivalence chain: > > Equivalence chain for (subreg:QI (reg:V16QI 671) 0): > > (subreg:QI (reg:V16QI 671) 0) > > (const_int 1 [0x1]) > > > > src_eqv_elt equivalence chain: > > Equivalence chain for (const_int 1 [0x1]): > > (reg:QI 34 v2) > > (reg:QI 32 v0) > > (reg:QI 34 v2) > > (const_int 1 [0x1]) > > (vec_select:QI (reg:V16QI 671) > > (parallel [ > > (const_int 1 [0x1]) > > ])) > > (vec_select:QI (reg:V16QI 32 v0) > > (parallel [ > > (const_int 1 [0x1]) > > ])) > > (vec_select:QI (reg:V16QI 33 v1) > > (parallel [ > > (const_int 2 [0x2]) > > ])) > > (vec_select:QI (reg:V16QI 33 v1) > > (parallel [ > > (const_int 1 [0x1]) > > ])) > > > > The issue is that merge_equiv_classes doesn't seem to deal correctly with > > multiple occurences of same register in class2 (src_eqv_elt), which > > has two occurrences of > > (reg:QI 34 v2) > > > > In merge_equiv_classes, on first iteration, it will remove (reg:QI 34) > > from reg_equiv_table > > by calling delete_equiv_reg(34), and in insert_regs it will create an > > entry for (reg:QI 34) in qty_table with new quantity number, and > > create new equivalence in reg_eqv_table. > > > > When we again come across (reg:QI 34) in class2, it will > > unconditionally remove the register from reg_eqv_table, thus making > > REG_QTY(34) = -35, even tho (reg:QI 34) is now present in class1 > > chain. > > > > Then in insert_regs, we have: > > x: (reg:QI 34 v2) > > > > classp: > > (subreg:QI (reg:V16QI 671) 0) > > (reg:QI 34 v2) > > (const_int 1 [0x1]) > > > > And while iterating over elements in classp, we end up with regno == > > c_regno == 34. > > However, as mentioned above, merge_equiv_classes has deleted entry for > > (reg:QI 34) from reg_eqv_table, so it's no longer valid, and thus end > > up hitting the following assert: > > gcc_assert (REGNO_QTY_VALID_P (c_regno)); > > > > I am not sure tho why this is triggered only with interleave+zip1 approach with > > -fstack-protector-all. > > > > The attached (untested) patch is a workaround for the above issue -- > > In merge_equiv_classes, > > while iterating over elements in class2, it simply checks if element > > is a reg, and already inserted in class1 with equivalent mode, and > > avoids deleting it from reg_eqv_table in that case. > > > > This avoids hitting the assert, and following is the result of merging > > above two classes: > > Equivalence chain for (subreg:QI (reg:V16QI 671) 0): > > (subreg:QI (reg:V16QI 671) 0) > > (reg:QI 34 v2) > > (reg:QI 32 v0) > > (reg:QI 34 v2) > > (const_int 1 [0x1]) > > (const_int 1 [0x1]) > > (vec_select:QI (reg:V16QI 671) > > (parallel [ > > (const_int 1 [0x1]) > > ])) > > (vec_select:QI (reg:V16QI 33 v1) > > (parallel [ > > (const_int 1 [0x1]) > > ])) > > (vec_select:QI (reg:V16QI 33 v1) > > (parallel [ > > (const_int 2 [0x2]) > > ])) > > (vec_select:QI (reg:V16QI 32 v0) > > (parallel [ > > (const_int 1 [0x1]) > > ])) > > > > Which seems to be OK (?), but am not sure if this patch is in the > > right direction, > > and is also not efficient. > > Could you please suggest how to proceed ? > Hi, > ping: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637913.html Hi, ping * 2: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637913.html Thanks, Prathamesh > > Thanks, > Prathamesh > > > > Thanks, > > Prathamesh
Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes: > Hi Richard, > For the test-case mentioned in PR111702, compiling with -O2 > -frounding-math -fstack-protector-all results in following ICE during > cse2 pass: > > test.c: In function 'foo': > test.c:119:1: internal compiler error: in insert_regs, at cse.cc:1120 > 119 | } > | ^ > 0xb7ebb0 insert_regs > ../../gcc/gcc/cse.cc:1120 > 0x1f95134 merge_equiv_classes > ../../gcc/gcc/cse.cc:1764 > 0x1f9b9ab cse_insn > ../../gcc/gcc/cse.cc:4793 > 0x1f9fe30 cse_extended_basic_block > ../../gcc/gcc/cse.cc:6577 > 0x1f9fe30 cse_main > ../../gcc/gcc/cse.cc:6722 > 0x1fa0984 rest_of_handle_cse2 > ../../gcc/gcc/cse.cc:7620 > 0x1fa0984 execute > ../../gcc/gcc/cse.cc:7675 > > This happens only with interleave+zip1 vector initialization with > -frounding-math -fstack-protector-all, while it compiles OK without > -fstack-protector-all. Also, it compiles OK with fallback sequence > code-gen (with or without -fstack-protector-all). Unfortunately, I > haven't been able to reduce the test-case further :/ > > From the test-case, it seems only the vector initializer for type J > uses interleave+zip1 approach, while rest of the vector initializers > use fallback sequence. > > J is defined as: > typedef _Float16 __attribute__((__vector_size__ (16))) J; > > and the initializer is: > (J) { 11654, 4801, 5535, 9743, 61680} > > interleave+zip1 sequence for above initializer J: > mode = V8HF > > vals: (parallel:V8HF [ > (reg:HF 642) > (reg:HF 645) > (reg:HF 648) > (reg:HF 651) > (reg:HF 654) > (const_double:HF 0.0 [0x0.0p+0]) repeated x3 > ]) > > target: (reg:V8HF 641) > seq: > (insn 1058 0 1059 (set (reg:V4HF 657) > (const_vector:V4HF [ > (const_double:HF 0.0 [0x0.0p+0]) repeated x4 > ])) "test.c":81:8 -1 > (nil)) > (insn 1059 1058 1060 (set (reg:V4HF 657) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 642)) > (reg:V4HF 657) > (const_int 1 [0x1]))) "test.c":81:8 -1 > (nil)) > (insn 1060 1059 1061 (set (reg:V4HF 657) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 648)) > (reg:V4HF 657) > (const_int 2 [0x2]))) "test.c":81:8 -1 > (nil)) > (insn 1061 1060 1062 (set (reg:V4HF 657) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 654)) > (reg:V4HF 657) > (const_int 4 [0x4]))) "test.c":81:8 -1 > (nil)) > (insn 1062 1061 1063 (set (reg:V4HF 658) > (const_vector:V4HF [ > (const_double:HF 0.0 [0x0.0p+0]) repeated x4 > ])) "test.c":81:8 -1 > (nil)) > (insn 1063 1062 1064 (set (reg:V4HF 658) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 645)) > (reg:V4HF 658) > (const_int 1 [0x1]))) "test.c":81:8 -1 > (nil)) > (insn 1064 1063 1065 (set (reg:V4HF 658) > (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 651)) > (reg:V4HF 658) > (const_int 2 [0x2]))) "test.c":81:8 -1 > (nil)) > (insn 1065 1064 0 (set (reg:V8HF 641) > (unspec:V8HF [ > (subreg:V8HF (reg:V4HF 657) 0) > (subreg:V8HF (reg:V4HF 658) 0) > ] UNSPEC_ZIP1)) "test.c":81:8 -1 > (nil)) > > It seems to me that the above sequence correctly initializes the > vector into r641 ? > insns 1058-1061 construct r657 = { r642, r648, r654, 0 } > insns 1062-1064 construct r658 = { r645, r651, 0, 0 } > and zip1 will create r641 = { r642, r645, r648, r651, r654, 0, 0, 0 } > > For the above test, it seems that with interleave+zip1 approach and > -fstack-protector-all, > in cse pass, there are two separate equivalence classes created for > (const_int 1), that need > to be merged in cse_insn: > > if (elt->first_same_value != src_eqv_elt->first_same_value) > { > /* The REG_EQUAL is indicating that two formerly distinct > classes are now equivalent. So merge them. */ > merge_equiv_classes (elt, src_eqv_elt); > > elt equivalence chain: > Equivalence chain for (subreg:QI (reg:V16QI 671) 0): > (subreg:QI (reg:V16QI 671) 0) > (const_int 1 [0x1]) > > src_eqv_elt equivalence chain: > Equivalence chain for (const_int 1 [0x1]): > (reg:QI 34 v2) > (reg:QI 32 v0) > (reg:QI 34 v2) > (const_int 1 [0x1]) > (vec_select:QI (reg:V16QI 671) > (parallel [ > (const_int 1 [0x1]) > ])) > (vec_select:QI (reg:V16QI 32 v0) > (parallel [ > (const_int 1 [0x1]) > ])) > (vec_select:QI (reg:V16QI 33 v1) > (parallel [ > (const_int 2 [0x2]) > ])) > (vec_select:QI (reg:V16QI 33 v1) > (parallel [ > (const_int 1 [0x1]) > ])) > > The issue is that merge_equiv_classes doesn't seem to deal correctly with > multiple occurences of same register in class2 (src_eqv_elt), which > has two occurrences of > (reg:QI 34 v2) I don't think that's supposed to occur. > In merge_equiv_classes, on first iteration, it will remove (reg:QI 34) > from reg_equiv_table > by calling delete_equiv_reg(34), and in insert_regs it will create an > entry for (reg:QI 34) in qty_table with new quantity number, and > create new equivalence in reg_eqv_table. > > When we again come across (reg:QI 34) in class2, it will > unconditionally remove the register from reg_eqv_table, thus making > REG_QTY(34) = -35, even tho (reg:QI 34) is now present in class1 > chain. > > Then in insert_regs, we have: > x: (reg:QI 34 v2) > > classp: > (subreg:QI (reg:V16QI 671) 0) > (reg:QI 34 v2) > (const_int 1 [0x1]) > > And while iterating over elements in classp, we end up with regno == > c_regno == 34. > However, as mentioned above, merge_equiv_classes has deleted entry for > (reg:QI 34) from reg_eqv_table, so it's no longer valid, and thus end > up hitting the following assert: > gcc_assert (REGNO_QTY_VALID_P (c_regno)); > > I am not sure tho why this is triggered only with interleave+zip1 approach with > -fstack-protector-all. > > The attached (untested) patch is a workaround for the above issue -- > In merge_equiv_classes, > while iterating over elements in class2, it simply checks if element > is a reg, and already inserted in class1 with equivalent mode, and > avoids deleting it from reg_eqv_table in that case. > > This avoids hitting the assert, and following is the result of merging > above two classes: > Equivalence chain for (subreg:QI (reg:V16QI 671) 0): > (subreg:QI (reg:V16QI 671) 0) > (reg:QI 34 v2) > (reg:QI 32 v0) > (reg:QI 34 v2) > (const_int 1 [0x1]) > (const_int 1 [0x1]) > (vec_select:QI (reg:V16QI 671) > (parallel [ > (const_int 1 [0x1]) > ])) > (vec_select:QI (reg:V16QI 33 v1) > (parallel [ > (const_int 1 [0x1]) > ])) > (vec_select:QI (reg:V16QI 33 v1) > (parallel [ > (const_int 2 [0x2]) > ])) > (vec_select:QI (reg:V16QI 32 v0) > (parallel [ > (const_int 1 [0x1]) > ])) > > Which seems to be OK (?), but am not sure if this patch is in the > right direction, > and is also not efficient. > Could you please suggest how to proceed ? I think things went wrong when we added the same value twice. Debugging how that happened suggested a further (and IMO more fundemantal) problem with the handling of CONST_VECTORs. I'll post a patch for that in a sec. Thanks, Richard
diff --git a/gcc/cse.cc b/gcc/cse.cc index f9603fdfd43..1e20be457c4 100644 --- a/gcc/cse.cc +++ b/gcc/cse.cc @@ -1747,7 +1747,16 @@ merge_equiv_classes (struct table_elt *class1, struct table_elt *class2) if (REG_P (exp)) { need_rehash = REGNO_QTY_VALID_P (REGNO (exp)); - delete_reg_equiv (REGNO (exp)); + + /* If reg is already inserted into class1 and has a valid new + quantity, avoid deleting it from reg_eqv_table. */ + table_elt *e; + for (e = class1->first_same_value; e; e = e->next_same_value) + if (REG_P (e->exp) && REGNO (e->exp) == REGNO (exp) + && e->mode == mode) + break; + if (e == NULL) + delete_reg_equiv (REGNO (exp)); } if (REG_P (exp) && REGNO (exp) >= FIRST_PSEUDO_REGISTER)