Message ID | 20190807045335.1361-4-richard.henderson@linaro.org |
---|---|
State | New |
Headers | show |
Series | target/arm: decodetree prep patches | expand |
On Wed, 7 Aug 2019 at 05:53, Richard Henderson <richard.henderson@linaro.org> wrote: > > We currently have 3 different ways of computing the architectural > value of "PC" as seen in the ARM ARM. > > The value of s->pc has been incremented past the current insn, > but that is all. Thus for a32, PC = s->pc + 4; for t32, PC = s->pc; > for t16, PC = s->pc + 2. These differing computations make it > impossible at present to unify the various code paths. > > With the newly introduced s->pc_curr, we can compute the correct > value for all cases, using the formula given in the ARM ARM. > > Signed-off-by: Richard Henderson <richard.henderson@linaro.org> > --- > target/arm/translate.c | 59 ++++++++++++++++-------------------------- > 1 file changed, 23 insertions(+), 36 deletions(-) > > diff --git a/target/arm/translate.c b/target/arm/translate.c > index 59e35aafbf..61933865d5 100644 > --- a/target/arm/translate.c > +++ b/target/arm/translate.c > @@ -196,17 +196,17 @@ static inline void store_cpu_offset(TCGv_i32 var, int offset) > #define store_cpu_field(var, name) \ > store_cpu_offset(var, offsetof(CPUARMState, name)) > > +/* The architectural value of PC. */ > +static uint32_t read_pc(DisasContext *s) > +{ > + return s->pc_curr + (s->thumb ? 4 : 8); > +} > + > /* Set a variable to the value of a CPU register. */ > static void load_reg_var(DisasContext *s, TCGv_i32 var, int reg) > { > if (reg == 15) { > - uint32_t addr; > - /* normally, since we updated PC, we need only to add one insn */ > - if (s->thumb) > - addr = (long)s->pc + 2; > - else > - addr = (long)s->pc + 4; > - tcg_gen_movi_i32(var, addr); > + tcg_gen_movi_i32(var, read_pc(s)); So previously: * for A32 we would return s->pc + 4, which is the same as s->pc_curr + 8 * for T16 we would return s->pc + 2, which is the same as s->pc_curr + 4 * for T32 we would return s->pc + 2 -- but that's not the same as s->pc_curr + 4, it's s->pc_curr + 6... Since s->pc_curr + 4 is the right architectural answer, are we fixing a bug here? Or are all the places where T32 code calls this function UNPREDICTABLE for the reg == 15 case ? thanks -- PMM
On 8/7/19 10:27 AM, Peter Maydell wrote: >> +/* The architectural value of PC. */ >> +static uint32_t read_pc(DisasContext *s) >> +{ >> + return s->pc_curr + (s->thumb ? 4 : 8); >> +} >> + >> /* Set a variable to the value of a CPU register. */ >> static void load_reg_var(DisasContext *s, TCGv_i32 var, int reg) >> { >> if (reg == 15) { >> - uint32_t addr; >> - /* normally, since we updated PC, we need only to add one insn */ >> - if (s->thumb) >> - addr = (long)s->pc + 2; >> - else >> - addr = (long)s->pc + 4; >> - tcg_gen_movi_i32(var, addr); >> + tcg_gen_movi_i32(var, read_pc(s)); > > So previously: > * for A32 we would return s->pc + 4, which is the same as s->pc_curr + 8 > * for T16 we would return s->pc + 2, which is the same as s->pc_curr + 4 > * for T32 we would return s->pc + 2 -- but that's not the same as > s->pc_curr + 4, it's s->pc_curr + 6... > > Since s->pc_curr + 4 is the right architectural answer, are we > fixing a bug here? Or are all the places where T32 code calls > this function UNPREDICTABLE for the reg == 15 case ? I believe that this is UNPREDICTABLE. The T32 cases that reference the PC that are not UNPREDICTABLE, literal memory references and adr, are all of the form (s->pc & ~3) and do not come through load_reg_var(). Those will be unified by add_reg_for_lit() in the next patch. r~
On Wed, 7 Aug 2019 at 19:04, Richard Henderson <richard.henderson@linaro.org> wrote: > > On 8/7/19 10:27 AM, Peter Maydell wrote: > >> +/* The architectural value of PC. */ > >> +static uint32_t read_pc(DisasContext *s) > >> +{ > >> + return s->pc_curr + (s->thumb ? 4 : 8); > >> +} > >> + > >> /* Set a variable to the value of a CPU register. */ > >> static void load_reg_var(DisasContext *s, TCGv_i32 var, int reg) > >> { > >> if (reg == 15) { > >> - uint32_t addr; > >> - /* normally, since we updated PC, we need only to add one insn */ > >> - if (s->thumb) > >> - addr = (long)s->pc + 2; > >> - else > >> - addr = (long)s->pc + 4; > >> - tcg_gen_movi_i32(var, addr); > >> + tcg_gen_movi_i32(var, read_pc(s)); > > > > So previously: > > * for A32 we would return s->pc + 4, which is the same as s->pc_curr + 8 > > * for T16 we would return s->pc + 2, which is the same as s->pc_curr + 4 > > * for T32 we would return s->pc + 2 -- but that's not the same as > > s->pc_curr + 4, it's s->pc_curr + 6... > > > > Since s->pc_curr + 4 is the right architectural answer, are we > > fixing a bug here? Or are all the places where T32 code calls > > this function UNPREDICTABLE for the reg == 15 case ? > > I believe that this is UNPREDICTABLE. > > The T32 cases that reference the PC that are not UNPREDICTABLE, literal memory > references and adr, are all of the form (s->pc & ~3) and do not come through > load_reg_var(). Those will be unified by add_reg_for_lit() in the next patch. Yeah, that was my assumption -- at some previous point rather than making load_reg/load_reg_var do the right thing for 32-bit Thumb insns we just fixed up all the callsites to specialcase 15... How about we add this to the commit message? This changes the behaviour for load_reg() and load_reg_var() when called with reg==15 from a 32-bit Thumb instruction: previously they would have returned the incorrect value of pc_curr + 6, and now they will return the architecturally correct value of PC, which is pc_curr + 4. This will not affect well-behaved guest software, because all of the places we call these functions from T32 code are instructions where using r15 is UNPREDICTABLE. Using the architectural PC value here is more consistent with the T16 and A32 behaviour. thanks -- PMM
On 8/7/19 11:16 AM, Peter Maydell wrote: > How about we add this to the commit message? > > This changes the behaviour for load_reg() and load_reg_var() > when called with reg==15 from a 32-bit Thumb instruction: > previously they would have returned the incorrect value > of pc_curr + 6, and now they will return the architecturally > correct value of PC, which is pc_curr + 4. This will not > affect well-behaved guest software, because all of the places > we call these functions from T32 code are instructions where > using r15 is UNPREDICTABLE. Using the architectural PC value > here is more consistent with the T16 and A32 behaviour. Looks good to me. r~
diff --git a/target/arm/translate.c b/target/arm/translate.c index 59e35aafbf..61933865d5 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -196,17 +196,17 @@ static inline void store_cpu_offset(TCGv_i32 var, int offset) #define store_cpu_field(var, name) \ store_cpu_offset(var, offsetof(CPUARMState, name)) +/* The architectural value of PC. */ +static uint32_t read_pc(DisasContext *s) +{ + return s->pc_curr + (s->thumb ? 4 : 8); +} + /* Set a variable to the value of a CPU register. */ static void load_reg_var(DisasContext *s, TCGv_i32 var, int reg) { if (reg == 15) { - uint32_t addr; - /* normally, since we updated PC, we need only to add one insn */ - if (s->thumb) - addr = (long)s->pc + 2; - else - addr = (long)s->pc + 4; - tcg_gen_movi_i32(var, addr); + tcg_gen_movi_i32(var, read_pc(s)); } else { tcg_gen_mov_i32(var, cpu_R[reg]); } @@ -7868,16 +7868,14 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn) /* branch link and change to thumb (blx <offset>) */ int32_t offset; - val = (uint32_t)s->pc; tmp = tcg_temp_new_i32(); - tcg_gen_movi_i32(tmp, val); + tcg_gen_movi_i32(tmp, s->pc); store_reg(s, 14, tmp); /* Sign-extend the 24-bit offset */ offset = (((int32_t)insn) << 8) >> 8; + val = read_pc(s); /* offset * 4 + bit24 * 2 + (thumb bit) */ val += (offset << 2) | ((insn >> 23) & 2) | 1; - /* pipeline offset */ - val += 4; /* protected by ARCH(5); above, near the start of uncond block */ gen_bx_im(s, val); return; @@ -9153,10 +9151,8 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn) } else { /* store */ if (i == 15) { - /* special case: r15 = PC + 8 */ - val = (long)s->pc + 4; tmp = tcg_temp_new_i32(); - tcg_gen_movi_i32(tmp, val); + tcg_gen_movi_i32(tmp, read_pc(s)); } else if (user) { tmp = tcg_temp_new_i32(); tmp2 = tcg_const_i32(i); @@ -9222,15 +9218,13 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn) int32_t offset; /* branch (and link) */ - val = (int32_t)s->pc; if (insn & (1 << 24)) { tmp = tcg_temp_new_i32(); - tcg_gen_movi_i32(tmp, val); + tcg_gen_movi_i32(tmp, s->pc); store_reg(s, 14, tmp); } offset = sextract32(insn << 2, 0, 26); - val += offset + 4; - gen_jmp(s, val); + gen_jmp(s, read_pc(s) + offset); } break; case 0xc: @@ -9588,12 +9582,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn) tcg_temp_free_i32(addr); } else if ((insn & (7 << 5)) == 0) { /* Table Branch. */ - if (rn == 15) { - addr = tcg_temp_new_i32(); - tcg_gen_movi_i32(addr, s->pc); - } else { - addr = load_reg(s, rn); - } + addr = load_reg(s, rn); tmp = load_reg(s, rm); tcg_gen_add_i32(addr, addr, tmp); if (insn & (1 << 4)) { @@ -9609,7 +9598,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn) } tcg_temp_free_i32(addr); tcg_gen_shli_i32(tmp, tmp, 1); - tcg_gen_addi_i32(tmp, tmp, s->pc); + tcg_gen_addi_i32(tmp, tmp, read_pc(s)); store_reg(s, 15, tmp); } else { bool is_lasr = false; @@ -10342,7 +10331,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn) tcg_gen_movi_i32(cpu_R[14], s->pc | 1); } - offset += s->pc; + offset += read_pc(s); if (insn & (1 << 12)) { /* b/bl */ gen_jmp(s, offset); @@ -10583,7 +10572,7 @@ static void disas_thumb2_insn(DisasContext *s, uint32_t insn) offset |= (insn & (1 << 11)) << 8; /* jump to the offset */ - gen_jmp(s, s->pc + offset); + gen_jmp(s, read_pc(s) + offset); } } else { /* @@ -11077,7 +11066,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn) if (insn & (1 << 11)) { rd = (insn >> 8) & 7; /* load pc-relative. Bit 1 of PC is ignored. */ - val = s->pc + 2 + ((insn & 0xff) * 4); + val = read_pc(s) + ((insn & 0xff) * 4); val &= ~(uint32_t)2; addr = tcg_temp_new_i32(); tcg_gen_movi_i32(addr, val); @@ -11464,7 +11453,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn) } else { /* PC. bit 1 is ignored. */ tmp = tcg_temp_new_i32(); - tcg_gen_movi_i32(tmp, (s->pc + 2) & ~(uint32_t)2); + tcg_gen_movi_i32(tmp, read_pc(s) & ~(uint32_t)2); } val = (insn & 0xff) * 4; tcg_gen_addi_i32(tmp, tmp, val); @@ -11584,9 +11573,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn) tcg_gen_brcondi_i32(TCG_COND_NE, tmp, 0, s->condlabel); tcg_temp_free_i32(tmp); offset = ((insn & 0xf8) >> 2) | (insn & 0x200) >> 3; - val = (uint32_t)s->pc + 2; - val += offset; - gen_jmp(s, val); + gen_jmp(s, read_pc(s) + offset); break; case 15: /* IT, nop-hint. */ @@ -11750,7 +11737,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn) arm_skip_unless(s, cond); /* jump to the offset */ - val = (uint32_t)s->pc + 2; + val = read_pc(s); offset = ((int32_t)insn << 24) >> 24; val += offset << 1; gen_jmp(s, val); @@ -11776,9 +11763,9 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn) break; } /* unconditional branch */ - val = (uint32_t)s->pc; + val = read_pc(s); offset = ((int32_t)insn << 21) >> 21; - val += (offset << 1) + 2; + val += offset << 1; gen_jmp(s, val); break; @@ -11802,7 +11789,7 @@ static void disas_thumb_insn(DisasContext *s, uint32_t insn) /* 0b1111_0xxx_xxxx_xxxx : BL/BLX prefix */ uint32_t uoffset = ((int32_t)insn << 21) >> 9; - tcg_gen_movi_i32(cpu_R[14], s->pc + 2 + uoffset); + tcg_gen_movi_i32(cpu_R[14], read_pc(s) + uoffset); } break; }
We currently have 3 different ways of computing the architectural value of "PC" as seen in the ARM ARM. The value of s->pc has been incremented past the current insn, but that is all. Thus for a32, PC = s->pc + 4; for t32, PC = s->pc; for t16, PC = s->pc + 2. These differing computations make it impossible at present to unify the various code paths. With the newly introduced s->pc_curr, we can compute the correct value for all cases, using the formula given in the ARM ARM. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- target/arm/translate.c | 59 ++++++++++++++++-------------------------- 1 file changed, 23 insertions(+), 36 deletions(-) -- 2.17.1