Message ID | 20201028032703.201526-1-richard.henderson@linaro.org |
---|---|
Headers | show |
Series | target/arm: Fix neon reg offsets | expand |
On Wed, 28 Oct 2020 at 03:27, Richard Henderson <richard.henderson@linaro.org> wrote: > > Much of the existing usage of neon_reg_offset is broken for > big-endian hosts, as it computes the offset of the first > 32-bit unit, not the offset of the entire vector register. > > Fix this by separating out the different usages. Make the > whole thing look a bit more like the aarch64 code. I haven't reviewed this yet but it fixes a lot of the problems I saw in my risu run on an s390x box, and I don't see any regressions on x86-64. However these still fail on s390x compared to an x86-64 host: insn_VPADD_float_f16.risu.bin FAIL insn_VPMAX_float_f16.risu.bin FAIL insn_VPMIN_float_f16.risu.bin FAIL insn_VSDOT_s.risu.bin FAIL insn_VUDOT_s.risu.bin FAIL thanks -- PMM
On Wed, 28 Oct 2020 at 16:48, Peter Maydell <peter.maydell@linaro.org> wrote: > > On Wed, 28 Oct 2020 at 03:27, Richard Henderson > <richard.henderson@linaro.org> wrote: > > > > Much of the existing usage of neon_reg_offset is broken for > > big-endian hosts, as it computes the offset of the first > > 32-bit unit, not the offset of the entire vector register. > > > > Fix this by separating out the different usages. Make the > > whole thing look a bit more like the aarch64 code. > > I haven't reviewed this yet but it fixes a lot of the > problems I saw in my risu run on an s390x box, and I > don't see any regressions on x86-64. However these still > fail on s390x compared to an x86-64 host: > > insn_VPADD_float_f16.risu.bin FAIL > insn_VPMAX_float_f16.risu.bin FAIL > insn_VPMIN_float_f16.risu.bin FAIL These three turn out to be a silly bug (one of mine) unrelated to this series (patch written). > insn_VSDOT_s.risu.bin FAIL > insn_VUDOT_s.risu.bin FAIL I'll have a look at these next. -- PMM
On 10/28/20 9:48 AM, Peter Maydell wrote: > I haven't reviewed this yet but it fixes a lot of the > problems I saw in my risu run on an s390x box, and I > don't see any regressions on x86-64. However these still > fail on s390x compared to an x86-64 host: > > insn_VPADD_float_f16.risu.bin FAIL > insn_VPMAX_float_f16.risu.bin FAIL > insn_VPMIN_float_f16.risu.bin FAIL > insn_VSDOT_s.risu.bin FAIL > insn_VUDOT_s.risu.bin FAIL FWIW, a risu run with my cortex-a15 trace files came up clean on a ppc64 host. But that wouldn't have tested DOT... r~
On Wed, 28 Oct 2020 at 03:27, Richard Henderson <richard.henderson@linaro.org> wrote: > > Much of the existing usage of neon_reg_offset is broken for > big-endian hosts, as it computes the offset of the first > 32-bit unit, not the offset of the entire vector register. > > Fix this by separating out the different usages. Make the > whole thing look a bit more like the aarch64 code. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> I'll put this into target-arm.next since it's fixing a bug on BE hosts. thanks -- PMM
On Wed, 28 Oct 2020 at 20:03, Peter Maydell <peter.maydell@linaro.org> wrote: > > On Wed, 28 Oct 2020 at 03:27, Richard Henderson > <richard.henderson@linaro.org> wrote: > > > > Much of the existing usage of neon_reg_offset is broken for > > big-endian hosts, as it computes the offset of the first > > 32-bit unit, not the offset of the entire vector register. > > > > Fix this by separating out the different usages. Make the > > whole thing look a bit more like the aarch64 code. > > Reviewed-by: Peter Maydell <peter.maydell@linaro.org> > > I'll put this into target-arm.next since it's fixing a > bug on BE hosts. I'm now assuming you'll send out a v2 with the VTRN/VREV missing-tcg-free bugs fixed. thanks -- PMM