Message ID | 20200326230838.31112-1-richard.henderson@linaro.org |
---|---|
Series | target/arm: SVE2, part 1 |
Hello Richard,
I want to introduce you to Stephen Long. He is our new hire who started this week.
I want to know if you are available for a sync-up meeting to discuss how we can cooperate with qemu sve2 support.
Thank you,
Ana.
-----Original Message-----
From: Richard Henderson <richard.henderson@linaro.org>
Sent: Thursday, March 26, 2020 4:08 PM
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org; Ana Pazos <apazos@quicinc.com>; Raja Venkateswaran <rajav@quicinc.com>
Subject: [PATCH for-5.1 00/31] target/arm: SVE2, part 1
Posting this for early review. It's based on some other patch sets that I have posted recently that also touch SVE, listed below. But it might just be easier to clone the devel tree [2].
While the branch itself will rebase frequently for development, I've also created a tag, post-sve2-20200326, for this posting.
This is mostly untested, as the most recently released Foundation Model does not support SVE2. Some of the new instructions overlap with old fashioned NEON, and I can verify that those have not broken, and show that SVE2 will use the same code path. But the predicated insns and bottom/top interleaved insns are not yet RISU testable, as I have nothing to compare against.
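To make the "bottom/top" terminology concrete: in the SVE2 widening instructions, "bottom" forms operate on the even-numbered source elements and "top" forms on the odd-numbered ones, with each result written to a double-width destination element. The following is a minimal C sketch of that element selection for a signed widening add, assuming 8-bit source lanes and 16-bit results. The function name `saddl_bt_b2h` and its parameters are illustrative assumptions, not helpers from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration, not QEMU code: signed widening add of
 * 8-bit lanes into 16-bit lanes.
 *   top == 0: use even-numbered source elements ("bottom" form)
 *   top == 1: use odd-numbered source elements ("top" form)
 * wide_elems is the number of 16-bit destination elements; each source
 * array must hold 2 * wide_elems 8-bit elements.
 */
static void saddl_bt_b2h(int16_t *d, const int8_t *n, const int8_t *m,
                         size_t wide_elems, int top)
{
    assert(top == 0 || top == 1);
    for (size_t i = 0; i < wide_elems; i++) {
        /* Widen before adding so the sum cannot overflow 8 bits. */
        int16_t a = n[2 * i + top];
        int16_t b = m[2 * i + top];
        d[i] = (int16_t)(a + b);
    }
}
```

Calling the function once with `top == 0` and once with `top == 1` covers every source element, which is how a bottom/top instruction pair is typically used to process a whole vector.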
The patches are in general arranged so that one complete group of insns are added at once. The groups within the manual [1] have so far been small-ish.
r~
---
[1] ISA manual: https://static.docs.arm.com/ddi0602/d/ISA_A64_xml_futureA-2019-12_OPT.pdf
[2] Devel tree: https://github.com/rth7680/qemu/tree/tgt-arm-sve-2
Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=163610
("target/arm: sve load/store improvements")
Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=164500
("target/arm: Use tcg_gen_gvec_5_ptr for sve FMLA/FCMLA")
Based-on: http://patchwork.ozlabs.org/project/qemu-devel/list/?series=164048
("target/arm: Implement ARMv8.5-MemTag, system mode")
Richard Henderson (31):
target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2
target/arm: Implement SVE2 Integer Multiply - Unpredicated
target/arm: Implement SVE2 integer pairwise add and accumulate long
target/arm: Remove fp_status from helper_{recpe,rsqrte}_u32
target/arm: Implement SVE2 integer unary operations (predicated)
target/arm: Split out saturating/rounding shifts from neon
target/arm: Implement SVE2 saturating/rounding bitwise shift left
(predicated)
target/arm: Implement SVE2 integer halving add/subtract (predicated)
target/arm: Implement SVE2 integer pairwise arithmetic
target/arm: Implement SVE2 saturating add/subtract (predicated)
target/arm: Implement SVE2 integer add/subtract long
target/arm: Implement SVE2 integer add/subtract interleaved long
target/arm: Implement SVE2 integer add/subtract wide
target/arm: Implement SVE2 integer multiply long
target/arm: Implement PMULLB and PMULLT
target/arm: Tidy SVE tszimm shift formats
target/arm: Implement SVE2 bitwise shift left long
target/arm: Implement SVE2 bitwise exclusive-or interleaved
target/arm: Implement SVE2 bitwise permute
target/arm: Implement SVE2 complex integer add
target/arm: Implement SVE2 integer absolute difference and accumulate
long
target/arm: Implement SVE2 integer add/subtract long with carry
target/arm: Create arm_gen_gvec_[us]sra
target/arm: Create arm_gen_gvec_{u,s}{rshr,rsra}
target/arm: Implement SVE2 bitwise shift right and accumulate
target/arm: Create arm_gen_gvec_{sri,sli}
target/arm: Tidy handle_vec_simd_shri
target/arm: Implement SVE2 bitwise shift and insert
target/arm: Vectorize SABD/UABD
target/arm: Vectorize SABA/UABA
target/arm: Implement SVE2 integer absolute difference and accumulate
target/arm/cpu.h | 31 ++
target/arm/helper-sve.h | 345 +++++++++++++++++
target/arm/helper.h | 81 +++-
target/arm/translate-a64.h | 9 +
target/arm/translate.h | 24 +-
target/arm/vec_internal.h | 161 ++++++++
target/arm/sve.decode | 217 ++++++++++-
target/arm/helper.c | 3 +-
target/arm/kvm64.c | 2 +
target/arm/neon_helper.c | 515 ++++---------------------
target/arm/sve_helper.c | 757 ++++++++++++++++++++++++++++++++++---
target/arm/translate-a64.c | 557 +++++++++++++++++++++++----
target/arm/translate-sve.c | 557 +++++++++++++++++++++++++++
target/arm/translate.c | 626 ++++++++++++++++++++++--------
target/arm/vec_helper.c | 411 ++++++++++++++++++++
target/arm/vfp_helper.c | 4 +-
16 files changed, 3532 insertions(+), 768 deletions(-)
create mode 100644 target/arm/vec_internal.h
--
2.20.1
Hi Richard,
I find BF16 is included in the ISA. Will you extend the softfpu in this patch set?
Zhiwei
On 2020/3/27 7:08, Richard Henderson wrote:
> Posting this for early review. It's based on some other patch
> sets that I have posted recently that also touch SVE, listed
> below. But it might just be easier to clone the devel tree [2].
> While the branch itself will rebase frequently for development,
> I've also created a tag, post-sve2-20200326, for this posting.
>
> This is mostly untested, as the most recently released Foundation
> Model does not support SVE2. Some of the new instructions overlap
> with old fashioned NEON, and I can verify that those have not
> broken, and show that SVE2 will use the same code path. But the
> predicated insns and bottom/top interleaved insns are not yet
> RISU testable, as I have nothing to compare against.
>
> The patches are in general arranged so that one complete group
> of insns are added at once. The groups within the manual [1]
> have so far been small-ish.
On 4/21/20 7:51 PM, LIU Zhiwei wrote:
> I find BF16 is included in the ISA. Will you extend the softfpu in this patch
> set?
I will do that eventually, but probably not part of the first full SVE2 patch set.
There are several optional extensions to SVE2, of which BF16 is one. But BF16 also requires changes to the normal FPU as well, and Arm requires SVE and FPU be in sync.
r~
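For readers unfamiliar with the format under discussion: BF16 (bfloat16) keeps the sign bit and 8-bit exponent of IEEE binary32 and shortens the fraction to 7 bits, so a BF16 value has the same bit pattern as the upper half of a 32-bit float. The sketch below shows that relationship in plain C, assuming round-to-nearest-even when narrowing. The names `bf16_bits`, `bf16_to_f32`, and `f32_to_bf16` are hypothetical; this is neither QEMU's softfloat interface nor a statement of the exact rounding or NaN behavior that the Arm instructions specify.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_bits; /* hypothetical name: raw bfloat16 encoding */

/* Widening is exact: place the 16 bits in the top half of a binary32. */
static float bf16_to_f32(bf16_bits h)
{
    uint32_t u = (uint32_t)h << 16;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Narrowing, assuming round-to-nearest-even; NaNs are returned quiet. */
static bf16_bits f32_to_bf16(float f)
{
    uint32_t u;
    assert(sizeof f == sizeof u);
    memcpy(&u, &f, sizeof u);
    if ((u & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: keep the top bits and force the quiet bit. */
        return (bf16_bits)((u >> 16) | 0x0040u);
    }
    /* Add just under half an ulp, plus one more if the kept LSB is odd. */
    u += 0x7fffu + ((u >> 16) & 1u);
    return (bf16_bits)(u >> 16);
}
```

Because the conversion is only a re-rounding of binary32, arithmetic on BF16 inputs can be sketched as widening, computing in a wider format, and narrowing again; whether a real implementation may do so depends on the rounding rules of each instruction.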