From patchwork Wed Sep 26 19:23:12 2018
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 147650
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Wed, 26 Sep 2018 12:23:12 -0700
Message-Id: <20180926192323.12659-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180926192323.12659-1-richard.henderson@linaro.org>
References: <20180926192323.12659-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 04/15] target/arm: Handle SVE vector length changes in system mode
Cc: peter.maydell@linaro.org

SVE vector length can change when changing EL, or when writing
to one of the ZCR_ELn registers.

For correctness, our implementation requires that predicate bits
that are inaccessible are never set.  Which means noticing length
changes and zeroing the appropriate register bits.

Tested-by: Laurent Desnogues
Signed-off-by: Richard Henderson
---
 target/arm/cpu.h       |   4 ++
 target/arm/cpu64.c     |  42 --------------
 target/arm/helper.c    | 127 ++++++++++++++++++++++++++++++++++++-----
 target/arm/op_helper.c |   1 +
 4 files changed, 119 insertions(+), 55 deletions(-)

-- 
2.17.1

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 65c0fa0a65..a4ee83dc77 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -910,6 +910,10 @@ int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
 int aarch64_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
 int aarch64_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq);
+void aarch64_sve_change_el(CPUARMState *env, int old_el, int new_el);
+#else
+static inline void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq) { }
+static inline void aarch64_sve_change_el(CPUARMState *env, int o, int n) { }
 #endif

 target_ulong do_arm_semihosting(CPUARMState *env);
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 800bff780e..db71504cb5 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -410,45 +410,3 @@ static void aarch64_cpu_register_types(void)
 }

 type_init(aarch64_cpu_register_types)
-
-/* The manual says that when SVE is enabled and VQ is widened the
- * implementation is allowed to zero the previously inaccessible
- * portion of the registers.  The corollary to that is that when
- * SVE is enabled and VQ is narrowed we are also allowed to zero
- * the now inaccessible portion of the registers.
- *
- * The intent of this is that no predicate bit beyond VQ is ever set.
- * Which means that some operations on predicate registers themselves
- * may operate on full uint64_t or even unrolled across the maximum
- * uint64_t[4].  Performing 4 bits of host arithmetic unconditionally
- * may well be cheaper than conditionals to restrict the operation
- * to the relevant portion of a uint16_t[16].
- *
- * TODO: Need to call this for changes to the real system registers
- * and EL state changes.
- */
-void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
-{
-    int i, j;
-    uint64_t pmask;
-
-    assert(vq >= 1 && vq <= ARM_MAX_VQ);
-    assert(vq <= arm_env_get_cpu(env)->sve_max_vq);
-
-    /* Zap the high bits of the zregs.  */
-    for (i = 0; i < 32; i++) {
-        memset(&env->vfp.zregs[i].d[2 * vq], 0, 16 * (ARM_MAX_VQ - vq));
-    }
-
-    /* Zap the high bits of the pregs and ffr.  */
-    pmask = 0;
-    if (vq & 3) {
-        pmask = ~(-1ULL << (16 * (vq & 3)));
-    }
-    for (j = vq / 4; j < ARM_MAX_VQ / 4; j++) {
-        for (i = 0; i < 17; ++i) {
-            env->vfp.pregs[i].p[j] &= pmask;
-        }
-        pmask = 0;
-    }
-}
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 52fc9d1d4c..9b1f868efa 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4461,11 +4461,44 @@ static int sve_exception_el(CPUARMState *env, int el)
     return 0;
 }

+/*
+ * Given that SVE is enabled, return the vector length for EL.
+ */
+static uint32_t sve_zcr_len_for_el(CPUARMState *env, int el)
+{
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    uint32_t zcr_len = cpu->sve_max_vq - 1;
+
+    if (el <= 1) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
+    }
+    if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
+    }
+    if (el < 3 && arm_feature(env, ARM_FEATURE_EL3)) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
+    }
+    return zcr_len;
+}
+
 static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                       uint64_t value)
 {
+    int cur_el = arm_current_el(env);
+    int old_len = sve_zcr_len_for_el(env, cur_el);
+    int new_len;
+
     /* Bits other than [3:0] are RAZ/WI.  */
     raw_write(env, ri, value & 0xf);
+
+    /*
+     * Because we arrived here, we know both FP and SVE are enabled;
+     * otherwise we would have trapped access to the ZCR_ELn register.
+     */
+    new_len = sve_zcr_len_for_el(env, cur_el);
+    if (new_len < old_len) {
+        aarch64_sve_narrow_vq(env, new_len + 1);
+    }
 }

 static const ARMCPRegInfo zcr_el1_reginfo = {
@@ -8305,8 +8338,11 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
     unsigned int new_el = env->exception.target_el;
     target_ulong addr = env->cp15.vbar_el[new_el];
     unsigned int new_mode = aarch64_pstate_mode(new_el, true);
+    unsigned int cur_el = arm_current_el(env);

-    if (arm_current_el(env) < new_el) {
+    aarch64_sve_change_el(env, cur_el, new_el);
+
+    if (cur_el < new_el) {
         /* Entry vector offset depends on whether the implemented EL
          * immediately lower than the target level is using AArch32 or AArch64
          */
@@ -12598,18 +12634,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         if (sve_el != 0 && fp_el == 0) {
             zcr_len = 0;
         } else {
-            ARMCPU *cpu = arm_env_get_cpu(env);
-
-            zcr_len = cpu->sve_max_vq - 1;
-            if (current_el <= 1) {
-                zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
-            }
-            if (current_el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
-                zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
-            }
-            if (current_el < 3 && arm_feature(env, ARM_FEATURE_EL3)) {
-                zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
-            }
+            zcr_len = sve_zcr_len_for_el(env, current_el);
         }
         flags |= sve_el << ARM_TBFLAG_SVEEXC_EL_SHIFT;
         flags |= zcr_len << ARM_TBFLAG_ZCR_LEN_SHIFT;
@@ -12665,3 +12690,79 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     *pflags = flags;
     *cs_base = 0;
 }
+
+#ifdef TARGET_AARCH64
+/*
+ * The manual says that when SVE is enabled and VQ is widened the
+ * implementation is allowed to zero the previously inaccessible
+ * portion of the registers.  The corollary to that is that when
+ * SVE is enabled and VQ is narrowed we are also allowed to zero
+ * the now inaccessible portion of the registers.
+ *
+ * The intent of this is that no predicate bit beyond VQ is ever set.
+ * Which means that some operations on predicate registers themselves
+ * may operate on full uint64_t or even unrolled across the maximum
+ * uint64_t[4].  Performing 4 bits of host arithmetic unconditionally
+ * may well be cheaper than conditionals to restrict the operation
+ * to the relevant portion of a uint16_t[16].
+ */
+void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
+{
+    int i, j;
+    uint64_t pmask;
+
+    assert(vq >= 1 && vq <= ARM_MAX_VQ);
+    assert(vq <= arm_env_get_cpu(env)->sve_max_vq);
+
+    /* Zap the high bits of the zregs.  */
+    for (i = 0; i < 32; i++) {
+        memset(&env->vfp.zregs[i].d[2 * vq], 0, 16 * (ARM_MAX_VQ - vq));
+    }
+
+    /* Zap the high bits of the pregs and ffr.  */
+    pmask = 0;
+    if (vq & 3) {
+        pmask = ~(-1ULL << (16 * (vq & 3)));
+    }
+    for (j = vq / 4; j < ARM_MAX_VQ / 4; j++) {
+        for (i = 0; i < 17; ++i) {
+            env->vfp.pregs[i].p[j] &= pmask;
+        }
+        pmask = 0;
+    }
+}
+
+/*
+ * Notice a change in SVE vector size when changing EL.
+ */
+void aarch64_sve_change_el(CPUARMState *env, int old_el, int new_el)
+{
+    int old_len, new_len;
+
+    /* Nothing to do if no SVE.  */
+    if (!arm_feature(env, ARM_FEATURE_SVE)) {
+        return;
+    }
+
+    /* Nothing to do if FP is disabled in either EL.  */
+    if (fp_exception_el(env, old_el) || fp_exception_el(env, new_el)) {
+        return;
+    }
+
+    /*
+     * When FP is enabled, but SVE is disabled, the effective len is 0.
+     * ??? Do we need a conditional for old_el/new_el in aa32 state?
+     * That isn't included in the CheckSVEEnabled pseudocode, so is the
+     * host kernel required to explicitly disable SVE for an EL using aa32?
+     */
+    old_len = (sve_exception_el(env, old_el)
+               ? 0 : sve_zcr_len_for_el(env, old_el));
+    new_len = (sve_exception_el(env, new_el)
+               ? 0 : sve_zcr_len_for_el(env, new_el));
+
+    /* When changing vector length, clear inaccessible state.  */
+    if (new_len < old_len) {
+        aarch64_sve_narrow_vq(env, new_len + 1);
+    }
+}
+#endif
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 952b8d122b..430c50a9f9 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -1082,6 +1082,7 @@ void HELPER(exception_return)(CPUARMState *env)
                       "AArch64 EL%d PC 0x%" PRIx64 "\n",
                       cur_el, new_el, env->pc);
     }
+    aarch64_sve_change_el(env, cur_el, new_el);

     qemu_mutex_lock_iothread();
     arm_call_el_change_hook(arm_env_get_cpu(env));
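
As a review aside, the two ideas in the patch -- clamping the effective vector length by the minimum ZCR_ELx.LEN over the current and higher ELs, and clearing predicate bits that become inaccessible when VQ shrinks -- can be exercised standalone. The sketch below is plain C, not QEMU code: MAX_VQ, zcr[] and preg_t are simplified stand-ins for ARM_MAX_VQ, env->vfp.zcr_el[] and env->vfp.pregs[], and it assumes EL2 and EL3 are both implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_VQ 16                      /* 16 quadwords == 2048-bit vectors */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

typedef struct {
    uint64_t p[MAX_VQ / 4];            /* one predicate bit per vector byte */
} preg_t;

/* Effective ZCR length for 'el': the minimum of ZCR_ELx.LEN for this EL
 * and every higher EL, mirroring sve_zcr_len_for_el() with the EL2/EL3
 * feature checks assumed true.  */
static uint32_t zcr_len_for_el(const uint32_t zcr[4], int el)
{
    uint32_t len = MAX_VQ - 1;

    if (el <= 1) {
        len = MIN(len, zcr[1] & 0xf);
    }
    if (el < 2) {
        len = MIN(len, zcr[2] & 0xf);
    }
    if (el < 3) {
        len = MIN(len, zcr[3] & 0xf);
    }
    return len;
}

/* Clear every predicate bit beyond the new quadword count vq.  Each
 * uint64_t covers 4 quadwords (16 predicate bits each): the word holding
 * the boundary keeps only its low 16*(vq&3) bits, and all later words
 * are zeroed outright -- the same masking as aarch64_sve_narrow_vq().  */
static void narrow_preg(preg_t *pr, unsigned vq)
{
    uint64_t pmask = 0;

    if (vq & 3) {
        pmask = ~(-1ULL << (16 * (vq & 3)));
    }
    for (unsigned j = vq / 4; j < MAX_VQ / 4; j++) {
        pr->p[j] &= pmask;
        pmask = 0;
    }
}
```

For example, with ZCR_EL1.LEN = 9 and ZCR_EL2.LEN = 4, the effective length at EL1 is 4, so a narrowing from the maximum keeps only the predicate bits for the first 5 quadwords (the low 16 bits of p[1], and all of p[0]).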